ABSTRACT
 Title of dissertation: NONLINEAR AND MULTIRESOLUTION
 ERROR COVARIANCE ESTIMATION IN
 ENSEMBLE DATA ASSIMILATION
 Sabrina Rainwater
 Doctor of Philosophy, 2012
 Dissertation directed by: Professor Brian R. Hunt
 Department of Mathematics and
 Ensemble Kalman Filters perform data assimilation by forming a background co-
 variance matrix from an ensemble forecast. The spread of the ensemble is intended to
 represent the algorithm?s uncertainty about the state of the physical system that produces
 the data. Usually the ensemble members are evolved with the same model.
 The first part of my dissertation presents and tests a modified Local Ensemble Trans-
 form Kalman Filter (LETKF) that takes its background covariance from a combination of
 a high resolution ensemble and a low resolution ensemble. The computational time and
 the accuracy of this mixed-resolution LETKF are explored and compared to the standard
 LETKF on a high resolution ensemble, using simulated observation experiments with the
 Lorenz Models II and III (more complex versions of the Lorenz 96 model). The results
show that, for the same computation time, mixed resolution ensemble analysis achieves
 higher accuracy than standard ensemble analysis.
 The second part of my dissertation demonstrates that it can be fruitful to rescale
 the ensemble spread prior to the forecast and then reverse this rescaling after the fore-
 cast. This technique, denoted ?forecast spread adjustment? provides a tunable parameter
 that is complementary to covariance inflation, which cumulatively increases the ensemble
 spread to compensate for underestimation of uncertainty. As the adjustable parameter ap-
 proaches zero, the filter approaches the Extended Kalman Filter when the ensemble size
 is sufficiently large. The improvement provided by forecast spread adjustment depends
 on ensemble size, observation error, and model error. The results indicate that it is most
 effective for smaller ensembles, smaller observation errors, and larger model error, though
 the effectiveness depends significantly on the type of model error.
NONLINEAR AND MULTIRESOLUTION ERROR COVARIANCE
 ESTIMATION IN ENSEMBLE DATA ASSIMILATION
 by
 Sabrina Rainwater
 Dissertation submitted to the Faculty of the Graduate School of the
 University of Maryland, College Park in partial fulfillment
 of the requirements for the degree of
 Doctor of Philosophy
 2012
 Advisory Committee:
 Professor Brian R. Hunt, Chair
 Professor Kayo Ide
 Professor Eugenia Kalnay
 Professor Takemasa Miyoshi
 Professor Ross Salawitch, Dean?s representative
Table of Contents
 List of Tables iv
 List of Figures v
 List of Abbreviations viii
 1 Introduction 1
 1.1 Background: Weather Forecasting and Data Assimilation . . . . . . . . . 1
 1.1.1 The Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . 2
 1.1.2 Experimental Methods and Models . . . . . . . . . . . . . . . . . 4
 1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
 1.3 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
 2 Mixed Resolution Ensemble Data Assimilation 12
 2.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
 2.2 Methodology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
 2.2.1 ETKF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
 2.2.2 mETKF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
 2.2.3 A side by side comparison . . . . . . . . . . . . . . . . . . . . . 28
 2.2.4 Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
 2.3 Numerical Experiments and Results. . . . . . . . . . . . . . . . . . . . . 33
 2.3.1 Modifications to the Algorithms. . . . . . . . . . . . . . . . . . . 34
 2.3.2 Models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
 2.3.3 Experiments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
 2.3.4 Parameters Tuned. . . . . . . . . . . . . . . . . . . . . . . . . . 43
 2.3.5 Balancing Limited Computational Resources. . . . . . . . . . . . 46
 2.3.6 Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
 2.4 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 52
 ii
3 Ensemble Data Assimilation with an Adjusted Forecast Spread 57
 3.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
 3.2 Kalman Filters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
 3.2.1 Ensemble Kalman Filters. . . . . . . . . . . . . . . . . . . . . . 63
 3.2.2 Extended Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . 67
 3.3 Methodology and Theory. . . . . . . . . . . . . . . . . . . . . . . . . . . 68
 3.3.1 Ensemble data assimilation with forecast spread adjustment. . . . 69
 3.3.2 Relation to inflation of the observation error covariance. . . . . . 72
 3.3.3 Which to inflate, P a or P b? . . . . . . . . . . . . . . . . . . . . . 76
 3.3.4 Comparison to the XKF. . . . . . . . . . . . . . . . . . . . . . . 77
 3.4 Experiments and Results . . . . . . . . . . . . . . . . . . . . . . . . . . 79
 3.4.1 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
 3.4.2 Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . 81
 3.4.3 Tuning ? and ? . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
 3.4.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
 3.5 Summary and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 92
 4 Summary and Future Work 96
 Bibliography 100
 iii
List of Tables
 2.1 Varying the size of the low & high resolution ensembles. This table gives
 the analysis RMSEs for our basic scenario with various mixed ensembles.
 A low resolution component with 20 or 30 members does nearly as well as
 with 60. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
 2.2 Summary of scenarios. The limited observation network consists of 40
 observations and the dense observation network has full coverage (960
 observations). The true forcing constant is F t = 15; forcing constants
 different from F = 15 simulate model error. The observation error refers
 to the standard deviation of the error in the simulated observations. . . . . 48
 2.3 The time averaged RMSE over 4500 assimilation steps for various mixed
 and high resolution ensembles in various scenarios. The standard deviation
 of a 500 assimilation step moving average in most cases was between 0.01
 and 0.03. The 4 member large model error case (F = 12.5) and all of
 the 3 member high resolution cases were not as precise, with a standard
 deviation of the moving average between 0.04 and 0.07. . . . . . . . . . 56
 2.4 Standard deviation of fluctuations in the RMSE with time, over the same
 4500 assimilation steps used in Table 2.3. . . . . . . . . . . . . . . . . . 56
 2.5 Time averaged RMSE and standard deviation of fluctuations for our small-
 scale resolving scenario. Mixed resolution RMSEs are accurate to ?0.01
 and high resolution RMSEs are accurate to ?0.02 as measured by the stan-
 dard deviation of a 500 assimilation step moving average. . . . . . . . . . 56
 3.1 Comparison between the XKF RMSE and the ?ETKF RMSEs for ? =
 1, 12 ,
 1
 4 ,
 1
 8 , and 10
 ?6 and k = 40, 60, and 80 with our default parameters. . 91
 iv
List of Figures
 2.1 Sample Model II and Model III states shown simultaneously. Model III
 has short wave coupling at high resolution, while Model II, with a lower
 resolution, is smooth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
 2.2 Dependence of the analysis RMSE on ? for the 2 & 30 (circle) and 3 & 30
 (diamond) mixed cases in our basic scenario. The results are averaged over
 4500 assimilation steps after a spin up of 500 assimilation steps. When
 ? = 1 only the low resolution perturbations are used to determine the
 analysis update. The flatness of the graph indicates that ? does not need
 extensive tuning. Notice that ? = .2 is close to optimal in both cases
 shown, and is well separated from the large errors we observe for ? < 0.05. 44
 2.3 Top: the RMSE of the analysis generated in our basic scenario with a
 low-resolution model, averaged over 4500 assimilation steps after a 500
 time step spin-up period. Middle: the RMSE of the analysis generated
 in our basic scenario with a high-resolution model, averaged over 4500
 assimilation steps after a 500 assimilation step spin-up period. Bottom:
 the RMSE of the analysis generated in our small-scale resolving scenario
 with a high-resolution model, averaged over 4500 assimilation steps after
 a 500 assimilation step spin-up period. The dashed line indicates the small
 scale variability, i.e. the minimum RMSE with a low resolution model for
 our small-scale resolving scenario. . . . . . . . . . . . . . . . . . . . . . 49
 2.4 The root mean square error at each assimilation step for the initial 1000
 assimilation steps. This data is taken from the run of the 2 high resolution
 and 30 low resolution basic scenario. . . . . . . . . . . . . . . . . . . . . 50
 v
2.5 The average forecast computation time taken for each assimilation time
 step versus the average root mean square error over 4500 assimilation steps
 for left: our basic scenario and right: our small-scale resolving scenario.
 Better computation times and better RMSE (lower error) are both closer
 to the origin. Left: Data is shown for high resolution analysis with 3, 4, 5,
 and 6 members, low resolution analysis with 10, 20, 30, and 60 members,
 mixed resolution analysis on 1 high resolution member with a low resolu-
 tion ensemble of 5, 10, 15, 30, and 60, and mixed resolution analysis on
 a mixed ensemble of 30 low resolution members and 1, 2, and 3 high res-
 olution members. Right: Data is shown for high resolution analysis with
 15, 16, 17, and 18 members, low resolution analysis with 10, 15, and 20
 members, mixed resolution analysis on 1 high resolution member with a
 low resolution ensemble of 10, 15, and 20, and mixed resolution analy-
 sis on a mixed ensemble of 30 low resolution members and 9, 12, and 15
 high resolution members. The dashed black line represents the variation
 in the small scales and the dashed grey line corresponds to twice as much
 variation in the small scales. . . . . . . . . . . . . . . . . . . . . . . . . 54
 3.1 Left: data assimilation with a forecast spread adjustment of ?. Right: stan-
 dard data assimilation with Ro inflated by ?2. The two data assimilation cy-
 cles depicted above are initialized during the forecast phase, i.e. Xb2 = Xb,f . 75
 3.2 Left: the data assimilation cycle on P b and P a including a forecast spread
 adjustment of ? and inflation of P b with ?. Right: the data assimilation
 cycle on P b and P a including inflation of P b with ?b and inflation of P a
 with ?a. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
 3.3 Sample Model II output with observations for 60 grid points, half of which
 are observed (*), observation error ? = 1, smoothing parameter K = 2,
 and forcing constant F = 12. . . . . . . . . . . . . . . . . . . . . . . . . 81
 3.4 Tuning ? (grey) and the effect of ? on ensemble spread during the forecast
 phase (red) and analysis phase (blue) for our default parameters: k = 10,
 F = 14 (F t = 12), ? = 1. Comparison of the ensemble spread during the
 analysis phase (blue) to the actual RMSE for various values of ? (grey).
 The solid grey curve shows that ? tunes to 2.5. We tuned ? using the tuned
 value of ? for ? = 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
 3.5 Tuning ? and the effect of ? on ensemble spread for our default parameters
 with ? = 1. The dotted lines correspond to the background and the solid
 lines correspond to the analysis. The spread is indicated in black and the
 RMSE in grey. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
 vi
3.6 Left: The RMSE as a function of ? for (a) our default parameters: an
 ensemble with k = 10 members, model error due to the forcing constant
 F = 14 where the true forcing is F t = 12, and observation error of 1;
 (b) various ensemble sizes: k = 5, 10, 20, and 40 members; (c) different
 amounts of model error: F = 9, 10, 11, 12, 13, 14, and 15 with F t = 12;
 (d) different amounts of observation error: ? = 0.5, 1, and 2. In each
 of (b), (c) and (d), one of the parameters from (a) (k = 10, F = 14,
 ? = 1) is varied, keeping the other two parameters at their default values;
 the curve graphed in (a) appears in (b), (c) and (d) as well. We restrict
 the range of the y-axis in (a) to better show the structure of the curve.
 Right: Comparing the RMSE when ? = 1 to the minimum RMSE when
 ? is allowed to vary. Subplot (e) shows the percent improvement in the
 RMSE when varying ? with our default parameters, i.e. setting ? = 2.5
 gives results with an RMSE that is 14% smaller than when ? = 1. Also
 shown are the percent improvements versus (f) various ensemble sizes, (g)
 different forcing constants, and (h) different amounts of observation error.
 Each right hand figure (e), (f), (g), and (h) is derived from the same data
 as the figure immediately to its left, i.e. (a), (b), (c), and (d) respectively. . 88
 vii
List of Abbreviations
 EnKF ensemble Kalman filter
 ETKF Ensemble Transform Kalman Filter
 mETKF mixed resolution ETKF
 ?ETKF ETKF with forecast spread adjusted by ?
 LETKF Local Ensemble Transform Kalman Filter
 mLETKF mixed resolution LETKF
 ?LETKF LETKF with forecast spread adjusted by ?
 OSSE Observing System Simulation Experiment
 rms root mean square function
 RMSE time averaged root mean square analysis error
 XKF Extended Kalman Filter
 viii
Chapter 1
 Introduction
 1.1 Background: Weather Forecasting and Data Assimilation
 To predict the weather, we require both a numerical model for the partial differential
 equations governing the evolution of weather and initial conditions for the current atmo-
 spheric state. Since weather is chaotic, the forecast must periodically be corrected to more
 closely match the observations. This corrected model state is called the analysis. The pro-
 cess of forecasting a model state and modifying it according to observations is called data
 assimilation.
 Neither the forecast nor the observations exactly coincide with the true state of the
 weather. Data assimilation balances the uncertainties in the forecast (also called the ?back-
 ground? state) and the observations, and produces an ?analysis? state deemed to be the
 most likely state given the available information.
 1
1.1.1 The Kalman Filter
 A common paradigm for data assimilation is the Kalman filter. The Kalman filter
 analysis can be described by xa = xb + K
 (
 yo ?Hxb
 )
 , where xa and xb are vectors rep-
 resenting the analysis and background states, yo represents the observations, H is a linear
 operator (e.g., an interpolation operator) which transforms model vectors into expected
 observations, K = P bH>
 (
 HP bH> + Ro
 )
 ?1 is called the Kalman gain matrix, and P b
 and Ro represent the background and observation error covariance matrices.
 The Kalman filter is optimal provided the dynamics are linear, the observed quan-
 tities are linear functions of the model state, and the errors are Gaussian. Even when not
 optimal, the Kalman filter often works well for nonlinear models and nonlinear observa-
 tion operators by using a linear approximation.
 Advanced data assimilation techniques require an estimate of the uncertainty in the
 background state. The background uncertainty at a particular time is related to, but dif-
 fers from the uncertainty at previous times. Nonlinear Kalman filters (e.g. XKF, ETKF,
 LETKF) approximately evolve the background error covariance but tend to underestimate
 it. In ensemble Kalman filters, this is typically compensated for by multiplicative inflation
 of the background error covariance prior to analysis.
 In the Extended Kalman Filter (XKF) (Jazwinski 1970; Evensen 1992), the esti-
 mated background error covariance is evolved by linearizing the model around the back-
 2
ground state, which is also being evolved by the model. With the complex weather predic-
 tion models however, the computation cost of the XKF makes it impractical.
 Another way to estimate the background error covariance is through the use of a
 collection of forecasts, called an ensemble. According to sample statistics, the background
 error covariance is approximated by P b ? 1k?1X
 bXb>, where k is the size of the ensemble,
 and Xb is a matrix whose columns are the ensemble perturbations, i.e., the difference
 between the model state and the mean. Most ensemble techniques use the mean of the
 background ensemble to provide the background state. Rather than output a single, optimal
 analysis state, the ensemble is regenerated as k suboptimal states centered around the
 optimal analysis.
 The Ensemble Transform Kalman Filter (ETKF) (Bishop et al. 2001; Wang et
 al. 2004) is one algorithm that regenerates the ensemble after performing the analysis.
 However, forecasting a large ensemble can be costly. Furthermore, artificial correlations
 between model values at distant grid points are induced by computing P b from sample
 statistics. These difficulties are circumvented by performing an analysis for each grid
 point, considering only nearby observations when computing the update. This method is
 called localization and several localization methods have been proposed (Houtekamer and
 Mitchell 1998; Ott et al. 2004; Hamill et al. 2001; Whitaker and Hamill 2002). The Local
 Ensemble Transform Kalman Filter (LETKF) is described in Hunt et al. (2007). Though
 3
my research can be applied to other ensemble methods, most of my research is performed
 with ETKF the or the LETKF.
 1.1.2 Experimental Methods and Models
 I used Observing System Simulation Experiments (OSSEs) to test the accuracy of
 the methods I developed. In OSSEs, one forecasts a model state designated the ?truth?
 and generates observations by adding error. Then, pretending not to know the truth, data
 assimilation is performed, and the analysis is compared to the truth to measure the accu-
 racy in terms of the time averaged, spatial root mean square error. Improvements to data
 assimilation are generally tested on simple models before being implemented and tested
 in more realistic weather models because with a simple model, OSSEs can be run quickly
 and at low computation cost.
 The simple models I used for testing are the Lorenz (2005) Models II, and III . These
 are generalizations of the well known Lorenz 96 model (Lorenz 1996). The domain for
 the Lorenz 96 model is a set of grid points (typically 40) describing a circle. The model
 consists of three terms
 dxn
 dt = (xn+1 ? xn?2) xn?1 ? xn + F (1.1)
 simulating the chaotic effects of advection, diffusion, and forcing respectively. Model II
 4
smooths the Lorenz 96 model over a domain of K times as many grid points (i.e. increas-
 ing the resolution by a factor of K):
 dxn
 dt =
 (
 xn+Kx
 avg
 n?K
 )avg
 ? xavgn?2Kx
 avg
 n?K ? xn + F , (1.2)
 where qavgj refers to the average value of q over the region K grid points in length centered
 at grid point j. Model III simulates even greater resolution, sufficient to describe coupled
 small scale activity. Model III evolves the variables zn = xn + yn according to
 dzn
 dt =
 (
 xn+Kx
 avg
 n?K
 )
 avg
 ? xavgn?2Kx
 avg
 n?K ? xn + F
 +b2 (yn+1 ? yn?2) yn?1 ? byn + c (xn+1yn?1 ? yn?2xn?1) ,
 where xn and yn represent the large scale and small scale components of zn respectively
 (defined precisely in Section 2.3.2, equation (2.25)), b controls the amplitude and speed of
 the small scale activity and c controls the coupling.
 1.2 Objectives
 Ensemble Kalman filters are limited by the computational cost of evolving a large,
 high-resolution ensemble. Larger ensembles correspond to a greater sample size, which
 allows better statistical analysis. On the other hand, high model resolution is desirable
 5
because it improves the individual forecasts at all scales and better resolves the small scale
 activity. So traditional ensemble Kalman filters compromise between the resolution of the
 model and the size of the ensemble. Much of the literature on ensemble Kalman filters uses
 the ensemble mean as the best estimate for the truth, but Gao and Xue (2008) proposed
 updating a high resolution forecast with the background error covariance estimated from
 a low resolution ensemble.
 The goal of the first part of my doctoral research (Chapter 2) is to develop and test
 a technique for mixed-resolution ensemble data assimilation that incorporates information
 from both a high resolution ensemble and a low resolution ensemble. This alleviates the
 trade-off between model resolution and ensemble size, allowing a large low-resolution
 ensemble to bolster a small high-resolution ensemble. My results demonstrate that mixed-
 resolution ensemble data assimilation outperforms single-resolution data assimilation for
 the same computational cost.
 The algorithm I developed for mixed-resolution ensemble data assimilation effec-
 tively estimates the covariance as a weighted average of the background error covariances
 estimated from high and low resolution ensembles. I discovered that the results are not
 very sensitive to exact tuning of the weight parameter. Exploration of the reasons for this
 lead to the second part of my dissertation research.
 The goal of the second part of my doctoral research (Chapter 3) is to improve en-
 6
semble data assimilation for nonlinear models by developing and testing an algorithm in
 which the spread of the ensemble during the forecast does not correspond to the amount of
 uncertainty. With forecast spread adjustment I also aim to connect many aspects of data as-
 similation, specifically the ensemble Kalman filters and the extended Kalman filter, back-
 ground and analysis error covariance inflation, and forecast spread and observation error
 covariance inflation. In the ensemble Kalman filters, the background mean can be thought
 of as a random variable estimating the true model state, with the background ensemble
 representing random sampling according to the distribution of this random variable. In
 other words, we assume that the spread of the ensemble corresponds to the uncertainty in
 the mean. However, I have found advantages to ensemble data assimilation when forecast-
 ing an ensemble with a scaled spread, provided that the ensemble spread is rescaled prior
 to analysis. I call this technique forecast spread adjustment.
 For the idealized case of data assimilation with a linear model, forecast spread ad-
 justment has no effect, i.e. the algorithm with forecast spread adjustment simplifies to the
 standard algorithm without forecast spread adjustment. Thus a discussion of the effects of
 forecast spread adjustment is only relevant and meaningful for a nonlinear model such as
 numerical weather models.
 If the ensemble is clustered too closely about the mean, the true model state could lie
 outside of the spread of the ensemble, especially when model error is substantial. Model
 7
nonlinearity can further accentuate the discrepancy between the ensemble and the truth.
 For a large ensemble with infinitesimal spread, the rescaled perturbations describe a linear
 approximation for the error similar to that of the XKF. Compared to the XKF, ensemble
 Kalman filters can better track nonlinear effects as the ensemble is spread over a larger
 region of the attractor. I show that further expanding the ensemble spread can provide
 additional advantage, especially in the presence of model error.
 An alternate way of describing forecast spread adjustment is in terms of inflating
 the observation error covariance; though, as Sections 3.3.2 and 3.4.3 demonstrate, in the
 cases I examined, only the forecast spread adjustment interpretation correctly ascribes the
 observation, background, and analysis error estimates while the error estimates ascribed
 by the observation error covariance inflation interpretation exaggerate or underestimate
 the true error.
 Though the algorithms given in the literature on ensemble data assimilation with
 multiplicative inflation typically inflate the background error covariance, inflating the anal-
 ysis error covariance instead is often mentioned as being equally viable. Forecast spread
 adjustment can be used to transform an algorithm that inflates the background error co-
 variance into an algorithm that inflates the analysis error covariance or into an algorithm
 that inflates both by different amounts. Furthermore, multiplicative inflation (including
 possibly deflation) of both the background error covariance and the analysis error covari-
 8
ance separately is yet another way of describing ensemble data assimilation with forecast
 spread adjustment and background error covariance inflation.
 1.3 Organization
 My dissertation consists of two research projects, both of which I am submitting for
 publication. They are written as separate articles, so Chapter 2 and Chapter 3 are complete
 and discrete in and of themselves. This introduces some redundancy in my dissertation,
 especially between the introductions to Chapters 2 and 3. Chapter 2 describes mixed-
 resolution ensemble data assimilation. Chapter 3 describes ensemble data assimilation
 with an adjusted forecast spread.
 In Chapter 2, I start by introducing ensemble data assimilation and motivate mixed-
 resolution ensemble data assimilation. I describe the ETKF (Ensemble Transform Kalman
 Filter) algorithm (Bishop et al. 2001; Wang et al. 2004) and I describe how I adapted the
 ETKF algorithm to accommodate ensembles of two resolutions (mETKF). In the limit as
 two ensembles with the same number of members approach the same resolution I describe
 how the mETKF (on two ensembles with the same resolution) compares to the ETKF on
 the entire ensemble. I also describe the cost function that is minimized by my mETKF
 algorithm and how this could be extended to hybrid systems. I actually implemented a
 mixed resolution LETKF (Local ETKF, (Hunt et al. 2007)), in which I perform a mETKF
 9
analysis at each grid point, only allowing nearby observations to influence the analysis. I
 tested the mLETKF with OSSEs using Lorenz (2005) Models II and III as the low and high
 resolution models respectively. I compared the root mean square difference between the
 analysis mean and the truth for various mixed-resolution ensembles using the mLETKF
 and for various single-resolution ensembles using the LETKF respectively. I tested the
 mLETKF in perfect model scenarios and in scenarios with model error. My results show
 the trade-off between forecast computation time and time averaged root mean square anal-
 ysis error for my experiments with the mLETKF and the LETKF, and I discuss how using
 mixed resolution ensembles allows more accurate results for the same forecast computa-
 tion time.
 In Chapter 3, I start by introducing data assimilation, the ensemble Kalman filters
 (specifically ETKF and LETKF), and the extended Kalman filter. I describe ensemble data
 assimilation with forecast spread adjustment. I demonstrate how this reduces to standard
 data assimilation when the model is linear. I show that this is equivalent to observation
 error covariance inflation for an appropriately initialized ensemble and discuss why fore-
 cast spread adjustment is the more natural interpretation for my experiments. Forecast
 spread adjustment with multiplicative covariance inflating can also be described as multi-
 plicative inflation of the analysis error covariance followed by multiplicative deflation or
 inflation of the background error covariance. Furthermore in the limit of a large ensemble
 10
with a small forecast spread, ensemble data assimilation with forecast spread adjustment
 approaches the extended Kalman filter. I tested the LETKF with forecast spread adjust-
 ment via OSSEs on the Lorenz (2005) Model II for different amounts of model error,
 different ensemble sizes, and different amounts of observation error. I also tested how
 the ETKF with forecast spread adjustment compares to the XKF for large ensembles and
 small spread. I discuss the results of these experiments and the potential benefits of fore-
 cast spread adjustment.
 In Chapter 4, I discuss my conclusions and future directions for research.
 11
Chapter 2
 Mixed Resolution Ensemble Data Assimilation
 2.1 Introduction.
 Numerical weather prediction uses a numerical model of atmospheric physics to
 predict the future states of the atmosphere given an estimate for the current atmospheric
 state. Uncertainty about the current state, along with inaccuracies in the model dynamics,
 leads to (greater) uncertainty in the forecast. Quantifying this uncertainty is important for
 interpreting the forecast and for data assimilation. In data assimilation, information from
 a short term forecast is combined with information from recent observations, resulting in
 an estimate of the current atmospheric state used to initialize subsequent forecasts.
 One tool for assessing forecast uncertainty is an ?ensemble prediction system? (Leith
 1974) in which an ensemble of initial conditions are evolved by the model. Ideally, the
 ensemble of forecasts samples the future atmospheric state, thinking of it as a random vari-
 able. Weather services generally make a single forecast at a high resolution, determined
 by their computational and operational constraints, and forecast an ensemble at lower res-
 olution (Toth and Kalnay 1993; Molteni et al. 1996; Toth and Kalnay 1997; Buizza et al.
 12
2005). The resolution may change for different forecast lead times, e.g., forecast 7 days
 ahead, then forecast another 7 days at reduced resolution (Buizza et al. 2007). However,
 at a given lead time all of the ensemble members generally have the same resolution.
 In this chapter, we will consider potential advantages in describing forecast uncer-
 tainty of an ensemble with multiple members at each of two different resolutions. We will
 assess the advantage by proposing a mixed-resolution ensemble data assimilation system
 and comparing its results to the analogous single-resolution assimilation system.
 A common paradigm for data assimilation is the Kalman filter. The Kalman filter
 cycles between the analysis phase and the forecast phase. The ?analysis? generated by
 the Kalman filter is a function of the forecast estimate, called the ?background? state, the
 forecast uncertainty quantified as a background error covariance, new observations, and
 the observation uncertainty quantified as an observation error covariance. The resulting
 analysis state is an updated version of the background state, which better fits the observa-
 tions. Forecasting the analysis provides the background for the next cycle.
 Accurately estimating and evolving the background error covariance is a time con-
 suming process. There is growing interest in estimating the background error covari-
 ance using sample statistics on ensemble forecasts, either independently (as in ensemble
 Kalman filters), or in combination with variational methods (e.g., Lorenc 2003). In gen-
 13
eral, for a k member ensemble of background states
 Xb =
 [
 Xb1 Xb2 . . . Xbk
 ]
 (2.1)
 with mean x?b and perturbations
 Xb =
 [
 Xb1 ? x?b Xb2 ? x?b . . . Xbk ? x?b
 ]
 , (2.2)
 the background error covariance is estimated by
 P b = 1k ? 1X
 bXb>. (2.3)
 The number of members composing the ensemble and the resolution of the model states
 being evolved both contribute to the accuracy of the forecast uncertainty estimated with
 the ensemble and also to the computation time of the forecast.
 Most ensemble data assimilation literature, e.g. Houtekamer and Mitchell (1998),
 Anderson and Anderson (1999), Bishop et al. (2001), Whitaker and Hamill (2002), Ott et
 al. (2004), assumes that all ensemble members have the same resolution. While forecast-
 ing the background ensemble at a lower resolution substantially reduces the computational
 cost involved, it also decreases the accuracy of the analysis. We argue that an ensemble
 14
composed of forecasts evolved at two different resolutions may better characterize forecast
 uncertainty than a single resolution ensemble, within given computation constraints.
 Du (2004) proposed combining low resolution ensemble perturbations with a high
 resolution forecast as part of an ensemble forecast system. Gao and Xue (2008) propose
 data assimilation with a single high resolution forecast and a low resolution ensemble. In
 what they call dual resolution data assimilation, the low resolution ensemble is evolved
 and independently of the high resolution forecast, and the background error covariance
 estimated from the low resolution ensemble is also used to update a single high resolution
 forecast. Gao et al. (2010) extend this technique to use the hybrid method of Lorenc
 (2003). The ensemble perturbations are updated separately, but the ensemble analysis
 mean can be shifted toward or replaced by the high resolution analysis, as in Zhang and
 Zhang (2012).
 One of the disadvantages of pure ensemble methods is rank deficiency of the covari-
 ance, especially for small ensembles. Localization allows smaller ensembles as we discuss
 in Section 2.3.1, but the rank deficiency can still be a disadvantage when the ensemble size
 is too small. The background error covariance used in the ensemble based hybrid meth-
 ods is a weighted average of a full rank static covariance (such as that from a variational
 method) and a reduced rank ensemble estimated covariance (Hamill and Snyder 2000).
 However, the ensemble perturbations are updated separately with the reduced rank ensem-
 15
ble estimated error covariance and data assimilation with single resolution ensembles must
 still balance the available computational resources between the size and resolution of the
 ensemble.
 To achieve the accuracy of a high resolution ensemble for a lower computational
 cost, we have developed a modification of the Local Ensemble Transform Kalman Fil-
 ter (LETKF) (Hunt et al. 2007) for a mixed resolution ensemble composed of a small
 high resolution ensemble and a larger low resolution ensemble. Similarly to the hybrid
 method, e.g. Hamill and Snyder (2000), we form the background covariance as a linear
 combination of the sample covariances of the two ensembles. To limit the effects of inter-
 polation between the two model spaces, we use the combined background covariance in
 observation space as much as possible.
 In contrast to hybrid methods where the static covariance does not influence the
 analysis ensemble perturbations, in mixed resolution ensemble data assimilation both the
 high and low resolution analysis ensemble perturbations are influence by the combined
 background error covariance. Additionally, the mixed resolution combined background
 error covariance could be easily incorporated into hybrid methods e.g. Hamill and Sny-
 der (2000) including frameworks where the cost function is preconditioned e.g. Lorenc
 (2003). These cost functions are described in Section 2.2.4.
 We remark that our method allows the high resolution ensemble to have just one
 16
member, with the background covariance formed entirely from the low resolution ensem-
 ble, as proposed e.g. by Du (2004), Zupanski (2005), Gao and Xue (2008). When the
 high resolution ensemble has multiple members however, both ensembles contribute to
 the background covariance.
 We test this hypothesis by comparing analysis errors in a mixed resolution ensemble
 Kalman filter scheme. In particular, we test the mixed resolution LETKF (mLETKF) on
 two chaotic models designed by Lorenz (2005). The low resolution model (Model II) is
 similar to the Lorenz 96 model (Lorenz 1996; Lorenz and Emanuel 1998) but is smoother
 in that the values at adjacent grid points have a strong positive correlation. The high
 resolution model (Model III) includes short wave coupling in addition to the smoothing.
 In Section 2.2 we motivate and describe the algorithm for performing the mLETKF.
 For simplicity, we discuss the method mainly in the context of no localization, using the
 ETKF (Bishop et al. 2001; Wang et al. 2004) as formulated in Hunt et al. (2007). We give
 a side-by-side comparison of the two algorithms (ETKF and mETKF). In Section 2.3 we
 describe the full algorithms that we use in our experiments, complete with localization,
 interpolation, and reduced impact for distant observations. We describe the low and high
 resolution models (Lorenz (2005) Models II and III ) that we use in our experiments, and
 we describe our experiments. We test the mLETKF most extensively with a simple sce-
 nario of limited observations, testing the LETKF for high resolution ensembles of various
 17
sizes and testing the mLETKF for mixed resolution ensembles varying the size of both the
 high and low resolution components of the ensemble, including ensembles with only one
 high resolution member. We also investigate the potential advantages or disadvantages of
 the mLETKF for more accurate observations with full coverage and more frequent assim-
 ilation, sufficient to resolve the small scale variability, and in less ideal conditions where
 model error was introduced. We discuss the parameters, which we tuned for best accuracy,
 and we discuss balancing limited computation resources for best accuracy. We give our
 results at the end of Section 2.3. In Section 2.4 we conclude that mixed resolution ensem-
 ble analysis is beneficial in all of the scenarios tested, in that a greater accuracy can be
 attained for the same computation time.
 2.2 Methodology.
 Below we describe in detail how we perform a single analysis step with a mixed-
 resolution ensemble, in comparison with an ETKF analysis step. We also describe the
 mETKF in terms of minimizing a cost function and discuss how to incorporate the mETKF
 within hybrid systems.
 In Sections 2.2.1 and 2.2.2 we present the global Ensemble Transform Kalman Fil-
 ter (ETKF) algorithm and the global mixed-resolution ETKF algorithm (mETKF). How-
 ever, we used the Local Ensemble Transform Kalman Filter (LETKF) algorithm and the
 18
mLETKF algorithm in our experiments. In the LETKF (or mLETKF), a separate ETKF
 (or mETKF) analysis is done for each model grid point, using only observations from a
 region local to that grid point. Once the choice of observations is made, the LETKF analy-
 sis is equivalent to (though formulated differently than) the Ensemble Transform Kalman
 Filter (Bishop et al. 2001) with a centered spherical simplex ensemble (Wang et al. 2004).
 It is also equivalent to the local analysis of LEKF (Ott et al. 2004). We discuss localization
 and other minor but practical modifications to our algorithm in Section 2.3.1.
 2.2.1 ETKF
 The Ensemble Transform Kalman Filter (ETKF) algorithm (Bishop et al. 2001;
 Wang et al. 2004) is designed to generate an analysis ensemble of suboptimal state esti-
 mates commensurate with the analysis error covariance indicated by the Kalman filter and
 centered around the Kalman filter optimal analysis state, while shifting the background
 ensemble as little as possible to form the analysis ensemble (Ott et al. 2004).
 We represent the background ensemble with k members by the matrix Xb, with each
 column standing for a different ensemble member, Xbi (see equation (2.1)). The mean of
 the ensemble, x?b, is used as the background state. We constructed the perturbations via
 equation (2.2), writing each ensemble member as Xbi = x?b + Xbi . We denote by Xb the
 matrix of perturbations from the mean (see equation (2.2)). The observations and the ob-
 19
servation error covariance are yo and Ro respectively. Instead of using the background
 error covariance P b explicitly in the computations, we represent the implicit background
 error covariance by a square root matrix U b = (k ? 1)? 12 Xb, where P b = U bU b> (equa-
 tion (2.3)).
 The number of elements (rows) in yo is equal to the number of observations, No, and
 the number of rows in Xb is equal to Nm, the number of grid points in the model space.
 For an ensemble of k members, Xb will have k columns.
 Let h(x) be the forward operator that maps from model space into observation space.
 The matrixYb represents the background ensemble in observation space, i.e.,Ybi = h
 (
 Xbi
 )
 .
 The vector y?b and the matrix Y b represent the mean and perturbations of the background
 ensemble in observation space so that Ybi = y?b + Y bi . If h(x) is linear then it can be
 represented by a [No ? Nm] matrix H , h(x) = Hx. Three intermediary matrices, V b,
 C, and Ma, are created by the algorithm for efficiency during computation; V b has size
 [No ? k], C has size [k ? No], and Ma has size [k ? k]. Similar to U b, the matrix Ua is
 a square root of the analysis error covariance P a = UaUa> . The background ensemble
 Xb, the analysis ensemble Xa, and their respective implicit error covariances U b and Ua
 all have size [Nm ? k].
 We have split the ETKF algorithm into 3 major steps. In Step (1) we construct
 the inputs needed for the ETKF. In Step (2) we create intermediate matrices used in the
 20
analysis. In Step (3) we compute the analysis ensemble.
 1. (a) U b = 1?
 k ? 1
 Xb
 (b) V b = 1?
 k ? 1
 Y b
 2. (a) C = V b> (Ro)?1
 (b) Ma =
 (
 1
 ?
 I + CV b
 )
 ?1
 3. (a) x?a = x?b + U bMaC (yo ? h (x?b))
 (b) Ua = U b (Ma) 12
 (c) Xa = ?k ? 1Ua
 (d) Xai = x?a+Xai
 In Step (3c) the square root of Ma is defined as the unique symmetric positive definite
 matrix whose square is Ma. Using this square root of Ma gives an analysis ensemble with
 members that are the nearest to their background ensemble counterparts when compared
 to the analysis ensembles computed using any other square root (Ott et al. 2004).
 With ? = 1, the analysis mean x?a and covariance P a = UaUa> are consistent with
 the Kalman filter applied to x?b and P b. For a variety of reasons including model non-
 linearity and limited ensemble size, the background error covariance estimated from an
 ensemble forecast tends to underestimate the size of the error in the background mean
 (Houtekamer and Mitchell 1998; Anderson and Anderson 1999; Whitaker and Hamill
 21
2002). The inflation factor ? in Step (2b) effectively multiplies the background error co-
 variance matrix by ?. The value of ? is often determined by tuning to optimize performance
 in a given scenario (Anderson and Anderson 1999; Whitaker and Hamill 2002). One can
 also adjust ? adaptively (Anderson 2007; Li et al. 2009; Miyoshi 2011). Adaptive inflation
 allows ? to depend on location, which can be important when the model and observations
 are spatially heterogeneous.
 2.2.2 mETKF
 High resolution models are more accurate than low resolution models yet take more
 time to run. Among ensembles of the same resolution, ensembles with more members
 provide a better estimate for the background mean and covariance. However, increasing
 either the size or the resolution of the ensemble also increases the computational cost.
 Thus, within given computational constraints, a single-resolution ensemble compromises
 between the size of the ensemble and the resolution of the model. Combining the infor-
 mation in a large, low resolution ensemble with a small, high resolution ensemble could
 produce a better analysis than either resolution produces by itself. Described below is a
 method for mixed resolution ensemble analysis as it applies to the Ensemble Transform
 Kalman Filter (we abbreviate mixed ETKF by mETKF).
 The high resolution background ensembleXbh has size Nh?kh (grid size ? ensemble
 22
size). In the same fashion the low resolution background ensemble Xb` has size N` ? k`.
 The high and low resolution background means are x?bh and x?b` respectively. In the mixed
 algorithm, Xbh (Xb` ) stands for the high (low) resolution background perturbations from
 the high (low) resolution ensemble mean:
 [
 Xbh
 ]
 i = x?
 b
 h +
 [
 Xbh
 ]
 i (2.4)
 [
 Xb`
 ]
 i = x?
 b
 ` +
 [
 Xb`
 ]
 i . (2.5)
 In observation space, members of the high and low background ensembles are denoted
 by
 [
 Yb(h/`)
 ]
 i
 = hh/`
 ([
 Xbh/`
 ]
 i
 )
 = y?b(h/`) +
 [
 Y b(h/`)
 ]
 i
 . The operators hh and h` transform
 high and low resolution model states to observation space. We use parentheses around the
 subscripts h and ` to indicate objects that are related to one of the ensembles but do not lie
 in the corresponding model state space.
 In various equations below, two matrices combine to form a larger augmented ma-
 trix. For matrices A and B with the same number of rows, the augmented matrix
 [
 A B
 ]
 =
 [
 A1 A2 . . . AkA B1 B2 . . . BkB
 ]
 . (2.6)
 is the matrix with leftmost columns encompassing A and rightmost columns encompassing
 B.
 23
When doing analysis on a mixed ensemble, we occasionally need to compare vectors
 and matrices of different resolutions. To create a joint (augmented) ensemble of the back-
 ground perturbations, we interpolate the low resolution background perturbations onto the
 high resolution grid by means of the cubic spline with circular boundary conditions. The
 function spline () denotes an operator interpolating a low resolution field onto the high
 resolution grid. Similarly, high resolution fields are projected onto the low resolution grid
 via the operator proj (). We remark that if the high resolution grid points are a subset of the
 low resolution grid points, then proj () is truly a projection operator; otherwise it involves
 some interpolation as well.
 In the mETKF, as in the ETKF, the background error covariance is never explic-
 itly computed. Instead we estimate U b, the implicit background error covariance (P b =
 U bU b>), and V b, the implicit background error covariance as expressed in observation
 space (Rb = V bV b>), from a weighted combination of the high and low resolution pertur-
 bations:
 U b =
 [ ?
 ?
 k`?1spline
 (
 Xb`
 ) ? 1??
 kh?1X
 b
 h
 ]
 , (2.7)
 V b =
 [ ?
 ?
 k`?1Y
 b(`)
 ?
 1??
 kh?1Y
 b(h)
 ]
 . (2.8)
 Equivalently, Rb, can be written as a weighted average of the high and low resolution
 24
background error covariances
 Rb(h/`) =
 1
 kh/` ? 1
 Y b(h/`)Y b(h/`)
 > (2.9)
 Rb = ?Rb(`) + (1 ? ?)Rb(h) = V bV b> . (2.10)
 The scalar ? determines the relative weight of each ensemble. When ? = 1, only the
 low resolution perturbations are used to determine the analysis update, i.e. Rb = Rb(`).
 Setting ? = 1 is necessary when the high resolution ensemble contains only a single
 member. Even though the high resolution perturbations are ignored, the high resolution
 single member ensemble will still be updated in the analysis. Similarly when ? = 0, the
 low resolution perturbations are ignored and Rb = Rb(h).
 The mETKF algorithm is shown below with steps numbered in a corresponding
 manner to the steps of the ETKF algorithm in Section 2.2.1. The input parameters for the
 mETKF analysis step are: the background ensembles (Xbh andXb`), the weight parameter ?,
 the inflation factor ?, and the observations yo with their error covariance Ro. We construct
 the augmented implicit background error covariance matrices U b and V b in Step (1). Step
 (2) is useful for efficient computation. In Step (3) we perform the analysis. We remark that
 the analysis mean for the low (high) resolution ensemble is updated from the background
 mean for the low (high) resolution ensemble.
 25
1. (a) U b =
 [ ?
 ?
 k`?1spline
 (
 Xb`
 ) ? 1??
 kh?1X
 b
 h
 ]
 (b) V b =
 [ ?
 ?
 k`?1Y
 b(`)
 ?
 1??
 kh?1Y
 b(h)
 ]
 2. (a) C = V b> (Ro)?1
 (b) Ma =
 (
 1
 ?
 I + CV b
 )
 ?1
 3. (a) i. x?ah = x?bh + U bMaC
 (
 yo?hh
 (
 x?bh
 ))
 ii. x?a` = x?b` + proj
 (
 U b
 )
 MaC
 (
 yo?h`
 (
 x?b`
 ))
 (b) Ua = U b (Ma) 12
 (c) split Ua into components corresponding to the low and high resolution ensem-
 bles: Ua =
 [
 Ua(`) Ua(h)
 ]
 i. Xah =
 ?
 kh?1
 1?? Ua(h)
 ii. Xa` =
 ?
 k`?1
 ?
 proj
 (
 Ua(`)
 )
 (d) i. [Xah]i = x?ah + [Xah ]i
 ii. [Xa` ]i = x?a` + [Xa` ]i
 Step (3d) is done for each ensemble member, i.e. 1 ? i ? kh/`.
 We remark that because proj () inverts spline (), the Xa` computed by (3c) is the
 same as if we had formed U b in low-resolution model space (projecting Xbh), performed
 Step (3b) on the low-resolution ensemble, and selected the columns of the low-resolution
 Ua that correspond to the low-resolution ensemble.
 26
When the high resolution ensemble only has one member, the perturbations from the
 mean are zero. Since the background error covariance of the high resolution part cannot be
 computed, no weight should be given to the high resolution component of the covariance.
 This can be done by setting ? = 1, using 0 instead of
 ?
 1??
 kh?1 in Step (1), and setting
 Xah =
 ??0 in Step (3c). Both the low resolution ensemble and the high resolution member
 will be updated from the information in the low resolution perturbations.
 As the low resolution analysis mean ignores the high resolution background mean,
 when ? = 1 the low resolution analysis ensemble from the mETKF is identical to that pro-
 duced from the ETKF on the low resolution ensemble. Thus a code for mixed resolution
 ensemble data assimilation can also accommodate single resolution ensembles by setting
 ? = 1 and Xbh =
 ??0 (or Xbh = ?) for a low resolution ETKF or by setting ? = 0 and
 Xb` =
 ??0 for a high resolution ETKF.
 With some models it might be desirable to compute a single analysis mean for both
 the high and low resolution ensembles, replacing Step 3(a)ii with x?a` = proj(x?ah) and re-
 placing x?ah in Step 3(a)i with x?a representing a state estimate better than x?ah if such exists.
 However, our results were marginally worse when we replaced x?a` with proj(x?ah). Du
 (2004) explores the forecasting advantages this provides when the high resolution ensem-
 ble has only one member.
 27
2.2.3 A side by side comparison
 As a thought experiment, consider splitting a 2k-member ensemble into two equal
 parts, yielding two k-member ensembles of the same resolution. In this section we will
 compare the ETKF on the 2k-member ensemble to the mETKF on the two k-member en-
 sembles. Since both k-member ensembles have the same size and resolution, we weigh
 them equally, using ? = 12 . For each step, the ETKF step is given on the left and the
 comparable mETKF step is given on the right. We label one k-member ensemble with
 subscript ` and the other with subscript h, even though in the present scenario both ensem-
 bles are in the same model space, and there is no need for interpolation or projection.
 1. (a) U b = 1?2k?1Xb [
 1?
 2k?2X
 b
 `
 1?
 2k?2X
 b
 h
 ]
 = U b
 (b) V b = 1?2k?1Y b [
 1?
 2k?2Y
 b(`)
 1?
 2k?2Y
 b(h)
 ]
 = V b
 2. (a) C = V b>Ro?1
 V b>Ro?1 = C
 (b) Ma =
 (
 1
 ?
 I + CV b
 )
 ?1
 (
 1
 ?
 I + CV b
 )
 ?1
 = Ma
 3. (a) x?a = x?b + U bMaC (yo ? h(x?b))
 28
x?bh/` + U bMaC
 (
 yo ? h
 (
 x?bh/`
 ))
 = x?ah/`
 (b) Ua = U b (Ma) 12
 U b (Ma) 12 = Ua =
 [
 Ua(`) Ua(h)
 ]
 (c) Xa = ?2k ? 1Ua
 ?
 2k ? 2Ua(h/`) = Xah/`
 (d) Xai = x?a + Xai
 x?ah/` +
 [
 Xah/`
 ]
 i
 =
 [
 Xah/`
 ]
 i
 The differences are as follows. First, in the mETKF separate means x?h and x?` are com-
 puted for the two ensembles; these means are used explicitly in Steps (3a) and (3d), and
 also in forming the ensemble perturbations in Step (1). Second, the rank of the mETKF
 combined background error covariance is 2k ? 2, while the rank of the corresponding
 ETKF background error covariance is slightly larger (2k ? 1). Thus, for large k the
 mETKF and the ETKF should produce similar results, but for small k, the mETKF will
 be more influenced by sampling error. We note that the ensemble Kalman filter proposed
 by Houtekamer and Mitchell (1998) also uses two equal-size ensembles and updates their
 means separately, but compensates for sampling error by updating the mean from one en-
 semble using only the background covariance computed by the other ensemble. While
 the combined background covariance of mETKF is probably detrimental in this thought-
 experiment scenario, it is an essential feature of mETKF in the mixed-resolution scenario.
 29
2.2.4 Cost Function
 The ETKF and the mETKF can also be considered in terms of minimizing a cost
 function. We also show the cost function for a mixed ensemble based hybrid scheme.
 2.2.4.1 The ETKF Cost Function
 For the ETKF this cost function is
 J?(w) = ??1w>w + (yo ? y?b ? V bw)>Ro?1 (yo ? y?b ? V bw) , (2.11)
 where y?b = h
 (
 x?b
 )
 and ? is the background error covariance inflation. If h is linear then
 h
 (
 x?b + U bw
 )
 = y?b + V bw, and it follows that if wa ? Rk minimizes J? then x?a =
 x?b + U bwa minimizes the alternative cost function
 J(x) = (x? x?b)> (?P b)?1 (x? x?b)+ (yo ? h(x))> Ro?1 (yo ? h(x)) (2.12)
 in the space spanned by the ensemble (Hunt et al. 2007).
 30
2.2.4.2 The mETKF Cost Function
 A similar cost function applies to each resolution in the mETKF:
 J?h/`(w) = ??1w>w +
 (
 yo ? y?b(h/`) ? V bw
 )> Ro?1 (yo ? y?b(h/`) ? V bw) , (2.13)
 where y?b(h/`) = hh/`
 (
 x?bh/`
 )
 . Similarly, when hh/` is linear, then x?ah = x?bh + U bwa and
 x?a` = x?
 b
 ` + proj
 (
 U b
 )
 wa minimize the alternative cost function
 Jh/`(x) = (x? x?bh/`)> (?P b)?1 (x? x?bh/`)+ (yo ? hh/`(x))> Ro?1 (yo ? hh/`(x))
 (2.14)
 in the space spanned by the ensemble, provided wa ? Rkh+` minimizes J?h/`.
 The minimizer wa for J?h is
 wa =
 (
 1
 ?I + V
 b>Ro?1V b
 )
 ?1
 V b>Ro?1
 (
 yo ? hh
 (
 x?bh
 ))
 , (2.15)
 and the equation for the high resolution analysis mean from Step 3(a)i of Section 2.2.2
 gives
 x?ah = x?
 b
 h + U bwa, (2.16)
 For linear hh, x?ah (equation (2.16) and Step 3(a)i) minimizes the cost function Jh (equa-
 31
tion 2.14) in the space spanned by the ensemble. A similar derivation applies to the low
 resolution analysis x?a` .
 In summary, the mETKF analysis minimizes separate cost functions for the high
 and low resolution ensembles. The two ensembles are coupled by the combined pertur-
 bation matrix V b in both cost functions (equation 2.13), or equivalently by the combined
 background covariance matrix P b (equation 2.14).
 2.2.4.3 The Hybrid mETKF Cost Function
 As we have not explored mixed resolution ensemble based hybrid methods, this
 section describes a generalized theoretical cost function. We consider a hybrid method
 where a control forecast xbh is performed at high resolution and ensembles are forecast at
 mid resolution Xbm and low resolution Xb`. We assume that the control forecast is the most
 accurate and hence the best prior estimate.
 The hybrid mETKF cost function similar to that given in Lorenc (2003) and Gao et
 al. (2010) is given by
 J?hybrid(v, w) = v>v + ??1w>w
 +
 (
 yo?y?b??1h
 ((
 P b
 static
 ) 1
 2 v
 )
 ??2V bw
 )>
 Ro?1
 (
 yo?y?b??1h
 ((
 P b
 static
 ) 1
 2 v
 )
 ??2V bw
 )
 ,
 (2.17)
 where ybh = h
 (
 xbh
 )
 and ?21 + ?22 = 1.
 32
As in the previous sections (2.2.4.1 and 2.2.4.2), when h is linear, then xah = xbh +
 ?1
 (
 P b
 static
 ) 1
 2 va + ?2U bwa minimizes the alternative cost function similar to that given in
 Hamill and Snyder (2000):
 J(x) = (x? x?b)> (?21P bstatic + ?22?U bU b>
 )
 ?1 (
 x? x?b
 )
 + (yo ? h(x))>Ro?1 (yo ? h(x)) .
 (2.18)
 Further details on the equivalence between Hamill and Snyder (2000) and Lorenc (2003)
 are provided in Wang et al. (2007).
 2.3 Numerical Experiments and Results.
 To test the mixed resolution LETKF, we used two chaotic models designed by
 Lorenz (2005) to generate the high and low resolution ensembles. Both models are ex-
 tensions of the Lorenz 96 model (Lorenz 1996; Lorenz and Emanuel 1998). The latter
 model is called Model I in Lorenz (2005), and is smoothed to produce Model II, which
 we use as our low resolution model. The high resolution Model III introduces short wave
 coupling in addition to the smoothing.
 We tested the mixed resolution LETKF and the standard LETKF in several cases
 with Observing System Simulation Experiments (OSSEs). In these types of experiments,
 a computer is not only used to generate the forecasts, it is also used to generate simulated
 33
observations. It creates a trajectory that we call the ?truth? and from this it generates obser-
 vations by adding random error to the truth at selected points. In most of our experiments,
 we use a ?limited observation? network consisting of one simulated observation every 24
 Model III grid points. We also consider a ?small-scale resolving? scenario in which we
 have observations at every grid point.
 2.3.1 Modifications to the Algorithms.
 In the LETKF, the ETKF analysis is performed for each grid point, using only the
 observations in a neighborhood surrounding the grid point. This eliminates long-range
 background correlations that may be spurious due to limited ensemble size. In addition,
 global errors may grow in many directions in model space, but fewer directions are needed
 to describe local errors. The result of local analysis is that one member of the ensem-
 ble may be heavily weighted near one grid point, yet lightly weighted near another, so a
 smaller ensemble can be used to cover all of the directions of error growth. See Greybush
 et al. (2011) for further discussion of localization. For a mixed ensemble, the mLETKF
 localization is done in the same fashion as the LETKF, i.e. the mETKF is done on each
 grid point, using only the observations in a neighborhood surrounding the grid point. In
 all cases we use a localization radius of 32 Model III grid points, i.e. 2 or 3 observations
 per local region in our limited observation network.
 34
When observations are sparse, edge effects can cause adjacent grid points to have
 significantly different analyses. For instance, consider the extreme case of a grid point
 whose local region contains no observations, which is adjacent to a grid point with one
 observation on the very edge of its local region. The analysis at the first grid point would
 be the background at that grid point, since no observations means no update. However,
 the second grid point would be updated to more closely align with the observation quite
 a distance from it. To reduce this unrealistic choppiness, in our limited observation sce-
 narios we tapered the influence of distant observations by increasing the observation error
 covariance associated with observations near the edge of a local region (Greybush et al.
 2011). Hunt et al. (2007) suggest multiplying the elements of the inverse observation error
 covariance matrix (Rolocal)?1 by a weight function, which is 1 for observations in the center
 of the local region and dwindles to 0 for observations beyond the edge of the local region.
 Our tapering weight function tapered the elements of (Rolocal)?1 in a trapezoid-like fash-
 ion. For observations on the very edge of a local region, we increased the observation error
 covariance by 4 times. For observations in the exact center of the tapered region, we dou-
 bled the observation error covariance. For observations located elsewhere in the tapered
 region, we multiplied the observation error covariance by an interpolated power of 2. For
 observations in the central half of the local region, we did not change the observation error
 covariance.
 35
With the LETKF, the analyses at different grid points can be done in parallel. If
 grid points are closely spaced, it can be advantageous to calculate the ?weights? wa =
 MaC
 (
 yo ? h
 (
 x?b
 )) (see equation (2.15)) and W a = (Ma) 12 only on a subset of the grid
 points and interpolate these onto rest of the grid points (Yang et al. 2009). These weights
 multiply the background ensemble perturbation matrix U b in Steps (3a) and (3b) of our
 ETKF formulation given in Section 2.2.1. The analysis ensemble at a particular grid point
 j is computed from the interpolated wa(j) and W a(j) by finding the analysis mean and co-
 variance
 x?a[j] = x?b[j] + U b[j]wa(j) (2.19)
 Ua[j] = U b[j]W a(j), (2.20)
 and computing Xa[j] from Steps (3c) and (3d). Here the subscript [j] denotes the rows
 corresponding to grid point j. We performed the analysis at every fourth Model III grid
 point (i.e. the Model II grid) and used weight interpolation to determine the analysis at the
 other Model III grid points.
 2.3.2 Models.
 Following Lorenz (2005), we use a circular grid of 240 points for Model II, and
 for Model III we use 960 points that refine the Model II grid by a factor of 4. To avoid
 36
confusion, we introduce the unit distance defined as the distance between one Model III
 grid point and the next. Thus the circle is 960 units in circumference and Model II grid
 points are 4 units apart. A snapshot of both models, overlaid for comparison, is given in
 Figure 2.1.
 0 100 200 300 400 500 600 700 800 900
 ?10
 ?5
 0
 5
 10
 15
 20
 A Sample Model II and Model III State
 Units Along a Circle (Model III Grid Point Number)
 Value Predicted by the Mode
 l
  
 
Model II
 Model III
 Figure 2.1: Sample Model II and Model III states shown simultaneously. Model III has
 short wave coupling at high resolution, while Model II, with a lower resolution, is smooth.
 Both models use an averaging function on nearby grid points, where the smoothing
 parameter K determines the distance over which the average is taken. When K is odd, the
 averaging function ??K;n refers to the arithmetic mean over the K grid points nearest (and
 including) the grid point indexed by the number n, i.e. grid points n ? K?12 to n + K?12 .
 When K is even, ??K;n is the average over the region of length K grid points, centered at
 grid point number n, with boundary points weighted half as much as interior points, i.e.
 considering the nearest K + 1 grid points, the K ? 1 interior grid points are weighted by
 37
1
 K and the two most distant grid points (n? K2 and n + K2 ) are weighted by 12K .
 The averaging function for an even valued smoothing parameter is related to the
 ??
 notation of Lorenz (2005) according to the the equation
 ?x?K,n = 1K
 K/2?
 j=?K/2
 ?
 xn+j , (2.21)
 where
 J?
 j=?J
 ?
 xn+j =
 (xn?J + xn+J)
 2
 +
 J?1?
 j=?J+1
 xn+j . (2.22)
 We use the
 ??
 notation when describing Model II and Model III and also when defining
 the long wave component of Model III.
 Model II is given by
 dxn
 dt =
 1
 K
 K/2?
 j=?K/2
 ?
 xn+K+j ? ?x?K;n?K+j
 ??x?K;n?2K ?x?K;n?K ? xn + F (2.23)
 when K is even; when K is odd,
 ?(K?1)/2
 j=?(K?1)/2 replaces
 ?? j=K/2
 j=?K/2. F represents the forcing
 constant in this differential equation. When K = 1, no averaging takes place and Model
 II reduces to the Lorenz 96 model (Lorenz 1996; Lorenz and Emanuel 1998). Following
 suggestions in Lorenz (2005), all of our experiments use a smoothing parameter of K = 8
 38
(corresponding to a smoothing over 32 Model III units) and most use a forcing constant of
 F = 15. Some experiments use different values of F to simulate model error.
 For our small-scale resolving scenario, we performed data assimilation every 0.005
 time units and in all other cases we performed data assimilation every 0.05 time units. We
 integrated Model II via the fourth order Runge-Kutta scheme with a time step size of 0.005
 or 0.05/2 respectively.
 In our experiments the truth and the high resolution ensemble were both simulated
 with the high resolution model, Model III, given by the equation
 dzn
 dt =
 1
 K
 K/2?
 j=?K/2
 ?
 xn+K+j ? ?x?K;n?K+j
 ??x?K;n?2K ?x?K;n?K ? xn + F
 +b2 (yn+1 ? yn?2) yn?1 ? byn
 +c (xn+1yn?1 ? yn?2xn?1) (2.24)
 where
 ?(K?1)/2
 j=?(K?1)/2 replaces
 ?? j=K/2
 j=?K/2 when K is odd and x and y are defined as follows
 in terms of z.
 xn =
 I?
 i=?I
 ?
 (
 3I2 + 3
 2I3 + 4I ?
 2I2 + 1
 I4 + 2I2 |i|
 )
 zn+i (2.25)
 yn = zn ? xn
 39
The coefficients b2 and c are parameters of the model that drive the coupling. The number
 I is chosen so that x will not include short waves; the coefficients in the equation defining
 xn filter out non-quadratic fluctuations between n?I and n+I (Lorenz 2005). The vectors
 x and y in equation (2.24) then correspond to the long and short wave components of z
 respectively.
 For Model III, we use a smoothing parameter of K = 32 (smoothing over the same
 length scale as Model II) and a forcing constant of F = 15. Our other parameters were
 I = 12, b = 10, and c = 2.5. This gives short waves that vary about 10 times faster than
 the long waves with about 1/10 the amplitude (Lorenz 2005). Integration was done via
 the fourth order Runge-Kutta scheme with a time step size of 0.05/24 or 0.005/3. While
 the Model III can run without analysis with a larger time step size of 0.05/12 as in Lorenz
 (2005), we found that data assimilation with Model III requires a smaller time step due to
 the instability of evolving with short waves of artificially large amplitude. All of our other
 parameters are consistent with those recommended by Lorenz (2005).
 2.3.3 Experiments.
 As mentioned above, our experiments took place on a gridded circle of size 960
 units, where adjacent Model III grid points are 1 unit apart and adjacent Model II grid
 points are 4 units apart. We tested the mLETKF with a limited observation network of 40
 40
simulated observations spaced evenly every 24 units around the entire circle, and with a
 dense observation network of 960 simulated observations, one at each Model III grid point.
 We simulated observations by adding uncorrelated Gaussian noise to a model run that we
 regard as the ?truth?. In all scenarios below we used the same truth run generated from
 Model III with forcing constant F = 15. In our small-scale resolving scenario we used
 the dense observation network, with observations and analyses every 0.005 time units,
 and observation errors with standard deviation 0.3, which is about the size of the small
 scale variability (Yoon and Ott 2010). In all other cases, we used the limited observation
 network, with observations and analyses every 0.05 time units, and observation errors
 with standard deviation 2. We remark that observing and performing data assimilation
 every 0.05 time units was not sufficient to resolve the small-scale variables yn, which vary
 10 times faster than the large-scale variables xn; this is consistent with the findings of
 Ballabrera-Poy et al. (2009) for a similar model.
 Our basic limited observations scenario and our small-scale resolving scenario have
 a ?perfect? high resolution model (F = 15 for both truth and data assimilation). To
 consider scenarios where both models are imperfect, we used a different forcing constant
 for the ensemble forecasts than for the truth run. We tested two model error scenarios. In
 one scenario we used a forcing constant of F = 14 for both the high and low resolution
 forecasts. In the other scenario we used a forcing constant of F = 12.5. Introducing model
 41
error detracts from the high resolution model?s capability to resolve the small scales, so
 we only tested our small-scale resolving scenario within the perfect model framework.
 We tested our three limited observation scenarios (basic, imperfect model F = 14,
 and imperfect model F = 12.5) with the mLETKF on three different sizes of mixed resolu-
 tion ensembles and with the LETKF on three different sizes of high resolution ensembles.
 In the majority of our mixed resolution experiments, the low resolution component of the
 ensemble contained k` = 30 members and only the size of the high resolution component
 was varied with kh = 1, 2, and 3 members. The high resolution ensembles tested con-
 tained k = 3, 4, and 5 members. In addition, we tested our basic scenario on numerous
 other ensemble size combinations. Similarly, we tested our small-scale resolving scenario
 with the mLETKF on three different sizes of mixed resolution ensembles and with the
 LETKF on four different sizes of high resolution ensembles (15, 16, 17 or 18 members).
 In each of the mixed cases, the low resolution component of the ensemble contained 30
 members and the high resolution component contained 9, 12, or 15 members. We present
 these results in Section 2.3.6.
 We measure the accuracy of each experiment as the root mean square difference
 between the analysis and the truth at the Model III grid points, averaged over 4500 as-
 similation steps. We ran each experiment for a total of 5000 assimilation steps, but we
 ignored the first 500 assimilation steps, giving the ensembles time to spin-up. Specifically,
 42
the accuracy of each experiment is given by
 RMSE = 14500
 5000?
 i=501
 rms
 (
 xa(ti) ? xt(ti)) , (2.26)
 where rms is the root-mean-square function over the 960 grid points, xt(ti) refers to the
 true model state at the ith assimilation step, and xa(ti) refers to the analysis state (best
 estimate) at the ith assimilation step in Model III space, i.e. xa(ti) = x?ah(ti) for mixed
 resolution experiments, xa(ti) = x?a(ti) for high resolution experiments, and xa(ti) =
 spline (x?a(ti)) for low resolution experiments.
 2.3.4 Parameters Tuned.
 For each result below we tuned the covariance inflation, ?, in order to minimize
 the analysis RMSE. While tuning ? substantially improves the accuracy, we found that
 the results were generally less sensitive to changes in ?. Recall that ? determines the
 weight given to the high resolution ensemble versus the low resolution ensemble when
 determining the implicit background error covariance.
 43
0 0.2 0.4 0.6 0.8 1
 0.75
 0.8
 0.85
 0.9
 0.95
 Alpha
 RMS Erro
 r
  
 
2 & 30
 3 & 30
 Figure 2.2: Dependence of the analysis RMSE on ? for the 2 & 30 (circle) and 3 & 30
 (diamond) mixed cases in our basic scenario. The results are averaged over 4500 assimi-
 lation steps after a spin up of 500 assimilation steps. When ? = 1 only the low resolution
 perturbations are used to determine the analysis update. The flatness of the graph indicates
 that ? does not need extensive tuning. Notice that ? = .2 is close to optimal in both cases
 shown, and is well separated from the large errors we observe for ? < 0.05.
 The representative example shown in Figure 2.2 indicates that the RMSE does not
 change substantially for any value of ? between ? = 0.1 and ? = 0.9. When ? = 1, only
 the low resolution perturbations are used to determine the analysis update. Similarly when
 ? = 0, only the high resolution perturbations contribute to the analysis update. Therefore,
 in the mixed case with only one high resolution member, we necessarily use ? = 1. We
 tuned ? for the four mixed cases of 2 or 3 high resolution and 30 low resolution ensemble
 members both with our basic scenario and with model error F = 12.5. Since ? = 0.2 was
 close to ideal for each of these cases, in all of our basic and model error results below, we
 use ? = 0.2. In our small-scale resolving experiments ? tuned to 0.8.
 44
Though we discuss the joint covariance as a weighted average of the high and low
 resolution covariances, the flatness of Figure 2.2 for intermediate values of ? and the larger
 tuned value of ? for our small-scale experiments indicates that this interpretation may be
 too simplistic. Changing the value of ? also changes, after many assimilation cycles, the
 relative variances of the high and low resolution ensembles. Larger values of ? result in
 smaller low resolution perturbations and larger high resolution perturbations. For example,
 in our basic scenario with 3 high resolution members and 30 low resolution members,
 increasing ? from 13 to
 2
 3 roughly halves the variance of the low resolution ensemble and
 increases the variance of the high resolution ensemble by 30%:
 P b` (? = 23) ? 12P b` (? = 13) (2.27)
 P bh(? = 23) ? 1.3 ? P bh(? = 13). (2.28)
 As a result, the low resolution part of the joint covariance P b = ?P b` + (1 ? ?)P bh stays
 about the same as ? changes from 13 to
 2
 3 , while the high resolution part decreases (but by a
 smaller factor than 1? ? decreases). We have not fully explored the relationship between
 ? and ensemble spread.
 In each limited observation experiment, we tried values of ? differing in the hun-
 dredths place and our results below use the value of ? that gave the most accurate RMSE.
 45
Values of ? were between 1.04 and 1.13 where the higher values were optimal when the
 forecast model was less accurate or when the ensemble size was small. In mixed reso-
 lution ensembles, ? tuned to slightly lower values than in single resolution ensembles of
 comparative computational cost. In our small-scale resolving experiments, ? tuned to a
 much larger values, even as large as 1.45, so we tuned ? along a coarser mesh of size 0.05.
 2.3.5 Balancing Limited Computational Resources.
 Computation speed limits the accuracy obtainable in practice. The time taken by the
 forecast is directly proportional to the size of the ensemble, provided that the same model
 is being used to forecast the entire ensemble. More generally, the time taken to evolve the
 ensemble forecast equals the sum of the times taken to evolve each member. Though the
 forecast can be done much more quickly in parallel, computational resources still limit the
 maximum ensemble size.
 The forecast computation times shown follow a linear proportionality with the en-
 semble size. In our experiments, forecasting each high resolution member took about 100
 times longer than forecasting each low resolution member. Therefore, when forecasting
 a mixed ensemble, a relatively large low resolution ensemble can be used without much
 impact on the total forecast computation time.
 However, the analysis time hinges more upon the size of the total ensemble, which in
 46
our experiments is approximately the size of the large, low resolution ensemble. Generally
 speaking, the forecast time limits the size of the high resolution ensemble we can use,
 and the analysis time inhibits how large of a low resolution ensemble we can use. The
 relationship between analysis time and forecast time is situation dependent however, so we
 only give results in terms of forecast time, which is relevant to both ensemble forecasting
 to assess forecast uncertainty as well data assimilation.
 In Table 2.1 we compare results for our basic scenario from mixed resolution ensem-
 bles with high resolution components of 1, 2, and 3 members coupled with low resolution
 components of 5, 10, 20, 30, and 60 members. We found that accuracy does not diminish
 extensively even for as few as 10 low resolution members. In later results, we chose to use
 30 low resolution members as it seemed a good balance between accuracy and time for
 our limited observation scenarios. In our small-scale resolving scenario, we also chose to
 use 30 low resolution members.
 kh = 1 kh = 2 kh = 3
 k` = 5 0.98 0.85 0.80
 k` = 10 0.92 0.83 0.78
 k` = 20 0.91 0.80 0.76
 k` = 30 0.90 0.79 0.76
 k` = 60 0.87 0.79 0.76
 Table 2.1: Varying the size of the low & high resolution ensembles. This table gives the
 analysis RMSEs for our basic scenario with various mixed ensembles. A low resolution
 component with 20 or 30 members does nearly as well as with 60.
 47
2.3.6 Results.
 In order to estimate the ideal accuracy obtainable using the LETKF with the high
 resolution model, we tested our basic scenario and our small-scale resolving scenario on
 increasingly greater ensemble sizes. We summarize our scenarios in Table 2.2. Similarly
 Table 2.2: Summary of scenarios. The limited observation network consists of 40 obser-
 vations and the dense observation network has full coverage (960 observations). The true
 forcing constant is F t = 15; forcing constants different from F = 15 simulate model
 error. The observation error refers to the standard deviation of the error in the simulated
 observations.
 Scenario observation network Forcing constant observation error
 basic limited F = 15 ? = 2
 F = 14 limited F = 14 ? = 2
 F = 12.5 limited F = 12.5 ? = 2
 small-scale dense F = 15 ? = 0.3
 for the low resolution model, we tested our basic scenario for various ensemble sizes.
 These results are graphed in Figure 2.3. The lowest obtainable RMSE in our small-scale
 resolving scenario for assimilation with a low resolution is 0.17 and only when results are
 smaller than that do we say that they resolve the small scale variability.
 48
0 10 20 30 40 50 60 70
 0
 0.5
 1
 1.5
 2
 2.5
 Lo Res Ensemble Size, Basic
 RMS
 E
 0 5 10 15 20 25 30
 0
 0.25
 0.5
 0.75
 1
 Hi Res Ensemble Size, Basic
 RMS
 E
 0 5 10 15 20 25 30
 0
 0.05
 0.1
 0.15
 0.2
 Hi Res Ensemble Size, Small Scale
 RMS
 E
 Figure 2.3: Top: the RMSE of the analysis generated in our basic scenario with a low-
 resolution model, averaged over 4500 assimilation steps after a 500 time step spin-up
 period. Middle: the RMSE of the analysis generated in our basic scenario with a high-
 resolution model, averaged over 4500 assimilation steps after a 500 assimilation step spin-
 up period. Bottom: the RMSE of the analysis generated in our small-scale resolving sce-
 nario with a high-resolution model, averaged over 4500 assimilation steps after a 500 as-
 similation step spin-up period. The dashed line indicates the small scale variability, i.e.
 the minimum RMSE with a low resolution model for our small-scale resolving scenario.
 Our remaining numerical results represent the accuracy of the mLETKF in each of
 our scenarios in terms of the RMSE (equation (2.26)). We compared these mixed res-
 olution results to the accuracy of the LETKF with various ensembles sizes at both low
 49
and high resolution. At each assimilation time, we compared our high resolution fore-
 cast mean with the truth and found the root mean square error over the grid (see equation
 (2.26)). Starting from a random ensemble of model states, we ran each experiment for
 5000 assimilation steps. Figure 2.4 shows a plot of the root mean square error for the first
 1000 assimilation steps, using the 2 high resolution and 30 low resolution mixed ensemble
 in our basic scenario. The RMSE behaves similarly continuing out to 105 assimilation
 steps (the maximum we tested), and in other cases and scenarios. Though spin-up appears
 to occur quickly, we chose to be cautious, using a longer spin-up of 500 assimilation steps.
 0 100 200 300 400 500 600 700 800 900 1000
 0
 0.5
 1
 1.5
 2
 2.5
 Time
 RMS
 E
  
 
mean RMSE
 RMSE
 Figure 2.4: The root mean square error at each assimilation step for the initial 1000 assim-
 ilation steps. This data is taken from the run of the 2 high resolution and 30 low resolution
 basic scenario.
 Excluding this spin-up period, we took the mean and standard deviation of the root
 mean square error over the 4500 remaining assimilation steps. The time averaged errors
 are shown in Tables 2.3 and 2.5 and the standard deviation of the fluctuations about that
 50
mean over time are shown in Tables 2.4 and 2.5. These fluctuations, as depicted in Figure
 2.4 are correlated in time and do not allow a simple calculation of the sampling error in
 the RMSEs given in Tables 2.3 and 2.5. To roughly assess the sampling error, we took
 the standard deviation of a 500 assimilation step moving average, with results summarized
 in the caption of Table 2.3. The standard deviation of the moving average in most cases
 was between 0.01 and 0.03. However, the 3 member high resolution cases and the 4
 member large model error case (F = 12.5) were not as precise, with a standard deviation
 of the moving average between 0.04 and 0.07. This suggests that the sampling error in the
 average RMSEs is at most 1 or 2 in the second decimal place.
 The short term fluctuations in root mean square error for the mixed resolution cases
 are consistently small in all cases (Tables 2.4 and 2.5). For our high resolution cases with
 limited observations, these fluctuations are significantly larger, even for similar sizes of
 average RMSE. Greater error in the model induces larger fluctuations, whereas increasing
 the size of the ensemble decreases the size of the fluctuations.
 51
2.4 Summary and Discussion
 In Section 2.2, we discussed standard ensemble Kalman filter algorithms that esti-
 mate the background error covariance with the matrix expression:
 (
 1?
 k ? 1
 Xb
 ) (
 1?
 k ? 1
 Xb
 ) >
 (2.29)
 where k is the ensemble size and Xb are perturbations from the background mean (equa-
 tion (2.3)). With an ensemble of mixed resolution, we estimated the background error
 covariance (equation (2.7)) in a similar manner as U bU b> , where
 U b =
 [ ?
 ?
 k`?1spline
 (
 Xb`
 ) ? 1??
 kh?1X
 b
 h
 ]
 . (2.30)
 Here k` and kh are the low and high resolution ensemble sizes respectively, Xb` and Xbh are
 the low and high resolution perturbations from the means of their respective ensembles,
 ?spline? represents interpolation from the low resolution to the high resolution grid, and
 ? is a weight based upon the accuracies of the two ensembles and dictating the influence
 of each ensemble in the analysis. We gave the mixed resolution algorithm associated with
 the Local Ensemble Transform Kalman Filter and a comparison to the parent algorithm.
 We tested this mixed resolution ensemble analysis with the Lorenz Models II & III
 described in Section 2.3.2. The low resolution model, Model II, smoothly transitions from
 52
one grid point to the next. The high resolution model, Model III, also contains a short
 wave coupling component.
 We performed experiments with three different limited observation scenarios: basic,
 model error F = 14, and model error F = 12.5. The latter two cases simulate situations
 where the true system dynamics are not perfectly known. We also performed experiments
 with dense, frequent, and more accurate observations to simulate a situation where it is
 possible to resolve the small scale variability.
 We tested ensembles of various sizes. In the high resolution, limited observation
 experiments, we tested ensembles of size 3, 4, 5, and 6. In the mixed resolution experi-
 ments, we reported results with 30 low resolution ensemble members and 1, 2, or 3 high
 resolution members. In our small-scale resolving experiments, we tested high resolution
 ensembles of size 15, 16, 17, and 18 and we tested mixed resolution ensembles with 30
 low resolution members and 9, 12, and 15 high resolution members. We also conducted
 experiments with different numbers of low resolution ensemble members. We summa-
 rize the forecast computation time and the analysis RMSEs for our basic and small-scale
 resolving scenarios using different numbers of ensemble members in Figure 2.5.
 In our mixed resolution and high resolution OSSEs (observing system simulation
 experiments), forecasting the high resolution ensemble took up the bulk of the overall
 forecast computation time. However, introducing an additional high resolution member
 53
0 5 10 15
 0
 0.05
 0.1
 0.15
 0.2
 0.25
 0.3
 0.35
 0.4
 0.45
 Forecast Computation Time
 RMS
 E
  
 
Model III: 15, 16, 17, 18 members 
Model II: 10, 15, 20 members
 Mixed: 1&10, 1&15, 1&20
 Mixed: 9&15, 12&15, 15&15
 std(s1+s2)
 small scale standard deviation
 0 1 2 3 4 5 6
 0
 0.5
 1
 1.5
 2
 Forecast Computation Time
 RMS
 E
  
 
Model III: 3, 4, 5, 6 members
 Model II: 10, 20, 30, 60, 100 members
 Mixed: 1&5, 1&10, 1&15, 1&30, 1&100
 Mixed: 1&30, 2&30, 3&30
 Figure 2.5: The average forecast computation time taken for each assimilation time step
 versus the average root mean square error over 4500 assimilation steps for left: our basic
 scenario and right: our small-scale resolving scenario. Better computation times and better
 RMSE (lower error) are both closer to the origin. Left: Data is shown for high resolution
 analysis with 3, 4, 5, and 6 members, low resolution analysis with 10, 20, 30, and 60
 members, mixed resolution analysis on 1 high resolution member with a low resolution
 ensemble of 5, 10, 15, 30, and 60, and mixed resolution analysis on a mixed ensemble of
 30 low resolution members and 1, 2, and 3 high resolution members. Right: Data is shown
 for high resolution analysis with 15, 16, 17, and 18 members, low resolution analysis with
 10, 15, and 20 members, mixed resolution analysis on 1 high resolution member with a low
 resolution ensemble of 10, 15, and 20, and mixed resolution analysis on a mixed ensemble
 of 30 low resolution members and 9, 12, and 15 high resolution members. The dashed
 black line represents the variation in the small scales and the dashed grey line corresponds
 to twice as much variation in the small scales.
 can be more cost effective than increasing the size of the low resolution ensemble, as in-
 dicated by the star and square data points respectively in Figure 2.5. Comparison between
 the circle and star data points shows in Figure 2.5 shows that mixed resolution analysis
 54
provides an improvement over analysis at high resolution, in the sense that mixed reso-
 lution analysis provides greater accuracy for the same overall forecast computation time.
 Mixed resolution analysis with multiple high resolution members also improves upon re-
 sults with only one high resolution member. In our small-scale resolving experiments, the
 mixed ensemble with only one high resolution member is unable to resolve the small scale
 variability in the truth and the uncorrelated small scale variability of the high resolution
 member increases the error.
 In our experiments with model error, we found a greater advantage of mixed res-
 olution analysis over single resolution analysis than in the corresponding perfect model
 experiment (Table 2.3). We also found that mixed resolution analysis reduces the tempo-
 ral volatility of analysis errors (Table 2.4).
 We remark that our technique could be easily extended to ensembles of 3 or more
 different resolution, albeit with the introduction of more parameters to tune. While we
 haven?t fully explored the effects of tuning ? or varying the size of the low resolution
 ensemble, the results given here give an indication of what one can expect in applying this
 method.
 In conclusion, we found that this mixing technique is beneficial and cost-efficient in
 a simple model.
 55
Time Averaged Mixed Resolution Cases High Resolution Cases
 RMSE 1 high 2 high 3 high 3 high 4 high 5 high+30 low +30 low +30 low
 basic 0.90 0.79 0.76 0.97 0.84 0.78
 F = 14 0.95 0.84 0.82 1.10 0.93 0.89
 F = 12.5 1.06 1.00 0.98 1.53 1.28 1.18
 Table 2.3: The time averaged RMSE over 4500 assimilation steps for various mixed and
 high resolution ensembles in various scenarios. The standard deviation of a 500 assimila-
 tion step moving average in most cases was between 0.01 and 0.03. The 4 member large
 model error case (F = 12.5) and all of the 3 member high resolution cases were not as
 precise, with a standard deviation of the moving average between 0.04 and 0.07.
 Standard Deviation Mixed Resolution Cases High Resolution Cases
 of Fluctuations 1 high 2 high 3 high 3 high 4 high 5 high+30 low +30 low +30 low
 basic 0.13 0.11 0.11 0.19 0.15 0.13
 F = 14 0.14 0.12 0.11 0.24 0.15 0.13
 F = 12.5 0.13 0.11 0.11 0.27 0.19 0.16
 Table 2.4: Standard deviation of fluctuations in the RMSE with time, over the same 4500
 assimilation steps used in Table 2.3.
 Small-Scale Mixed Resolution Cases High Resolution Cases
 Scenario 9 high 12 high 15 high 15 high 16 high 17 high 18 high+30 low +30 low +30 low
 RMSE 0.19 0.15 0.12 0.19 0.17 0.14 0.12
 Fluctuations 0.04 0.03 0.02 0.05 0.05 0.05 0.05
 Table 2.5: Time averaged RMSE and standard deviation of fluctuations for our small-scale
 resolving scenario. Mixed resolution RMSEs are accurate to ?0.01 and high resolution
 RMSEs are accurate to ?0.02 as measured by the standard deviation of a 500 assimilation
 step moving average.
 56
Chapter 3
 Ensemble Data Assimilation with an Adjusted Forecast Spread
 3.1 Introduction.
 Data assimilation determines initial conditions for weather forecasts from the in-
 formation provided by atmospheric observations. Comparing a prior weather forecast to
 observations and taking into account the uncertainties of each, data assimilation attempts
 to provides an optimal estimate for the current state of the atmosphere. Modifying the fore-
 cast for an optimal fit to the observations is called the analysis phase of data assimilation.
 The analysis (optimal estimate) is then evolved during the forecast phase.
 In weather prediction via data assimilation, the observation uncertainty is typically
 characterized as additive errors distributed according to a known Gaussian distribution
 whose covariance is constant in time. Forecast errors are often treated as Gaussian too, but
 the forecast error covariance changes in time and can be difficult to estimate. Ensemble
 data assimilation forecasts an ensemble to provide a statistical sampling of the forecast,
 estimating the forecast error covariance through sample statistics. Ensemble Kalman fil-
 ters fit the ensemble mean to the observations and provide an ensemble of sub-optimal
 57
estimates to represent the remaining uncertainty (Burgers et al. 1998; Anderson and An-
 derson 1999; Bishop et al. 2001; Ott et al. 2004; Wang et al. 2004; Evensen 1994;
 Houtekamer and Mitchell 1998). Forecasting this analysis ensemble provides the sample
 forecast uncertainty for the next assimilation cycle.
 The Kalman filter (Kalman 1960) is optimal for linear models with Gaussian er-
 rors. Extensions to nonlinear models, including ensemble Kalman filters, are suboptimal
 and can diverge even if there is no model error (Jazwinski 1970; Anderson and Anderson
 1999). One reason is that model nonlinearity causes ensemble Kalman filters to under-
 estimate the forecast error covariance relative to the uncertainty in the initial conditions
 (Whitaker and Hamill 2002). Insufficient ensemble covariance can also be caused by un-
 quantified errors in the forecast and observation models. One common way to compensate
 for covariance underestimation is to inflate the forecast error covariance by a multiplicative
 factor larger than one during each analysis step (Anderson and Anderson 1999; Whitaker
 and Hamill 2002; Miyoshi 2011).
 Typically the ensemble is evolved with spread representative of the approximate co-
 variance as described above. However, the estimated covariance is only utilized during
 the analysis phase of data assimilation. In this chapter we investigate the potential advan-
 tages of evolving an ensemble whose spread is not commensurate with the approximate
 covariance by rescaling the ensemble perturbations by a factor ? before the forecast step
 58
and rescaling them by 1/? after the forecast step. We call this technique forecast spread
 adjustment and we remark that this has no net effect with a linear model.
 One reason we expect improvement by varying ? is related to the advantage some-
 times observed in ensemble prediction of the mean of an ensemble forecast versus the
 forecast of the ensemble mean. If the mean of the forecast from an unscaled ensemble
 (? = 1) is more accurate (on average) than the forecast of the mean (corresponding to
 ? ? 0), then the mean of the forecast from an ensemble scaled by ? 6= 1 may be better
 still.
 This method of forecast spread adjustment allows us to relate many aspects of data
 assimilation which were previously considered independently.
 ? As we will later show, for a linear observation operator forecast spread adjustment
 has the same long-term effect on the analysis means as rescaling the observation
 error covariance (e.g., Stroud and Bengtsson 2007) by ?2, but it results in a different
 (and in our experiments, more appropriate) analysis covariance.
 ? Multiplicative covariance inflation, described above, is applied sometimes by rescal-
 ing the forecast ensemble perturbations and sometimes by rescaling the analysis en-
 semble perturbations (e.g., Bonavita et al. 2008, Kleist 2012); when using forecast
 spread adjustment in addition, both ensembles are rescaled by amounts that can be
 varied independently.
 59
? The limit ? ? 0 discussed above also corresponds to evolving the ensemble co-
 variance according to the linear approximation about the mean, as in the Extended
 Kalman Filter (e.g., Jazwinski 1970, Evensen 1992).
 In Section 3.2 we describe various data assimilation algorithms based on the Kalman
 filter, specifically the Extended Kalman Filter (which we abbreviate XKF) and the ensem-
 ble Kalman filters ETKF (Bishop et al. 2001; Wang et al. 2004) and LETKF (Hunt et
 al. 2007). In Section 3.3.1 we describe forecast spread adjustment. We compare forecast
 spread adjustment to alternative formulations that can produce the same analysis means,
 specifically: inflating the observation error covariance with the forecast error covariance
 estimated from the evolved (un-adjusted) forecast ensemble (Section 3.3.2) and inflating
 the analysis error covariance in addition to or instead of the background (forecast) error
 covariance (Section 3.3.3). The main difference between these approaches is that inflating
 the observation error covariance results in a larger background and analysis error covari-
 ance than forecast spread adjustment. We find (Section 3.4.3) that the analysis error co-
 variance specified by forecast spread adjustment better reflects the actual analysis errors
 for our numerical experiments, in which the correct observation error covariance is known.
 Section 3.3.4 compares the Ensemble Transform Kalman Filter (ETKF) and the Extended
 Kalman Filter (XKF) and discusses how forecast spread adjustment can transition from
 the XKF to the ETKF and beyond. In Section 3.4.1 we describe the Lorenz (2005) Model
 60
II used for our experiments, and in Section 3.4.2 we describe our experiments. In Section
 3.4.3 we describe how we tuned the forecast spread adjustment parameter ?, along with
 the multiplicative covariance inflation factor, in order to minimize analysis errors. We also
 discuss the possibility of adaptively inflating both parameters with techniques similar to
 Li et al. (2009). Section 3.4.4 depicts and discusses the results of our experiments. We
 discuss our conclusions and summarize our findings in Section 3.5.
 3.2 Kalman Filters.
 The Kalman filter is an optimal algorithm for data assimilation with a linear model
 with Gaussian errors. As with data assimilation in general, the Kalman filter can be
 described by the cycle of forecasting the previous analysis (optimal estimate) to get the
 background (forecast) and generating a new analysis by updating the background to more
 closely match the observations. Thus we describe the Kalman filter in terms of two phases:
 the forecast phase and the analysis phase. Most data assimilation algorithms for weather
 prediction (ensemble Kalman filters and variational methods) are approximations to the
 Kalman filter.
 The analysis state according to the Kalman filter with no model error assumed is
 given by
 xa = xb + K
 (
 yo ?Hxb
 ) (3.1)
 61
K = P bH>
 (
 HP bH> + Ro
 )
 ?1 (3.2)
 where K is the Kalman gain matrix, xa is the analysis state, xb is the forecast or back-
 ground state, P b is the background error covariance, Ro is the observation error covari-
 ance, yo is the vector of observations, and H is matrix that transforms model space into
 observation space. The error covariance of the analysis state xa is given by
 P a = (I ?KH)P b. (3.3)
 Having determined xa and P a at a certain time t, the forecast phase evolves these to
 the background state xb+ and its error covariance P b+ at time t+ > t. Specifically, in the
 forecast phase from time t to time t+, the model M evolves the analysis state at time t to
 the background state at time t+. The forecast phase is given by
 xb
 +
 = M(xa) (3.4)
 P b+ = MP aM>. (3.5)
 These Kalman filter equations (3.1)-(3.5) are only optimal for perfect model scenar-
 ios with linear models. Model error reduces the accuracy of the background mean beyond
 that accounted for by the evolved background error covariance. Moreover, even without
 62
model error, a nonlinear model tends to cause the ensemble to underestimate the back-
 ground error covariance (Whitaker and Hamill 2002). One way to partially compensate
 for this underestimation of the background error covariance is to inflate P b. Ensemble
 Kalman filters typically inflate P b with a multiplicative covariance inflation factor ?, re-
 placing P b in (3.1)-(3.3) with P binflated = ?P binitial (Anderson and Anderson 1999; Whitaker
 and Hamill 2002). When ? = 1 no inflation is performed.
 3.2.1 Ensemble Kalman Filters.
 In this section we discuss the class of data assimilation techniques known as the
 ensemble Kalman filters (EnKFs). In the ensemble techniques, rather than evolving the
 covariance directly as done in the Kalman filter and the Extended Kalman Filter (XKF),
 an ensemble
 {
 Xbi ? 1 ? i ? k
 }
 is chosen to represent a statistical sampling of a Gaussian
 background distribution, with the sample mean x?b of the background ensemble approxi-
 mating the background state xb in equation (3.1). The subscript i here refers to the index
 of the ensemble member, and all members of the ensemble estimate the background at the
 same time.
 Each ensemble member Xbi can be expressed as a sum Xbi = x?b + Xbi of the mean
 and a perturbation Xbi . Letting Xb and Xb be the matrices with columns Xbi and Xbi re-
 63
spectively, we write
 Xb = x?b + Xb, (3.6)
 where by adding a vector and a matrix we mean adding the vector to each column of the
 matrix. The columns of Xb represent the ensemble of background perturbations.
 The background error covariance is computed from the background ensemble ac-
 cording to sample statistics, namely
 P b = ?k ? 1X
 bXb> , (3.7)
 where k is the size of the ensemble and ? is the multiplicative covariance inflation factor.
 We use similar notation for the analysis ensembleXa with mean x?a and perturbations
 Xa. Ensemble Kalman filters are designed to satisfy the Kalman filter equations (3.1)-(3.5)
 with xb and xa replaced by x?b and x?a. Thus, the analysis ensemble perturbations must
 correspond a square root of the reduced rank analysis error covariance matrix in equation
 (3.3), i.e.,
 1
 k ? 1X
 aXa> = P a. (3.8)
 Different ensemble Kalman filters select different solutions Xa to this equation. In all
 ensemble Kalman filters, the ensemble members are forecast independently. For a deter-
 ministic model M representing a forecast from time t to time t+, the background ensemble
 64
members at time t+ are given by
 Xb+i = M (Xai ) . (3.9)
 3.2.1.1 ETKF
 We tested ensemble data assimilation with forecast spread adjustment on the Ensem-
 ble Transform Kalman Filter (ETKF) (Bishop et al. 2001; Wang et al. 2004) and the Local
 Ensemble Transform Kalman Filter (LETKF) (Ott et al. 2004; Hunt et al. 2007). We
 implemented both methods according to the formulation in Hunt et al. (2007). For speedy
 computation, in these algorithms the analysis phase of the Kalman filter is performed in
 the space of the observations, with the background ensemble transformed into that space
 for comparison.
 In the ETKF, the analysis mean is given by
 x?a = x?b + 1k?1X
 bUY b>Ro?1
 (
 yo ? h(xb)) (3.10)
 U =
 (
 1
 ?I +
 1
 k?1Y
 b>Ro?1Y b
 )
 ?1
 ,
 1 (3.11)
 where h is the (possibly nonlinear) observation operator, and Y b are the background per-
 1The U in equation (3.11) is different from the notation in Chapter 2; UCh3 = MaCh2.
 65
turbations in observation space derived by transferring the ensemble Xb to observation
 space and subtracting the mean y?b = h (Xb), Y b = h (Xb) ? y?b. For a linear observation
 operator, the ETKF analysis mean x?a computed in equation (3.10) equals the Kalman filter
 analysis estimate xa of equation (3.1) assuming xb = x?b and P b = ?k?1XbXb
 > (equation
 (3.7)).
 The ETKF analysis perturbations are given by
 Xa = XbU 12 , (3.12)
 where the exponent 12 represents taking the symmetric square root. These perturbations
 differ from the background perturbations as little as possible while still remaining a square
 root of P a (Ott et al. 2004). For a full discussion of the equivalence and relationship
 between (3.10)-(3.12) and (3.1)-(3.3) see Hunt et al. (2007).
 3.2.1.2 LETKF
 Essentially, the Local Ensemble Transform Kalman Filter (LETKF) performs an
 ETKF analysis at every model grid point, excluding distant observations from the analysis.
 At each grid point, it computes yolocal and Rolocal, truncating yo and Ro to include only the
 observations within a certain distance from the grid point. The LETKF computes the local
 analysis for each grid point from the ETKF analysis equations (3.10)-(3.12) and restrict
 66
the resulting analysis to that grid point alone. Observation localization is motivated and
 explored in (Houtekamer and Mitchell 1998; Ott et al. 2004; Hunt et al. 2007; Greybush
 et al. 2011). We provide a limited motivation here.
 One problem with estimating P b (equation (3.7)) from an ensemble is that sam-
 pling error can introduce artificial correlations between the background error at distant
 grid points. As the analysis increment x?a ? x?b in equation (3.10) is computed in obser-
 vation space, it is possible to eliminate these spurious correlations (along with any real
 correlations) for the analysis at a particular model grid point by truncating the observation
 space beyond a specified distance from the grid point.
 Other methods of localization (e.g., Whitaker and Hamill 2002, Bishop and Hodyss
 2009) modify the background error covariance directly. For all these approaches, local-
 ization is found to greatly reduce the number of ensemble members needed to produce
 reasonable analyses for models with large spatial domains.
 3.2.2 Extended Kalman Filter
 In the Extended Kalman Filter (XKF) (see e.g., Jazwinski 1970), the model evolves
 the analysis xa as in equation 3.4 to determine the subsequent background xb+ , and the
 background error covariance P b+ is determined from P a by linearizing the model at xa.
 For more information on the XKF and how it compares to ensemble Kalman filters, see
 67
Evensen (2003) and Kalnay (2003).
 Like the Kalman filter, the XKF is usually formulated with an additive model error
 term that increases the background covariance. However, for ease of comparison to the
 ETKF, our implementation of the XKF uses multiplicative inflation instead of additive
 inflation.
 One major difficulty with the Extended Kalman Filter is the computational cost of
 evolving the covariance matrix (Evensen 1994; Burgers et al. 1998; Kalnay 2003). This
 makes the model unfeasible for high-dimensional models in a realistic setting. However,
 when using simple test models such as the Lorenz (2005) Model II, other data assimilation
 techniques can be compared to the XKF.
 Another potential drawback of the XKF is its linear approximation when evolving
 the covariance (Evensen 2003). As we find in our experiments, this can be disadvanta-
 geous compared to the nonlinear evolution of both mean and covariance provided by an
 ensemble.
 3.3 Methodology and Theory.
 In this section we formally describe the method of forecast spread adjustment for
 ensemble Kalman filters. One way of implementing the forecast spread adjustment is to
 multiply the analysis perturbations by ? prior to forecasting and, after the forecast, mul-
 68
tiply the forecast perturbations by 1? (Section 3.3.1). This is only beneficial when the
 model is nonlinear. Other authors (e.g., Stroud and Bengtsson 2007) have found bene-
 fits from multiplicative observation error covariance inflation, motivated by observation
 errors other than measurement error. While forecast spread adjustment is equivalent to
 multiplicative observation error covariance inflation by ?2 for an appropriately initialized
 ensemble (Section 3.3.2), our experiments are performed with observations lying on the
 model grid points and perfect knowledge of the exact observation error covariance, so in-
 flating the observation error covariance would be mathematically unmotivated. However
 as adjusting the forecast spread by ? and ?2Ro inflation have similar accuracies after spin-
 up, perhaps some of the benefits of inflating Ro stem from the forecast spread adjustment
 effect.
 3.3.1 Ensemble data assimilation with forecast spread adjustment.
 This section describes the technique of shrinking (or expanding if ? > 1) the en-
 semble perturbations for the forecast and expanding (or shrinking) the ensemble for the
 analysis. After 1finding the analysis through any ensemble data assimilation techniques,
 the background perturbations are obtained by: 2multiplying the analysis perturbations by
 ?, 3evolving the adjusted analysis, and 4multiplying the forecast perturbations by 1? .
 Mathematically, this can be expressed with a matrix transformation S corresponding
 69
to an expansion factor of ?:
 Xb+ = M(XaS)S?1 (3.13)
 where
 S = ?Ik?k + (1 ? ?) ( 1k1k?k) (3.14)
 1k?k =
 ?
 ?
 ??
 ?
 ?
 ??
 1 ? ? ? 1
 .
 .
 .
 .
 .
 .
 .
 .
 .
 1 ? ? ? 1
 ?
 ?
 ??
 ?
 ?
 ?? . (3.15)
 The model integration of the matrix XaS in equation (3.13) refers to forecasting each
 ensemble member (column of XaS) separately, i.e.
 M (XaS) :=
 [
 M({XaS}1) , ? ? ? , M({XaS}k)
 ]
 . (3.16)
 For an ensemble of size k, each column of the [k ? k] matrix 1k?k right transforms
 an ensemble into its row sum so that 1kX1k?k =
 [
 x? ? ? ? x?
 ]
 . Thus right multiplication
 by S transforms each ensemble member Xai into a weighted average of itself (prior to
 adjustment) and the mean of the ensemble
 {XaS}i = ?x?a + (1? ?)Xai . (3.17)
 70
Viewed another way, right multiplication by S scales the ensemble perturbations by
 ?, producing a new ensemble with the same mean
 {XaS}i = x?a + ?Xai . (3.18)
 Notice that
 S?1 = 1
 ?
 Ik?k +
 (
 1?
 1
 ?
 ) ( 1
 k1k?k
 )
 , (3.19)
 so S?1 effectively rescales ensemble perturbations by ??1. We call a forecast contraction
 and analysis re-expansion forecast spread adjustment.
 The full Kalman filter cycle from Xb to Xb+ with the forecast phase described by
 equation (3.13) is given in the following algorithm.
 1. Perform analysis on Xb to find Xa.
 2. Left multiplyXa by S to scale the perturbations Xa by a factor of ?,
 (
 i.e. Xa,f = ?Xa
 )
 Xa,f = XaS. (3.20)
 3. Evolve each member of the resulting ensemble:
 Xb
 +
 ,f
 i = M
 (
 Xa,fi
 )
 . (3.21)
 71
4. Left multiply Xb+,f by S?1 to rescale the perturbations of the evolved ensemble by
 1
 ? :
 Xb+ = Xb+,fS?1. (3.22)
 Here and elsewhere Xb,f and Xa,f refer to the ensemble during the forecast phase. We
 abbreviate the ETKF with forecast spread adjustment and the LETKF with forecast spread
 adjustment (via a scaling factor of ?) as the ?ETKF and the ?LETKF respectively.
 We remark that when ? = 1, ensemble data assimilation with forecast spread adjust-
 ment reduces to standard ensemble data assimilation. Furthermore, if the model is linear,
 forecast spread adjustment has no net effect. When the model is nonlinear, forecast spread
 adjustment could affect the background mean, the background perturbations, and (mini-
 mally) the analysis phase background spread. We expect that the effect on the mean will
 be the most significant.
 3.3.2 Relation to inflation of the observation error covariance.
 Some authors have recommended scaling the reported observation error covariance,
 assuming that it is misrepresented (e.g., Stroud and Bengtsson 2007, Li et al. 2009). As
 we will show below, inflating the observation error covariance is closely related to forecast
 spread adjustment, and thus in many cases both approaches can improve the analyses.
 Indeed, if the observation operator h is linear then we will show that both approaches, if
 72
initialized with the same forecast ensemble, will produce the same analysis means. Having
 the same forecast ensembles implies that during the analysis phase, the two approaches
 will use different ensemble perturbations (related by a factor of ?) and hence different
 covariances.
 3.3.2.1 ?ETKF without Ro inflation.
 Let Xb be the initial background ensemble. In this case, we perform ensemble data
 assimilation with forecast spread adjustment to find Xb+ . The ?ETKF analysis ensemble
 Xa = x?a + Xa is given by equations (3.10)-(3.12) repeated here for convenience
 x?a = x?b + 1k?1X
 bUY bRo?1
 (
 yo ? y?b
 ) (3.23)
 U =
 (
 1
 ?I +
 1
 k?1Y
 b>Ro?1Y b
 )
 ?1
 (3.24)
 Xa = XbU 12 . (3.25)
 The next background ensemble Xb+ is given by equation (3.13)
 Xb+ = M( XaS )S?1. (3.26)
 73
3.3.2.2 ETKF with Ro inflated to ?2Ro.
 Let the initial background ensemble be Xb2 and assume that Xb2 = XbS. Along
 with Xb,f := XbS defined based on equations (3.20) and (3.22), this implies the additional
 equivalences x?b2 = x?b and Xb2 = Xb,f = ?Xb. We perform standard ensemble data
 assimilation to find Xb+2 assuming an observation error covariance of ?2 times its value in
 Case 3.3.2.1. We compare Xb+2 with Xb+and Xb+,f = Xb+S.
 In this case (3.3.2.2), the analysis ensemble is given by equations (3.10)-(3.12), with
 Ro2 = ?2Ro replacing Ro. After algebraic manipulation, these equations are
 U2 =
 (
 1
 ?I +
 1
 k?1?Y
 b>2 Ro?1?Y b2
 )
 ?1
 (3.27)
 x?a2 = x?b2 + 1k?1?X
 b2U2?Y b2Ro
 ?1 (yo ? y?b2) (3.28)
 Xa2 = Xb2U
 1
 2
 2 . (3.29)
 Substituting Xb2 = ?Xb and assuming a linear observation operator so that Y b2 = ?Y b,
 we find that
 U2 = U (3.30)
 x?a2 = x?a (3.31)
 Xa2 = ?Xa. (3.32)
 74
ThusXa2 = XaS. As we defineXa,f := XaS, the ensembles are identical during evolution.
 The next background ensemble is given by
 Xb
 +
 2 = M(Xa2)
 = M(XaS) . (3.33)
 Comparing this to equation (3.26), we see thatXb+2 = Xb+S =: Xb+,f ; therefore x?b+2 = x?b+ .
 In other words, the means remain the same and the perturbations maintain the same ratio.
 We remark that although both cases evolve the same ensemble, the error covariances
 in the two cases differs by the factor ?2.This is demonstrated in Figure 3.1.
 Xb2 = ?Xb P b2 = ?2P b
 Ro2 = ?2Ro
 analysis
 P a2 = ?2P aXa2 = ?Xa
 forecast
 Xb,f = ?Xb P b
 Ro
 analysis
 P aXa,f = ?Xa
 forecast
 Figure 3.1: Left: data assimilation with a forecast spread adjustment of ?. Right: standard
 data assimilation with Ro inflated by ?2. The two data assimilation cycles depicted above
 are initialized during the forecast phase, i.e. Xb2 = Xb,f .
 75
3.3.3 Which to inflate, P a or P b?
 As discussed in the beginning of Section 3.2, due to nonlinearity and model error
 ensemble data assimilation tends to underestimate the evolved error covariance. This is
 typically corrected by inflating the background error covariance prior to analysis, e.g. An-
 derson and Anderson (1999), Whitaker and Hamill (2002). However it would be just as
 theoretically valid to inflate the (posterior) analysis error covariance, e.g. Bonavita et al.
 (2008). With forecast spread adjustment, it no longer becomes a question of inflating
 one or the other; instead forecast spread adjustment provides a continuum connecting P b
 multiplicative inflation and P a multiplicative inflation.
 Consider the standard data assimilation cycle as it applies to P a and P b with mul-
 tiplicative analysis and background covariance inflation of ?a and ?b respectively (Figure
 3.2 Left). Following the algorithms given in Hunt et al. (2007), in our implementation
 of the ?ETKF, the ?LETKF, we only inflate the background error covariance (Figure 3.2
 Right). By setting ? = ??, we transform a data assimilation algorithm that inflates the
 P b,f 1?2 ?P
 b,f
 = P b
 analysis
 P aP a,f = ?2P a
 forecast
 P b,f ?bP b,f = P b
 analysis
 P aP a,f = ?aP a
 forecast
 Figure 3.2: Left: the data assimilation cycle on P b and P a including a forecast spread
 adjustment of ? and inflation of P b with ?. Right: the data assimilation cycle on P b and
 P a including inflation of P b with ?b and inflation of P a with ?a.
 76
background error covariance P b into a data assimilation algorithm that inflates the anal-
 ysis error covariance P a. Furthermore, this comparison demonstrates that ensemble data
 assimilation with an adjusted forecast spread and with multiplicative background error co-
 variance inflation can be alternatively interpreted as standard data assimilation, inflating
 both the background and analysis error covariances. In particular, adjusting the forecast
 spread by ? and inflating the background error covariance by ? is equivalent to inflating
 the background error covariance by ?b = ?/?2 and inflating the analysis error covariance
 by ?a = ?2.
 Those who have found more accurate results when inflating P a in preference to
 P b likely will benefit from forecast spread adjustment with ? > 1. Similarly, if one has
 previously obtained more accurate results with P b inflation in preference to P a inflation,
 forecast spread adjustment with ? < 1 might prove beneficial.
 3.3.4 Comparison to the XKF.
 The Ensemble Transform Kalman Filter (ETKF) (Bishop et al. 2001; Wang et al.
 2004) and the Extended Kalman Filter (XKF) (Jazwinski 1970; Evensen 1992) are similar
 in that they both evolve the covariance matrix in time. The XKF evolves the best estimate
 along with its error covariance. The ETKF evolve a surrounding ensemble which approx-
 imates the error covariance and does not explicitly evolve the best estimate, which is the
 77
mean of the ensemble.
 Previous articles, e.g. Burgers et al. (1998), demonstrate the equivalence between
 the analysis phases of the Extended Kalman Filter and the ensemble Kalman filter (EnKF),
 provided the ensemble is sufficiently large. However, this equivalence does not continue
 into the forecast phase. Starting from the same analysis state xaXKF = x?a and the same
 analysis error covariance P aXKF = 1k?1X
 aXa> , the XKF and the EnKF will evolve different
 states xb
 +
 XKF 6= x?b+ with different error covariances P b+XKF 6= 1k?1Xb
 +Xb+> .
 Varying the ensemble spread changes the evolved ensemble in a nonlinear fashion,
 affecting the mean and perturbations (in both magnitude and direction). In the limit as
 ? ? 0, the ensemble perturbations become infinitesimal and the forecast evolves the
 ensemble mean and the perturbations evolve according to the tangent linear model. If the
 ensemble is large enough that the perturbations span the model space, then a full-rank
 covariance is evolved according to the tangent linear model, just as in the XKF. Therefore
 for a sufficiently large ensemble, the ?ETKF should approach the XKF as ? tends to zero;
 otherwise, it approaches a reduced-rank XKF. Furthermore, when ? = 1, the ?ETKF is the
 standard ensemble transform Kalman filter. Thus for intermediate values of ?, the ?ETKF
 can be considered a hybrid of the XKF and the ETKF.
 In the scenarios we explored where tuning ? was especially effective, ? tuned to
 values greater than one, enhancing the advantages of the ensemble Kalman filter over the
 78
Extended Kalman Filter.
 3.4 Experiments and Results
 Our experiments are observed system simulation experiments (OSSEs). In OSSEs,
 we evolve a truth with the model, adding error at a limited number of points to simu-
 late observations. Treating the truth as unknown, we use data assimilation to estimate
 it. Comparing the analysis to the truth, we estimate the accuracy of the data assimilation
 techniques being tested. Thus we can evaluate and compare the accuracy of data assimila-
 tion techniques. We assessed accuracy via the root mean square error (RMSE) difference
 between the analysis mean and the truth evaluated at every grid point.
 3.4.1 Models
 We tested forecast spread adjustment on the Lorenz (2005) Model II with a smooth-
 ing parameter of K = 2 (smoothing the Lorenz 96 model (Lorenz 1996; Lorenz and
 Emanuel 1998) over twice as many grid points). For a forcing constant F and smoothing
 parameter K, the model at grid point j is evolved according to
 dxn
 dt =
 (
 xn+Kx
 avg
 n?K
 )avg
 ? xavgn?2Kx
 avg
 n?K ? xn + F , (3.34)
 79
where xavgn refers to averaging xn the model state at grid point n with the model state at
 nearby grid points.
 When K = 2, the averaging includes only the grid point itself and the immediately
 adjacent grid points namely
 xavgn =
 1
 4
 xn?1 +
 1
 2
 xn +
 1
 4
 xn+1. (3.35)
 Note that the avg function is applied to (xjxavgi ) by
 (xnxavgm )avg = 14xn?1xavgm?1 + 12xnxavgm + 14xn+1xavgm+1.
 Model II is evolved on a circular grid, in our experiments a circular grid with 60 grid
 points. We evolved the forecasts according to the Runge-Kutta fourth order method over
 a time step size of ?t = 0.05, performing an analysis every time step.
 To generate the observations, we evolved the truth with a forcing constant of F t = 12
 and added independent Gaussian errors with mean 0 and standard deviation ? at every
 other grid point. Thus at each time step, we have 30 evenly spaced observations with
 known error covariance Ro = ?2I30?30. Figure 3.3 gives a snapshot of these observations
 for ? = 1 along with the truth.
 80
0 10 20 30 40 50 60
 ?10
 ?5
 0
 5
 10
 15
 Grid Point Number (i)
 Truth and Observations (
 x i & 
yo i
 )
 Figure 3.3: Sample Model II output with observations for 60 grid points, half of which are
 observed (*), observation error ? = 1, smoothing parameter K = 2, and forcing constant
 F = 12.
 3.4.2 Experimental Design
 We explored the effectiveness of tuning ? in various scenarios with the LETKF. Our
 default parameters for data assimilation are: an observation error ? = 1, an ensemble
 of k = 10 members, and a forcing constant of F = 14 for the ensemble forecast to
 simulate model error. In all cases we use a localization radius of 3 grid points, i.e. 3 to 4
 observations in a local region.
 Keeping the other two parameters constant at their default value, we varied the
 observation error, the ensemble size, and the forcing constant. For instance, we varied
 ? = 12 , 1, 2, keeping k = 10 and F = 14. We tested ensemble sizes of 5, 10, 20, and 40.
 81
We varied the forcing constant from F = 9 to F = 15.
 In each experiment we tuned ? for ? = 1 and plotted the RMSE for different values
 of ?:
 RMSE = 14500
 5000?
 i=501
 rms
 (
 x?a(ti) ? xt(ti)) , (3.36)
 where xt(ti) is the truth at time ti, x?a(ti) is the analysis ensemble mean at time ti, and
 rms is the root mean square function over the 60 grid points, i.e. the RMSE is the time
 averaged root mean square difference between the analysis mean and the truth. In each
 case we compared the minimum RMSE among the various values of ? tested to the RMSE
 when ? = 1 and plotted the percent improvement
 % Imprv. =
 RMSE(? = 1) ? min
 0<?<?
 RMSE(?)
 RMSE(? = 1) . (3.37)
 We also tested how the ?ETKF compares to the XKF for large ensembles and small
 ?. Specifically, we tested the ?ETKF for ? = 1, 12 ,
 1
 4 ,
 1
 8 , and 10
 ?6 and ensembles of size
 k = 40, 60, and 80. For ? = 10?6 and k = 80 we expect the RMSE from the ?ETKF to
 be very similar to the RMSE from the XKF.
 82
3.4.3 Tuning ? and ?
 Since the two cases are identical in the linear setting, forecast spread adjustment
 could be advantageous in certain nonlinear settings. However, the best value of ? is not
 specified mathematically. Hence we tune ? to get the lowest RMSE during data assimi-
 lation. Similarly, we also tune the multiplicative covariance inflation factor ?. Figure 3.4
 depicts the ensemble spread both before (red) and after (blue) adjustment as a function of
 ?.
 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
 0
 0.5
 1
 1.5
 2
 2.5
 3
 3.5
 4
 ?
 RMS
 E
  
 
RMSE: a
 spread: Xa
 spread: Xa,f
 RMSE: b
 spread: Xb
 spread: Xb,f
 Figure 3.4: Tuning ? (grey) and the effect of ? on ensemble spread during the forecast
 phase (red) and analysis phase (blue) for our default parameters: k = 10, F = 14 (F t =
 12), ? = 1. Comparison of the ensemble spread during the analysis phase (blue) to the
 actual RMSE for various values of ? (grey). The solid grey curve shows that ? tunes to
 2.5. We tuned ? using the tuned value of ? for ? = 1.
 83
We define the ensemble spread by?
 1
 N trace
 (
 1
 k ? 1XX
 >
 )
 , (3.38)
 where N = 60 is the length of the circular grid and 1k?1XX
 > is an NxN matrix ap-
 proximating the error covariance as described in equations (3.7) and (3.8). When ? = 1,
 the ensemble spread (equation (3.38)) is intended to estimate the actual RMSE (equation
 (3.36)). As demonstrated in Figure 3.4, when ? 6= 1, the ensemble spread only estimates
 the actual RMSE during the analysis phase. In other words, the forecast spread adjustment
 parameter ? only strongly affects the ensemble spread during the forecast phase (red).
 During the analysis phase (blue), the ensemble spread does not vary significantly with ?.
 The minimum of the solid grey curve corresponds to the tuned value of ?, which for our
 default parameters is ? = 2.5.
 84
1.1 1.12 1.14 1.16 1.18 1.2 1.22 1.24 1.26 1.28 1.3
 0
 0.2
 0.4
 0.6
 0.8
 1
 1.2
 ?
 RMS
 E
  
 
RMSE: a
 spread: Xa & Xa,f
 RMSE: b
 spread: Xb & Xb,f
 Figure 3.5: Tuning ? and the effect of ? on ensemble spread for our default parameters
 with ? = 1. The dotted lines correspond to the background and the solid lines correspond
 to the analysis. The spread is indicated in black and the RMSE in grey.
 Recall (Section 3.3.2) that with observation error covariance inflation, the analysis
 perturbations Xa2 are the same as the perturbations Xa,f for the forecast spread adjust-
 ment. Thus, in this scenario, when ? is tuned to a value different than one, the ensemble
 spread Xa for forecast spread adjustment corresponds much better to the actual analysis
 errors than the spread Xa2 for observation error covariance inflation.
 Figure 3.5 (black) demonstrates that multiplicative background error covariance in-
 flation by ? (Anderson and Anderson 1999; Whitaker and Hamill 2002) carries through to
 both phases of data assimilation. We remark that in this example ? tunes to 1.20 (grey),
 even though the ensemble spread underestimates the RMSE. However, in this case (Figure
 85
3.4) we find that tuning ? as well as ? decreases the error to be commensurate with the
 ensemble spread. To tune ? in our LETKF experiments we let ? = 1 and tested values
 of ? differing by 0.01, choosing the value of ? associated with the lowest analysis RMSE.
 When comparing the ?ETKF to the XKF, we tuned ? = 1.29 for our XKF experiment and
 used that ? for our ?ETKF experiments as well.
 In our experiments we determined the optimal values of ? and ? through tuning.
 However, there are methods that adaptively determine the values of ? (Anderson 2007; Li
 et al. 2009; Miyoshi 2011). In particular, Li et al. (2009) simultaneously estimate values
 for a background error covariance factor and for an observation error covariance factor.
 When determining the analysis mean, these two parameters are equivalent in some sense
 to ? and ? (see Section 3.3.2). Thus, the adaptive techniques explored by Li et al. (2009)
 could potentially be applied to simultaneous adaptive estimation of ? and ?. On the other
 hand, some adjustment may be necessary if ? is mainly compensating for the nonlinearity
 and model bias as opposed to a misspecification of the observation error covariance.
 3.4.4 Results
 Our results are given graphically as both ? versus RMSE; as well as k (ensemble
 size), F (forcing constant), and ? (observation error) versus percent improvement in the
 RMSE. The RMSEs are generally accurate out to the hundredths place (?0.01); the RM-
 86
SEs from five independent trials with our default parameters and ? = 1 are 0.86, 0.86,
 0.87, 0.88, and 0.88. When k = 5, the RMSEs with ? = 1 and with the tuned value of
 ? = 3.5 are less precise with accuracy approximately ?0.04. In all of our other experi-
 ments (LETKF varying k, F , or ?, and ?ETKF vs. XKF) the RMSEs are accurate to about
 ?0.01.
 87
0 1 2 3 4
 0
 0.5
 1
 1.5
 2
 Eta
 RMS
 E
 (d)
  
 
oerr = 0.5
 oerr = 1
 oerr = 2
 0 1 2 3 4
 0
 0.5
 1
 1.5
 2
 (b)
  
 
RMS
 E
 Eta
 k =  5
 k = 10
 k = 20
 k = 40
 0 1 2 3 4
 0.6
 0.8
 1
 1.2
 (a)
 RMS
 E
 Eta
 0 1 2 3 4
 0
 0.5
 1
 1.5
 2
 RMS
 E
 (c)
  
 
Eta
 F =  9
 F = 10
 F = 11
 F = 12
 F = 13
 F = 14
 F = 15
 0.5 1 1.5 2
 0
 10
 20
 30
 Observation Error
 % Improvemen
 t
 (h)
 0 10 20 30 40
 0
 10
 20
 30
 Ensemble Size
 % Improvemen
 t
 (f)
 8 10 12 14 16
 0
 10
 20
 30
 Forcing Constant
 % Improvemen
 t
 (g)
 0
 10
 20
 30
 % Improvemen
 t
 (e)
 Default
 Figure 3.6: Left: The RMSE as a function of ? for (a) our default parameters: an
 ensemble with k = 10 members, model error due to the forcing constant F = 14
 where the true forcing is F t = 12, and observation error of 1; (b) various ensem-
 ble sizes: k = 5, 10, 20, and 40 members; (c) different amounts of model error: F =
 9, 10, 11, 12, 13, 14, and 15 with F t = 12; (d) different amounts of observation error:
 ? = 0.5, 1, and 2. In each of (b), (c) and (d), one of the parameters from (a) (k = 10,
 F = 14, ? = 1) is varied, keeping the other two parameters at their default values; the
 curve graphed in (a) appears in (b), (c) and (d) as well. We restrict the range of the y-axis
 in (a) to better show the structure of the curve. Right: Comparing the RMSE when ? = 1
 to the minimum RMSE when ? is allowed to vary. Subplot (e) shows the percent im-
 provement in the RMSE when varying ? with our default parameters, i.e. setting ? = 2.5
 gives results with an RMSE that is 14% smaller than when ? = 1. Also shown are the
 percent improvements versus (f) various ensemble sizes, (g) different forcing constants,
 and (h) different amounts of observation error. Each right hand figure (e), (f), (g), and (h)
 is derived from the same data as the figure immediately to its left, i.e. (a), (b), (c), and (d)
 respectively.
 88
3.4.4.1 Default Parameters
 When k = 10, F = 14 (F t = 12), and ? = 1, we obtain the most accurate results
 when ? = 2.5, providing a 14% improvement over the RMSE when ? = 1 (0.74 vs. 0.87).
 Although we generally did not retune ? after tuning ?, we remark than doing so further
 improves the RMSE; setting ? = 2.5 and ? = 1.16 improves the RMSE by 19% over
 ? = 1 and ? = 1.20.
 Figure 3.6(a) shows the RMSE for various values of ? assuming the default param-
 eters. The curve is minimized when ? = 2.5. Figures 3.6(b), 3.6(c), and 3.6(d) show
 the effect of varying one of the default parameters (k, F , ?), while keeping the other two
 constant. Thus the curve in 3.6(a) also appears in 3.6(b), 3.6(c), and 3.6(d).
 3.4.4.2 Varying the Ensemble Size
 Figures 3.6(b) and 3.6(f) shows how varying the ensemble size, k, affects the RMSE
 and the tuned value of ?. We find that forecast spread adjustment improves results more
 with smaller ensembles than with large ensembles. We expect that this is because increas-
 ing the ensemble size generally improves the accuracy of the results, so small ensembles
 have more room to improve their accuracy. For ensembles with k = 10 or more members,
 the RMSE improvement due to tuning ? levels out at around 15%.
 89
3.4.4.3 Varying the Model Error
 We simulated model error by forecasting the ensemble with a different forcing con-
 stant than F t = 12 used for our truth run. Figures 3.6(c) and 3.6(g) show how varying the
 amount of model error (i.e. varying F ) can change the effectiveness of tuning ?. Fore-
 cast spread adjustment is most helpful in the presence of model errors that result in to
 larger amplitude oscillations than the truth, such as when F = 13, 14, or 15 and F t = 12.
 Averaging the forecast ensemble to produce the background mean tends to reduce the am-
 plitude of these oscillations, and by increasing ? we intensify this reduction. We believe
 this is compensating for some of the model error. We did not find benefits to tuning ? in
 the perfect model scenario or when the forcing constant for the ensemble was smaller than
 the true forcing constant.
 3.4.4.4 Varying the Observation Error
 Figures 3.6(d) and 3.6(e) show that effectiveness of forecast spread adjustment and
 the tuned value of ? depend on the size of the observation error ?. As with smaller en-
 sembles, larger observation errors imply larger analysis errors and hence greater room
 for improvement. The tuned values of ? for observation errors of ? = 0.5, 1, and 2 are
 ?=4, 2.5, and 1.75 respectively. Thus more observation error corresponds to a smaller
 tuned value of ?. The ensemble spread in each of these three cases is about 1.8. The ensem-
 90
ble spread roughly determines the size of the difference between the mean of the ensemble
 forecast and the forecast of the ensemble mean; thus, our finding of an ideal forecast
 ensemble spread independent of observation error size is consistent with the hypothesis
 above that ensemble averaging is compensating for some of the model error present.
 3.4.4.5 Approximating the XKF with the ?ETKF
 Table 3.1 lists the RMSE from the ?ETKF with forecast spread adjustment parameter
 ? = 1, 12 ,
 1
 4 ,
 1
 8 , and 10
 ?6 and ensembles of size k = 40, 60, and 80.
 k=40 k=60 k=80 XKF
 ? = 1 0.88 0.77 0.76
 ? = 1/2 0.95 0.82 0.81
 ? = 1/4 0.97 0.83 0.82
 ? = 1/8 0.97 0.83 0.83
 ? = 10?6 0.96 0.83 0.83
 XKF 0.83
 Table 3.1: Comparison between the XKF RMSE and the ?ETKF RMSEs for ? =
 1, 12 ,
 1
 4 ,
 1
 8 , and 10
 ?6 and k = 40, 60, and 80 with our default parameters.
 We remark that the RMSE associated with the ?ETKF approaches the XKF-RMSE
 from below (increasing the error) as ? tends to zero and from above (decreasing the error)
 as k becomes large.
 91
3.5 Summary and Conclusions
 Forecast spread adjustment is an add-on to ensemble data assimilation. After the
 analysis phase we expand the ensemble perturbations about their mean via multiplication
 by the factor ?, forecasting the adjusted ensemble. Prior to the next analysis, we read-
 just the perturbations around the background mean in order to form the background error
 covariance matrix.
 For linear models, ensemble data assimilation with forecast spread adjustment by
 any amount reduces to standard ensemble data assimilation (? = 1). For nonlinear mod-
 els however, forecast spread adjustment can be beneficial. Furthermore, a ensemble data
 assimilation with an adjusted forecast spread of ? = 1 is equivalent to standard ensemble
 data assimilation. Thus ensemble data assimilation with tuned forecast spread adjustment
 will always perform at least as well as standard ensemble data assimilation even if the
 parameter ? is only roughly tuned.
 Forecast spread adjustment affects the ensemble spread during the forecast phase, so
 for nonlinear models, an ensemble with a different spread will evolve to a different mean
 with different perturbation amounts and directions.
 Some authors (e.g., Stroud and Bengtsson 2007) have found benefits to inflating the
 observation error covariance, assuming errors in addition to measurement error. In addi-
 tion to correcting the underestimation of the observation error covariance, they might also
 92
be benefiting from the effects of forecast spread adjustment. For an ensemble initialized
 immediately after the model evolution, forecast spread adjustment and observation error
 covariance inflation will produce the same analysis mean and will evolve the same en-
 semble. This is because both maintain the same the ratio between the background and
 observation error covariances. The difference is that when inflating the observation error
 covariance, the size of the assumed errors (P b, Ro, and P a) is ?2 times bigger than those
 assumed with forecast spread adjustment.
 Due to nonlinear effects and model error, an ensemble which appropriately describes
 the uncertainty at the analysis time will underestimate the uncertainty after it is evolved in
 time. To compensate for this, many algorithms implement multiplicative covariance infla-
 tion on either the background (e.g., Anderson and Anderson 1999, Whitaker and Hamill
 2002, Hunt et al. 2007) or the analysis (e.g., Bonavita et al. 2008). Either technique
 is valid and forecast spread adjustment can transition between them. Furthermore, fore-
 cast spread adjustment can be described as doing both, i.e. inflating the analysis error
 covariance by ?2, evolving the ensemble, then deflating the background error covariance
 by ?/?2.
 We demonstrated that the ?ETKF provides a continuum from a reduced-rank Ex-
 tended Kalman Filter (? = 0) to the ordinary ETKF (? = 1) and beyond. If the ensemble
 is sufficiently large, we recover the full-rank Extended Kalman Filter. On the other hand,
 93
we generally found the lowest analysis errors when ? was close to or larger than 1.
 With our standard parameters, the LETKF with forecast spread adjustment improves
 by 14% upon the standard LETKF. In comparison, a 10 member ensemble with forecast
 spread adjustment performs better than a 40 member ensemble without forecast spread
 adjustment. The parameter ? tunes to larger than 1 in all scenarios where we saw improve-
 ments. This corresponds to evolving an ensemble with a larger spread than when ? = 1.
 Our standard parameters include model error induced by changing the forcing constant
 used when evolving the ensemble to a larger value than the true forcing constant. Fore-
 cast spread adjustment proved even more effective in our tests with an even larger forcing
 constant. However, in perfect model scenarios and when the ensemble forcing constant
 was smaller than the true forcing, tuning ? did not significantly improve the RMSE. In
 comparison to the RMSE when ? = 1, tuning ? was just as effective in improving the
 RMSE with ensembles of 40 members as it was with ensembles of 10 members, and was
 even more effective with ensembles of 5 members. Lower observation errors increased the
 tuned value of ? in such a way as to suggest that there is an ideal ensemble spread during
 the forecast phase which is relatively independent of the true error covariance but depends
 upon the accuracy of the model.
 In conclusion, forecast spread adjustment with ? > 1 provided significant improve-
 ment compared to ? = 1 (no adjustment) in some of our experiments with a simple model.
 94
Forecast spread adjustment was particularly effective for smaller ensembles, smaller ob-
 servation errors, and (in some cases) larger model errors. Our interpretation of why ? > 1
 improves results when F > 12 but not when F ? 12 is that larger values of F are as-
 sociated with higher amplitude oscillations and larger values of ? tend to reduce these
 oscillations in the background mean by averaging over a larger ensemble spread.
 95
Chapter 4
 Summary and Future Work
 I developed and tested two techniques for improving ensemble data assimilation.
 The first, which I call mixed resolution ensemble data assimilation, incorporates ensem-
 bles of two different resolutions,combining the accuracy of a high-resolution model with
 the larger ensemble size feasible for a low-resolution model. The second, which I call
 forecast spread adjustment, allows the ensemble spread during the model evolution to be
 different from the forecast uncertainty, increasing or decreasing the nonlinear effects. This
 can improve upon previous techniques that restrict the ensemble spread to be commen-
 surate with the forecast uncertainty during both the forecast and analysis phases of data
 assimilation.
 Chapter 1 introduced data assimilation, in particular ensemble Kalman filters. With
 an ensemble of forecasts, as opposed to a single forecast, we can estimate the uncertainty
 in the forecast. Data assimilation balances the uncertainties in the ?background? (forecast)
 and in the observations and produces a better ?analysis? estimate of the true model state.
 The uncertainties in the background, observations, and analysis are quantified as error
 covariances.
 96
Current ensemble methods assume the ensemble is evolved at the same resolution,
 with possibly one higher resolution control forecast. In Chapter 2, I developed and tested
 an ensemble data assimilation technique which incorporates ensembles of two resolutions.
 Higher resolution is desirable for more accurate forecasts; a larger ensemble is desirable
 for improved sample statistics when estimating the error covariance. However, due to
 limited computational resources, single resolution ensembles compromise between the
 ensemble size and resolution. My research shows that combining a small high-resolution
 ensemble with a large low-resolution ensemble improves the trade-off and improves the
 accuracy of the analysis for the same computation cost when compared to single resolu-
 tion ensembles or even ensembles which include one high resolution control forecast. I
 compared the forecast computation time to the RMSE for different ensemble sizes of both
 mixed and single resolution, assimilating with the mLETKF (mixed-resolution LETKF)
 and the LETKF (Local Ensemble Transform Kalman Filter) respectively. I tested perfect
 model scenarios and scenarios with model error. I also designed and tested a scenario
 where it was possible to resolve the small scale variability of the high resolution model.
 In all cases I showed improvement for mixed resolution ensembles over both single reso-
 lution ensembles and ensemble with only one high resolution member. Future possibili-
 ties for expanding upon this research include: applying it to operational weather models;
 allowing the weight parameter ? that controls the relative influence between the two sub-
 97
ensembles to vary with scale, e.g., allowing the large scale component of the covariance
 to come equally from the high and low resolution ensembles while the small scale compo-
 nent comes only from the high resolution ensemble (C. Bishop, personal communication);
 and more fully exploring the relationship between ? and the ensemble spread.
 In current ensemble methods, the ensemble not only provides background error co-
 variance used to determine the analysis, but the ensemble spread also continues to corre-
 spond to the uncertainty as it is evolved. However, the ensemble only needs to be com-
 mensurate with the background uncertainty during the analysis phase of data assimilation.
 In Chapter 3 I explored the possibility of evolving an ensemble whose spread is not com-
 mensurate with the uncertainty during its evolution, readjusting the spread to be commen-
 surate with the uncertainty during the analysis phase. I call this technique forecast spread
 adjustment. In addition to testing this technique, I also discussed relationships between
 forecast spread adjustment and other techniques. When the forecast adjustment parameter
 ? is small and the ensemble size is large, the ensemble Kalman filter (EnKF) will evolve
 the same covariance as the Extended Kalman Filter (XKF). Previous authors have demon-
 strated the equivalence between the analysis phases of EnKF and the XKF (e.g. Burgers et
 al. 1998); my research connects them during their forecast phases as well. I also show how
 forecast spread adjustment can be used to transform background error covariance inflation
 into analysis error covariance inflation or to combine the two approaches. Some authors
 98
have discussed inflating the observation error covariance as well, assuming that it is un-
 derestimated (Stroud and Bengtsson 2007) or misrepresented (Li et al. 2009). Forecast
 spread adjustment produces the same background and analysis means as observation error
 covariance inflation for ensembles that are initialized during the forecast phase. When the
 observation error is correctly specified however, the background and analysis errors will
 be more commensurate with the covariances specified by forecast spread adjustment rather
 than by observation error covariance inflation.
 My results also show that forecast spread adjustment can lead to significant im-
 provements in the accuracy of ensemble data assimilation when model error is present,
 though the improvement depends on the nature of the model error. In my experiments,
 improvement occurred in cases where ensemble averaging compensates for model biases.
 Future research may include further investigation of the reasons behind improvements due
 to forecast spread adjustment, extensions of the method to help in cases where forecast
 spread adjustment shows little improvement, development of adaptive methods for tuning
 ?, investigation of forecast spread adjustment in conjunction with ensemble forecasting.
 99
Bibliography
 Anderson, J.L, 2007: Exploring the need for localization in ensemble data assimilation
 using a hierarchical ensemble filter. Physica D, 230, 99-111.
 Anderson, J.L., and S.L. Anderson, 1999: A Monte Carlo Implementation of the Nonlin-
 ear Filtering Problem to Produce Ensemble Assimilations and Forecasts. Mon. Wea. Rev.,
 127, 2741-2758.
 Ballabrera-Poy J., E. Kalnay, and S-C. Yang, 2009: Data assimilation in a system with
 two scales?combining two initialization techniques. Tellus A, 61, 539-549.
 Bishop, C.H., B.J. Etherton, and S.J. Majumdar, 2001: Adaptive sampling with the en-
 semble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420-
 436.
 Bishop, C.H., Hodyss, D., 2009: Ensemble covariances adaptively localized with ECO-
 RAP. Part 1: tests on simple error models. Tellus A 61, 84-96.
 Bonavita, M., L. Torrisi, and F. Marcucci, 2008: The ensemble Kalman filter in an opera-
 tional regional NWP system: Preliminary results with real observations. Q.J.R. Meteorol.
 Soc., 134, 1733-1744.
 Buizza, R., P.L. Houtekamer, Z. Toth, G. Pellerin, M. Wei, and Y. Zhu, 2005: A Compar-
 ison of the ECMWF, MSC, and NCEP Global Ensemble Prediction Systems. Mon. Wea.
 Rev., 133, 1076-1097.
 Buizza, R., J.-R. Bidlot, N. Wedi, M. Fuentes, M. Hamrud, G. Holt, and F. Vitart, 2007:
 The new ECMWF VAREPS (Variable Resolution Ensemble Prediction System). Q.J.R.
 Meteorol. Soc., 133, 681-695.
 Burgers, G., P.J. van Leeuwen, G. Evensen, 1998: Analysis scheme in the ensemble
 Kalman filter. Mon. Wea. Rev., 126, 1719-1724.
 Du,J., 2004: Hybrid ensemble prediction system: A new ensembling approach. Preprints.
 Symp. 50th Anniversary Operational Numerical Weather Prediction, College Park, MD,
 Amer. Meteor. Soc., Paper 4.2.
 Evensen, G., 1992: Using the Extended Kalman Filter with a Multilayer Quasi-
 Geostrophic Ocean Model. J. Geophys. Res. 97, 17905-17924.
 100
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasigeostrophic model
 using Monte Carlo methods to forecast error statistics. J. Geophys. Res. 99, 10143-10162.
 Evensen, G., 2003: The Ensemble Kalman Filter: theoretical formulation and practical
 implementation. Ocean Dyn., 53, 343-367.
 Gao, J., and M. Xue, 2008: An efficient dual-resolution approach for ensemble data
 assimilation and tests with assimilated Doppler radar data. Mon. Wea. Rev., 136, 945-
 963.
 Gao, J., M. Xue, and D. J. Stensrud, 2010: The development of a hybrid EnKF-3DVar
 algorithm for storm-scale data assimilation. 25th Conf. Several Local Storms, Denver,
 CO, Amer. Meteor. Soc., Paper 7.4.
 Greybush, S.J., E. Kalnay, T. Miyoshi, K. Ide, B.R. Hunt, 2011: Balance and Ensemble
 Kalman Filter Localization Techniques. Mon. Wea. Rev., 139, 511-522.
 Hamill, T.M., and C. Snyder, 2000: A Hybrid Ensemble Kalman Filter-3D Variational
 Analysis Scheme. Mon. Wea. Rev., 128, 2905-2919.
 Hamill, T.M., J.S. Whitaker, C. Snyder, 2001: Distance-dependent filtering of back-
 ground error covariance estimates in an ensemble Kalman filter, Mon. Wea. Rev., 129,
 2776-2790.
 Houtekamer, P.L., H.L. Mitchell, 1998: Data assimilation using an ensemble Kalman
 filter technique. Mon. Wea. Rev., 126, 796-811.
 Hunt, B.R., E. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spa-
 tiotemporal chaos: a local ensemble transform Kalman filter. Physica D, 230, 112-126.
 Jazwinski, A.H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 376
 pp. Reprint by Dover Publications in 2007.
 Kalman, R.E., A new approach to linear filtering and prediction problems, Trans. ASME,
 Ser. D, J.Basic Eng., 82, 35-45.
 Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cam-
 bridge University Press, 341 pp.
 Kleist, D., 2012: An evaluation of hybrid variation-ensemble data assimilation for the
 NCEP GFS. PhD thesis, University of Maryland College Park.
 101
Leith, C. E., 1974: Theoretical skill of Monte Carlo forecasts, Mon. Wea. Rev., 102,
 409-418.
 Li, H., E. Kalnay, and T. Miyoshi, 2009: Simultaneous estimation of covariance inflation
 and observation errors within an ensemble Kalman filter. Q.J.R. Meteorol. Soc., 135,
 523-533.
 Lorenc, A.C., 2003: The potential of the ensemble Kalman filter for NWP?A compari-
 son with 4D-VAR. Q.J.R. Meteorol. Soc., 129, 3183-3203.
 Lorenz, E.N., 1996: Predictability?A problem partly solved. Proc. Seminar on Pre-
 dictability, Vol. 1, ECMWF, Reading, Berkshire, UK, 1-18.
 Lorenz, E.N., 2005: Designing Chaotic Models. J. Atmos. Sci., 62, 1574-1587.
 Lorenz, E.N., and K.A. Emanuel, 1998: Optimal sites for supplementary weather obser-
 vations: Simulation with a small model. J. Atmos. Sci., 55, 399-414.
 Miyoshi, T., 2011: The Gaussian approach to adaptive covariance inflation and its im-
 plementation with the Local Ensemble Transform Kalman Filter. Mon. Wea. Rev., 139,
 1519-1535.
 Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble
 prediction system: Methodology and validation. Q.J.R. Meteorol. Soc., 122, 73-119.
 Ott, E., B.R. Hunt, I. Szunyogh, A.V. Zimin, E.J. Kostelich, M. Corazza, E. Kalnay,
 D.J. Patil, and J.A. Yorke, 2004: A local ensemble Kalman filter for atmospheric data
 assimilation. Tellus A, 56, 415-428.
 Toth, Z., and E. Kalnay, 1993: Ensemble Forecasting at NMC: The Generation of Pertur-
 bations. Bull. Amer. Meteorol. Soc., 74, 2317-2330.
 Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method.
 Mon. Wea. Rev., 125, 3297-3319.
 Stroud, J.R., and T. Bengtsson, 2007: Sequential State and Variance Estimation within
 the Ensemble Kalman Filter. Mon. Wea. Rev., 135, 3194-3208.
 Wang, X., C.H. Bishop, and S.J. Julier, 2004: Which is better, an ensemble of positive-
 negative pairs or a centered spherical simplex ensemble?. Mon. Wea. Rev., 132, 1590-
 1605.
 102
Wang, X., C. Snyder, and T. M. Hamill, 2007: On the theoretical equivalence of differ-
 ently proposed ensemble/3D-Var hybrid analysis schemes. Mon. Wea. Rev., 135, 222-227.
 Whitaker, J.S., and T.M. Hamill, 2002: Ensemble data assimilation without perturbed
 observations. Mon. Wea. Rev., 130, 1913-1924.
 Yang, S-C, E. Kalnay, B. Hunt, and N.E. Bowler, 2009: Weight interpolation for efficient
 data assimilation with the Local Ensemble Transform Kalman Filter. Q.J.R. Meteorol.
 Soc., 135, 251-262.
 Yoon, Y-N, and E. Ott, 2010: On the Propagation of Information and the Use of Local-
 ization in Ensemble Kalman Filtering. J. Atmos. Sci., 67, 3823-3834.
 Zhang, M., and F. Zhang, 2012: E4DVar: Coupling an Ensemble Kalman Filter with
 Four-Dimensional Variational Data Assimilation in a Limited-Area Weather Prediction
 Model. Mon. Wea. Rev., 140, 587-600.
 Zupanski, M., 2005: Maximum likelihood ensemble filter: Theoretical aspects. Mon.
 Wea. Rev., 133, 1710-1726.
 103