ABSTRACT

Title of dissertation: ASSIMILATING SATELLITE OBSERVATIONS WITH A LOCAL ENSEMBLE KALMAN FILTER

Elana Judith Fertig, Doctor of Philosophy, 2007

Dissertation directed by: Professor Brian R. Hunt, Department of Mathematics

Numerical weather prediction relies on data assimilation to estimate the current state of the atmosphere. Generally speaking, data assimilation methods combine information from observations and from a prior forecast state, taking into account their respective uncertainties. Ensemble-based data assimilation schemes estimate the forecast uncertainty with the sample covariance from an ensemble of forecasts. While these schemes have been shown to successfully assimilate conventional observations of model state variables, they have only recently begun to assimilate satellite observations. This dissertation explores some of the complications that arise when ensemble-based schemes assimilate satellite observations.

Although ensemble data assimilation schemes often assume that observations are taken at the time of assimilation, satellite observations are available almost continuously between consecutive assimilation times. In Chapter 2, we formulate a "four-dimensional" extension to ensemble-based schemes that is analogous to the operationally used scheme 4D-VAR. Using perfect model experiments with the Lorenz-96 model, we find that the four-dimensional ensemble scheme can perform comparably to 4D-VAR.

Many ensemble data assimilation schemes utilize spatial localization so that a small ensemble can capture the unstable degrees of freedom in the model state. These local ensemble-based schemes typically allow the analysis at a given location to depend only on observations near that location. Meanwhile, the location of satellite observations cannot be pinpointed in the same manner as conventional observations. In Chapter 3, we propose a technique to update the state at a given location by assimilating satellite radiance observations that are strongly correlated to the model state there. For satellite retrievals, we propose incorporating the observation error covariance matrix and selecting the retrievals that have errors correlated to observations near the location to be updated. Our selection techniques improve the analysis obtained when assimilating simulated satellite observations with a seven-layer primitive equation model, the SPEEDY model.

Finally, satellite radiance observations are subject to state-dependent, systematic errors due to errors in the radiative transfer model used as the observation operator. In Chapter 4 we propose applying state-space augmentation to ensemble-based assimilation schemes to estimate satellite radiance biases during the data assimilation procedure. Our approach successfully corrects such systematic errors in simulated biased satellite observations with the SPEEDY model.

ASSIMILATING SATELLITE OBSERVATIONS WITH A LOCAL ENSEMBLE KALMAN FILTER

by Elana Judith Fertig

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2007

Advisory Committee:
Professor Brian R. Hunt, Chair/Advisor
Dr. Ricardo Todling, NASA Advisor
Professor Edward Ott
Professor Istvan Szunyogh
Professor Harry Tamvakis

© Copyright by Elana Judith Fertig, 2007

Acknowledgments

I owe thanks to many people for making this dissertation possible. I would like to first thank my entire dissertation committee.
Thanks to Brian Hunt, Edward Ott, and Istvan Szunyogh, who helped me develop and pursue relevant research projects. I have grown immensely from their advice and consideration. I also owe thanks to Ricardo Todling for the countless hours he dedicated to bring a scaled-down version of the NASA fvGCM operational model to the University of Maryland.

Thanks to the entire Chaos Weather group for giving me the opportunity to work on real-world systems from early in my graduate school career and countless public speaking opportunities. Eugenia Kalnay has been optimistic and supportive in all the projects I have pursued. Thanks to Eric Kostelich for his continual coding efforts in developing and improving the LETKF system. I owe both Ricardo Todling and Eric Kostelich further thanks for their patience in teaching me, Hong Li, and Junjie Liu their systems. I am also grateful for the time I spent working with José Aravéquia, Hong Li, and Junjie Liu. Thanks to their patience, advice, teaching, and friendship during our countless hours coding together, I learned and achieved things I never would have been able to do on my own. A special thank you to Gyorgyi Gyarmati, Alfredo Nava-Tudela, and Aleksey Zimin for fighting to make sure that the computers always ran smoothly. Thanks to Alverda McCoy for assisting me in jumping through all the graduate school hoops. I would also like to thank David Levermore, the ESSIC staff, and the NASA Goddard staff for generously arranging and providing the funding for my time at the University of Maryland.

I would like to thank Arlindo DaSilva, Jim Jeung, Jim Purser, and Jeff Whitaker for their helpful suggestions for my research. Thanks, too, to all my fellow graduate students, who were always there as a support. I would like to especially acknowledge Seung-Jong Baek, Chris Danforth, Amy Finkbiner, Fernando Galaz-Garcia, JT Halbert, Russ Halper, John Harlim, Matt Hoffman, David Kuhl, Aaron Lott, Dagmar Merkova, Elizabeth Satterfield, Suzanne Sindi, Poorani Submaranian, and James White for their suggestions, advice, and conversations, helpful both in science and beyond. I owe a special thanks to Janet Gusdorff for always listening to and believing in me.

Last, but certainly not least, I would like to thank my entire family. Thanks for support from Ari Davidow, Ari Fertig, Arnie Fertig, Gail Fertig, and Judith Pinnolis. I also thank my grandparents, Leo Goldstein, Sybil Goldstein, Arlene Kamin, Robert Kamin, Gina Klein, Myron Klein, Eileen Pinnolis, Deborah Shain, Murray Silverman, Bella Weiner, and Samuel Weiner, for their continual kvelling about me. Thanks to my parents, Merom and Louise Klein, for helping me keep my priorities in order and goals in mind. Thanks, too, to my parents, Harriet and Louis Weiner, for their support and scientific advice. Thanks to my brothers, Ken and David Weiner, for helping me to keep things in perspective, making me laugh, and always being interested in my research. Finally, I would like to thank my husband, Ben Fertig, for his love, humor, and kindness. His continual support of me and conviction in my abilities from qualifying exams to dissertation has been integral to my completing my PhD.

Table of Contents

List of Tables
List of Figures

1 Introduction
  1.1 Motivation
  1.2 Data Assimilation
  1.3 Assimilating Satellite Observations
  1.4 Outline of Thesis
  1.5 Figures

2 A Comparative Study of 4D-VAR and a 4D Ensemble Kalman Filter
  2.1 Introduction
  2.2 Formulation
    2.2.1 4D-VAR
    2.2.2 4D-LETKF
  2.3 Results
  2.4 Summary
  2.5 Figures

3 Assimilating Nonlocal Observations with a Local Ensemble Kalman Filter
  3.1 Introduction
  3.2 Background
    3.2.1 Ensemble Kalman Filter
    3.2.2 Localization
    3.2.3 Assimilating Nonlocal Observations
    3.2.4 SPEEDY Model
  3.3 Experimental Design
  3.4 Results
    3.4.1 Grid Point Observations
    3.4.2 Radiance Observations
    3.4.3 Retrieval Observations
  3.5 Conclusions
  3.6 Appendix: Covariance Localization
  3.7 Appendix: Diagonalization
  3.8 Figures
  3.9 Tables

4 Satellite Observation Bias Correction with an Ensemble Kalman Filter
  4.1 Introduction
  4.2 Satellite Radiance Biases
    4.2.1 Online Bias Correction
    4.2.2 Online Bias Correction with an Ensemble Kalman Filter
  4.3 Experimental Design
    4.3.1 SPEEDY Model
    4.3.2 Community Radiative Transfer Model
    4.3.3 Truth and Observations
    4.3.4 LETKF and Bias Correction
  4.4 Results
  4.5 Conclusions
  4.6 Figures
  4.7 Tables

List of Tables

3.1 The observation error covariance for the simulated retrieval observations at a fixed horizontal location.
3.2 The correlation between the errors of observations at a single horizontal point and model levels k and l is given by the entry in this table corresponding to these model levels.
4.1 The global, February average of the temperature analysis RMS error (K) at each model level for selected combinations of predictors. Here, 6% variance inflation is applied to the analysis variables and 6% variance inflation is applied to the bias parameters.
4.2 As for Table 4.1. Here, 6% variance inflation is applied to the analysis variables and 12% variance inflation is applied to the bias parameters.
4.3 The global, February average of the observation RMS error (K) for representative channels when applying the bias correction parameters estimated using 6% variance inflation for the analysis variables and 12% for the bias parameters.

List of Figures

1.1 A schematic analysis cycle used in data assimilation schemes, based on Figure 1.4.2 in Kalnay (2002).
1.2 The horizontal observational distribution within ±3 hours of a typical analysis step, from Figure 1.4.1 in Kalnay (2002).
2.1 The mean analysis error as a function of the analysis window for 4D-VAR and as a function of time between analyses for 4D-LETKF.
2.2 The mean forecast errors as a function of forecast time for 4D-VAR and 4D-LETKF.
3.1 A schematic for the observation selection schemes used for nonlocal, radiance-like observations.
3.2 Locations of rawinsonde observations used for the SPEEDY model.
3.3 Average analysis RMS error from assimilating rawinsondes with different amounts of vertical localization.
3.4 Average analysis RMS error from assimilating simulated rawinsonde observations and satellite radiances.
3.5 Average analysis RMS error from assimilating simulated rawinsonde observations and satellite retrievals.
3.6 Comparing the analysis RMS error from assimilating simulated rawinsondes and radiances with that from assimilating simulated rawinsondes and satellite retrievals.
4.1 Weighting function of representative channels.
4.2 The true bias for two representative channels.
4.3 The correlation between the true bias and each of the predictors for each satellite channel.
4.4 The 500 hPa temperature analysis RMS error for each analysis time.
4.5 The spread in the background bias parameters with predictors p1, p3, and p4.
4.6 As for Figure 4.4. Here, the analysis variables are inflated by 6% while the bias parameters are inflated by 12%.
4.7 As for Figure 4.5. Here, the analysis variables are inflated by 6% while the bias parameters are inflated by 12%.
4.8 Temperature analysis RMS error at each model level.
4.9 Observation RMS error for each satellite channel.
4.10 The results from using p1 to correct only channels above 701.618 cm⁻¹.
Chapter 1

Introduction

1.1 Motivation

Weather models forecast the future state of the atmosphere from an estimate of the current state of the atmosphere. Even if the model perfectly describes the atmospheric dynamics, for sufficiently long forecast lead times the chaotic nature of the model can turn small errors in the initial conditions into dramatic errors in forecasts. To minimize these errors, data assimilation schemes seek an accurate estimate for the current state of the atmosphere using both current and past observations. Current observations are incorporated into the analysis when the data assimilation is performed. Past observations are incorporated by using a forecast of the initial condition estimated by the data assimilation scheme as the prior forecast the next time that data assimilation is performed (Figure 1.1, based on Figure 1.4.2 in Kalnay, 2002).

Historically, in order to reduce computational cost, several data assimilation schemes, including 3D-VAR and PSAS, assumed that the forecast uncertainty is constant. While such schemes have been successful for operational data assimilation, their estimates are subject to "errors of the day" (Kalnay, 2002; Liu et al., 2007). The next generation of data assimilation schemes avoid these errors by using flow-dependent uncertainties, implicitly in 4D-VAR and explicitly in ensemble Kalman filter schemes. Several studies have shown the potential of the developing ensemble Kalman filter schemes for operational use (e.g., Houtekamer et al., 2005; Szunyogh et al., 2005; Whitaker et al., 2004; Houtekamer and Mitchell, 2005; Whitaker et al., 2007; Liu et al., 2007; Szunyogh et al., 2007).

Many studies with ensemble Kalman filters have focused on assimilating conventional observations, which are direct measurements of model state variables available at irregular locations (Figure 1.2, based on Figure 1.4.1 in Kalnay, 2002). Although satellite observations do not measure model variables directly, they cover more of the globe than do conventional observations (Figure 1.2, middle right panel). Satellite observations are known to improve the estimates for the current state of the atmosphere in regions like the Southern Hemisphere and oceans, which lack conventional data (e.g., Kalnay, 2002). Therefore, studies with ensemble Kalman filters are starting to incorporate satellite observations (e.g., Houtekamer et al., 2005; Houtekamer and Mitchell, 2005). Ensemble schemes must account for complications that arise because satellite observations are available continuously between analysis times, measure weighted vertical averages of the atmospheric state at a single horizontal location, and have significant errors relative to the observation operator in use. This dissertation examines these aspects of satellite data assimilation with an ensemble Kalman filter.

The data assimilation methods described in this section are discussed in more detail in Section 1.2. Then, Section 1.3 describes satellite data assimilation and Section 1.4 outlines the complications associated with assimilating satellite observations that are addressed in the following chapters of this dissertation.

1.2 Data Assimilation

Data assimilation estimates the current atmospheric state by combining information from a prior forecast of this state, called the background ($x^b$), with atmospheric observations ($y$).
How these two sources of information are weighted in the combination depends on their respective uncertainties, as given by their error covariance matrices ($P^b$ for $x^b$ and $R$ for $y$). This estimate, or analysis state ($x^a$), and possibly its error covariance matrix ($P^a$), are then used as an initial condition for future forecasts and the analysis cycle continues (Figure 1.1).

The observations used in data assimilation schemes are assumed to be related to the atmospheric state by $y = h(x^t) + \epsilon$, where $x^t$ represents the unknown true state of the atmosphere. The error term $\epsilon$ has covariance $R = \langle \epsilon \epsilon^\top \rangle$, where $\top$ denotes the transpose. The observation operator $h$ transforms atmospheric model states into observation space. The $h$ operator primarily performs spatial interpolations for most conventional observations, but may be a more complex, nonlinear operator for observations like satellite radiances.

Assuming that the errors of the background and observations are Gaussian distributed and unbiased, a least-squares approach can combine these measures to obtain the analysis state. This state can be found by maximizing a likelihood function, or equivalently, minimizing the following cost function

$$2J(x) = (x - x^b)^\top (P^b)^{-1} (x - x^b) + (y - h(x))^\top R^{-1} (y - h(x)) \qquad (1.1)$$

(e.g., Kalnay, 2002). Some operational data assimilation schemes, including 3D-VAR, find the analysis state by directly minimizing this cost function using a constant background error covariance $P^b$. This latter assumption is made for computational efficiency, but may degrade the quality of the analysis and introduce time-varying "errors of the day" (Kalnay, 2002; Liu et al., 2007). 4D-VAR schemes (the "strong constraint" formulation is described in more detail in Chapter 2) overcome these limitations by modifying the cost function in Equation (1.1) to seek the model trajectory which best fits all available observations within some time window. In this way, 4D-VAR can implicitly account for time-dependent errors (e.g., Thépaut et al., 1993).

While the data assimilation techniques described above accurately estimate the initial state of the atmosphere, they do not directly incorporate dynamic estimates of the forecast uncertainty. By using approximations to the Kalman filter equations, ensemble-based schemes can efficiently estimate and utilize the forecast uncertainty when obtaining the analysis. We formulate the Kalman filter equations and related ensemble-based schemes in the remainder of this section.

In addition to the assumptions made about the observations, for the purpose of presenting the classical Kalman filter technique, we also assume that the analysis state and its error covariance are propagated by a linear forecast model $M$. Then, if at analysis time $t_{n-1}$ the analysis state is $x^a_{n-1}$ and its uncertainty is given by the covariance matrix $P^a_{n-1}$, the background state and covariance at time $t_n$ are given by

$$x^b_n = M x^a_{n-1}, \qquad (1.2)$$
$$P^b_n = M P^a_{n-1} M^\top. \qquad (1.3)$$
In this way, an estimated analysis state with Gaussian errors is forecasted into a background state with Gaussian errors, and the Kalman filter (Kalman, 1960; Kalman and Bucy, 1961) can be applied to estimate the analysis state and its covariance at time $t_n$. (In the remainder of this section we consider all variables at time $t_n$ only, so we drop the time index.) Specifically, the analysis equations are given by

$$x^a = x^b + K(y - h(x^b)), \qquad (1.4)$$
$$P^a = (I - KH) P^b, \qquad (1.5)$$
$$K = P^b H^\top (H P^b H^\top + R)^{-1}, \qquad (1.6)$$

where $H$ is the linearization of the observation operator $h$ around the background, $H = \partial h / \partial x \, (x^b)$. If $h$ is linear, then the analysis state given by Equations (1.4) and (1.6) is the minimum of the cost function in Equation (1.1). Furthermore, in this case the Kalman filter provides the minimum variance unbiased estimate for the analysis state (Kalman, 1960; Kalman and Bucy, 1961), and, if the dynamics were linear, then the Kalman filter is equivalent to 4D-VAR in the limit of long time windows.

In the formulation of the Kalman filter in the previous paragraph, we assumed that the forecast model is linear. For a nonlinear forecast model, the extended Kalman filter provides an approach to replace the linear model in Equation (1.2) with the nonlinear forecast model, and linearizes this model about the forecast trajectory to obtain the matrix $M$ in Equation (1.3).

If the model state has dimension $m$, then $P^a$, $P^b$, and $M$ are $m \times m$ matrices. For operational weather models, $m$ is more than $10^7$–$10^8$. Therefore, propagating the error covariance as proposed in Equation (1.3) is prohibitively expensive to implement. Ensemble Kalman filter techniques reduce this cost by approximating Equation (1.3) with the sample covariance of an ensemble of forecasts made from a set of initial conditions whose perturbations from the analysis state are chosen according to the analysis covariance in Equation (1.5). If there are enough ensemble members and the forecast model is roughly linear over the forecast time, this approach will ideally come close to the optimality of the Kalman filter analysis (Evensen, 1994).

Specifically, ensemble Kalman filters forecast a set of perturbed initial conditions, creating an ensemble of $k$ forecasts, each denoted by $x^{b(i)}$. The background state $x^b$ is the ensemble mean of these states,

$$x^b = k^{-1} \sum_{i=1}^{k} x^{b(i)}. \qquad (1.7)$$

The sample covariance of the ensemble of forecasts yields a time-dependent background error covariance,

$$P^b = \frac{\sum_{i=1}^{k} (x^{b(i)} - x^b)(x^{b(i)} - x^b)^\top}{k - 1}. \qquad (1.8)$$

The analysis state and its error covariance are obtained by substituting Equations (1.7) and (1.8) into Equations (1.4)–(1.6).

The ensemble Kalman filter also determines an ensemble of analysis states based on the analysis state and its error covariance. The forecast model is then applied to evolve each analysis ensemble member to obtain the background ensemble at the next analysis step. The means of generating this ensemble differentiates ensemble-based schemes. In many ensemble schemes, the analysis for the $i$th ensemble member is given by

$$x^{a(i)} = x^a + X^{a(i)}. \qquad (1.9)$$

Here $X^{a(i)}$ represents the $i$th column of an $m \times k$ "square root" matrix $X^a$, for which $X^a (X^a)^\top = (k-1) P^a$. The choice of the particular square root further distinguishes several ensemble Kalman filter schemes (Tippett et al., 2003).
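To make the ensemble update concrete, the following is a minimal NumPy sketch (our illustration, not part of the original dissertation) of a single ensemble Kalman filter analysis step built directly from Equations (1.4)–(1.8). The function name, the toy dimensions, and the use of a linear observation operator supplied as a matrix are assumptions of the sketch.

```python
import numpy as np

def enkf_analysis(Xb, y, H, R):
    """One ensemble Kalman filter analysis step, following Eqs. (1.4)-(1.8).

    Xb : m x k array of background ensemble states, one member per column
    y  : s-vector of observations
    H  : s x m (linear) observation operator
    R  : s x s observation error covariance
    """
    m, k = Xb.shape
    xb = Xb.mean(axis=1)                              # ensemble mean, Eq. (1.7)
    Pb = np.cov(Xb)                                   # sample covariance, Eq. (1.8)
    K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R)    # Kalman gain, Eq. (1.6)
    xa = xb + K @ (y - H @ xb)                        # analysis mean, Eq. (1.4)
    Pa = (np.eye(m) - K @ H) @ Pb                     # analysis covariance, Eq. (1.5)
    return xa, Pa

# Toy usage: 40 state variables, 10 members, direct observations of 5 variables.
rng = np.random.default_rng(0)
Xb = rng.normal(size=(40, 10))
H = np.eye(5, 40)
xa, Pa = enkf_analysis(Xb, rng.normal(size=5), H, 0.5 * np.eye(5))
```

Note that the sketch forms $P^b$ explicitly for clarity; the schemes discussed below avoid this by working in the $k$-dimensional space spanned by the ensemble.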
Unlike the standard modifications of the Kalman filter, such as the extended Kalman filter, an ensemble Kalman filter does not require explicitly linearizing the forecast model; the ensemble forecast implicitly gives information about the linearization. Furthermore, $P^b$ has only $k - 1$ linearly independent perturbations, where we assume $k$ to be small compared to $m$. Presuming that this limited number of ensemble members captures the model dynamics, the number of degrees of freedom in the Kalman filter equations is reduced. Several ensemble schemes, including the Ensemble Transform Kalman Filter (ETKF, Bishop et al., 2001), the Local Ensemble Kalman Filter (LEKF, Ott et al., 2004), and the Local Ensemble Transform Kalman Filter (LETKF, Hunt et al., 2007), exploit this low dimensionality to solve Equations (1.4)–(1.6) in the space spanned by the ensemble, reducing the cost associated with solving these equations.

In most operational settings, ensemble Kalman filter schemes are only computationally feasible for fewer than 100 ensemble members. These few ensemble members are unable to capture all of the unstable degrees of freedom in the global model state. Furthermore, globally using so few ensemble members may introduce spurious correlations in the background error covariance estimated in Equation (1.8). However, as shown by Patil et al. (2001), atmospheric dynamics tend to be low dimensional within local regions. Therefore, many ensemble Kalman filter schemes employ some form of localization. Ideally, the limited ensemble size will be sufficient to represent the state uncertainty within these local regions (Ott et al., 2004; Szunyogh et al., 2005).

1.3 Assimilating Satellite Observations

Satellites observe the radiance emitted from the atmosphere at particular window frequencies continually during their orbit. Accordingly, these instruments can provide data assimilation systems a wealth of information about traditionally data-poor regions, such as the Southern Hemisphere and oceans (Figure 1.2). However, this information is less straightforward to assimilate than conventional observations, which directly measure model state variables.

Because satellites observe continuously, they provide information at greater frequency than the analysis is performed. However, in many of the data assimilation schemes presented in the previous section (but not all, e.g., 4D-VAR), the observations are assumed to be taken at the analysis time. Therefore, the data assimilation schemes must account for the model dynamics between analysis times when assimilating satellite observations. 4D-VAR schemes naturally handle this because they consider model trajectories over an analysis time window. Evensen and van Leeuwen (2000), Anderson (2001), and Hunt et al. (2004) have made similar modifications to the ensemble Kalman filter schemes to assimilate asynchronous observations. These "four-dimensional" ensemble-based schemes are beginning to be applied to operational models and data (Houtekamer and Mitchell, 2005; Szunyogh et al., 2007; Whitaker et al., 2007).

Further challenges arise because, unlike conventional observations, satellite radiance observations only indirectly represent the atmospheric variables used in the forecast models, such as temperature and humidity. For example, in the case of radiance observations, the relationship between the observations and the relevant model state variables (i.e., temperature and humidity) is given by a radiative transfer model.
Accordingly, the observation operator for each radiance observation is given by

$$h(x) = \epsilon_\nu(0)\,\tau_\nu(0)\,B_\nu(T(0)) + \int_0^\infty B_\nu(T(z))\,\frac{\partial \tau_\nu(z)}{\partial z}\,dz, \qquad (1.10)$$

where $\nu$ represents the observing frequency, $z = 0$ represents the surface, $\epsilon_\nu(0)$ is the emissivity of the surface, $\tau_\nu(z)$ is the atmospheric transmittance, $B_\nu(T(z))$ is the Planck function, and $\partial \tau_\nu / \partial z$ is called the weighting function because it indicates how strongly a radiance observation depends on the Planck function at a particular height (e.g., Liou, 2002).
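As an illustration (ours, not the dissertation's), a discretized version of Equation (1.10) makes the nonlocal character of this operator easy to see: the simulated radiance below mixes the temperature from every model level. The function name, the stub Planck function, and the toy profiles are assumptions of the sketch.

```python
import numpy as np

def radiance_observation(T, tau, emissivity=0.98):
    """Discretized sketch of the radiative transfer operator in Eq. (1.10).

    T   : temperature on model levels, ordered surface to top
    tau : transmittance from each level to the top of the atmosphere
          (tau[0] corresponds to the surface)
    The Planck function is stubbed as the identity, so the result is a
    transmittance-weighted average temperature rather than a true radiance.
    """
    B = lambda t: t                         # placeholder for B_nu(T)
    dtau = np.diff(tau)                     # discrete weighting function
    surface = emissivity * tau[0] * B(T[0])
    # trapezoidal weighting of B(T) by the layer-to-layer transmittance change
    layers = np.sum(0.5 * (B(T[:-1]) + B(T[1:])) * dtau)
    return surface + layers

T = np.linspace(288.0, 210.0, 7)            # seven model levels, surface to top
tau = np.linspace(0.2, 1.0, 7)              # transmittance increases with height
print(radiance_observation(T, tau))         # depends on T at every level
```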
Because Equation (1.10) involves an integral over all model levels, the radiance observations depend on the atmospheric state in a fundamentally nonlocal manner. This nonlocal nature presents a problem for previous ensemble Kalman filter schemes that rely on localization to assimilate radiances (see Chapter 2). Furthermore, there are often systematic errors in the radiative transfer model, which often arise from errors in calibrating the spectroscopic data associated with the satellite (Eyre, 1992). These errors can introduce a state-dependent bias to the satellite observations, contradicting the assumption of unbiased observation errors made in all the data assimilation systems and harming the resulting analysis.

1.4 Outline of Thesis

In this dissertation, we explore some of the complications arising from assimilating satellite observations that are described in the previous section. Each of the following chapters has been prepared for publication in Tellus A. Therefore, each chapter has been written to stand alone and may have slight differences in notation.

In Chapter 2, we formulate a four-dimensional Ensemble Kalman Filter (the 4D-Local Ensemble Transform Kalman Filter, 4D-LETKF) based on Hunt et al. (2004) to assimilate observations available between analysis times, such as satellite observations. This scheme minimizes a cost function similar to that of the 4D-VAR method described in Section 1.2, but in the space spanned by the ensemble trajectories. Using perfect model experiments with the Lorenz-96 model, we compare assimilation of simulated asynchronous observations with 4D-VAR and 4D-LETKF. We find that both schemes have comparable error when 4D-LETKF is performed sufficiently frequently and when 4D-VAR is performed over a sufficiently long analysis time window. We explore how the error depends on the time between analyses for 4D-LETKF and the analysis time window for 4D-VAR. This chapter has appeared in Tellus A as Fertig et al. (2007b).

In Chapter 3, we explore the nonlocality associated with satellite observations and the challenges it presents for local ensemble Kalman filter schemes. We propose a technique to assimilate these nonlocal observations in which the observation operator is applied to the global model state and then appropriate observations are selected to estimate the atmospheric state for each model grid point. The issue of how to choose appropriate observations is investigated with numerical experiments using the ensemble scheme LETKF on a seven-layer primitive equation model, the Simplified Parameterizations, primitivE-Equation DYnamics (SPEEDY) model. We run numerical experiments that assimilate simulated point and nonlocal radiance-like observations and numerical experiments that assimilate simulated point and retrieval-like observations. For radiances, the best analysis results are obtained from a scheme that updates the state at a given location by assimilating all those radiance observations that depend strongly on the model state near that location. For retrievals, we assume a known non-diagonal error covariance matrix. We use a scheme that updates the state at a given location by assimilating retrieval observations near that location ("local retrievals") and also retrieval observations with errors correlated to those of the local retrievals. The analysis that results from assimilating these retrievals has comparable accuracy to the analysis obtained from assimilating radiances.

Finally, in Chapter 4 we consider the use of an ensemble Kalman filter to correct satellite radiances for their state-dependent biases relative to the observation operator in use. Our approach is to use state-space augmentation to estimate satellite biases during the ensemble data assimilation procedure. We illustrate our approach by applying it to LETKF to assimilate simulated biased AIRS brightness temperature observations on the SPEEDY model. The bias parameters estimated by LETKF successfully reduce the analysis error and observation bias.

1.5 Figures

[Figure 1.1 is a schematic of the analysis cycle: a background ($x^b$, error covariance $P^b$) and observations ($y$, error covariance $R$) enter the data assimilation scheme, which produces the analysis state ($x^a$, error covariance $P^a$); this serves as the initial condition both for the operational forecast and for the 6-hour forecast of the weather model, which becomes the next background.]

Figure 1.1: A schematic analysis cycle used in data assimilation schemes, based on Figure 1.4.2 in Kalnay (2002).

[Figure 1.2 consists of six world-map panels of observation locations: (a) RAOBS, (b) AIRCRAFT, (c) SAT WIND, (d) SAT TEMP, (e) SFC SHIP, (f) SFC LAND.]

Figure 1.2: The horizontal observational distribution within ±3 hours of a typical analysis step, based on Figure 1.4.1 in Kalnay (2002). Here, rawinsonde (RAOBS), aircraft, satellite-derived wind (SAT WIND), surface ship (SFC SHIP), and surface land (SFC LAND) are considered to be conventional observations. As their name suggests, the surface observations are only available to measure surface variables. SAT TEMP represents the typical horizontal distribution of satellite radiance observations.

Chapter 2

A Comparative Study of 4D-VAR and a 4D Ensemble Kalman Filter: Perfect Model Simulations with Lorenz-96

2.1 Introduction

Operational data assimilation schemes traditionally have assimilated available observations as though they, or their innovations (observation minus forecast, in observation space), occurred at the analysis time. With a growing number of observations from instruments such as satellites, many observations are available between analysis times. However, employing three-dimensional data assimilation schemes to match the frequency of these asynchronous observations would be prohibitively expensive and could introduce imbalances in the resulting analysis states. New generations of data assimilation schemes, most notably the 4D-VAR and Ensemble Kalman Filter (EnKF) techniques, are able to more accurately take into account the timing of asynchronous observations.

One approach to assimilating observations at various times is 4D-VAR (see Le Dimet and Talagrand, 1986; Courtier et al., 1994; Rabier et al., 1998, 2000).
This assimilation system is currently the operational data assimilation scheme at the European Centre for Medium-Range Weather Forecasts and Meteo-France, and is being developed for operation at centers including the Canadian Meteorological Centre and the Japan Meteorological Agency. "Strong constraint" 4D-VAR seeks a model trajectory that best fits the available observations during a specified time window before the analysis time.

Another approach to data assimilation is the EnKF technique (Evensen, 1994; Burgers et al., 1998; Houtekamer and Mitchell, 1998; Anderson, 2001; Bishop et al., 2001; Whitaker and Hamill, 2002; Ott et al., 2004; Zupanski, 2005). These schemes evolve an ensemble of model trajectories to estimate background uncertainty. Though not yet operational, experiments, such as those of Houtekamer et al. (2005), Szunyogh et al. (2005), and Whitaker et al. (2004), have shown the potential of EnKF for operational data assimilation. In these implementations of EnKF on operational models, observations are still assimilated as though they were taken at the analysis time. As Lorenc (2003) points out, when assimilating asynchronous observations with EnKF, one should use the time sequence of ensemble states between analysis times to account for model state correlations in time as well as space. In this way, Evensen and van Leeuwen (2000), Anderson (2001), and Hunt et al. (2004) extend EnKF to accurately assimilate asynchronous observations at the correct time. Their methods have been successfully applied to operational models by Houtekamer and Mitchell (2005), Szunyogh and Kostelich (Personal Communication, 2006), and Whitaker et al. (2007).

In this paper, we present 4D-LETKF, a simplified version of the four-dimensional Ensemble Kalman Filter in Hunt et al. (2004), and compare it to 4D-VAR in a perfect model scenario using the Lorenz-96 system (Lorenz, 1996). In Section 2.2, we describe our implementation of 4D-VAR, and then derive 4D-LETKF using a modification of the 4D-VAR cost function. In Section 2.3, we describe our experimental design and present our numerical results. We find that the two methods yield similar results when the time between analyses is short enough for 4D-LETKF and when the analysis time window is long enough for 4D-VAR, and we discuss our results further in Section 2.4.

2.2 Formulation

2.2.1 4D-VAR

Since we consider a perfect model scenario in this paper, we formulate here a strong constraint 4D-VAR scheme. It seeks an initial condition close to the background state (determined by the prior analysis) from which the resulting exact model trajectory remains closest to the observations over the analysis time window, $[t_0, t_n]$. More precisely, it minimizes the cost function

$$J(x(t_0)) = \frac{1}{2} (x(t_0) - x^b)^\top B^{-1} (x(t_0) - x^b) + \frac{1}{2} \sum_{l=0}^{n} (y^o_l - H_l(x(t_l)))^\top R_l^{-1} (y^o_l - H_l(x(t_l))), \qquad (2.1)$$

where $\top$ denotes the transpose and $x(t_0)$ is the $m$-dimensional model state at the start of the analysis window. The model state at each observation time is obtained by integrating the nonlinear model from $x(t_0)$. $x^b$ is the $m$-dimensional background forecast at the same time and $B$ is an $m \times m$ background error covariance matrix, which is typically constant, homogeneous, and isotropic. The observation state at time $t_l$ is given by the $s_l$-dimensional vector $y^o_l$, and $R_l$ is the associated $s_l \times s_l$ observation error covariance matrix, for $l = 0, \ldots, n$. The observation operator, $H_l$, maps the model state $x(t_l)$ to the observation space at time $t_l$. For this formulation, observations taken at different times are assumed to have uncorrelated errors.
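For concreteness, here is a minimal Python sketch (our illustration, not code from the study, which used a BFGS minimizer as described below) of evaluating the strong-constraint cost function (2.1) along an exact model trajectory. The function name and argument layout are assumptions of the sketch.

```python
import numpy as np

def fourdvar_cost(x0, xb, Binv, model_step, obs):
    """Evaluate the strong-constraint 4D-VAR cost function, Eq. (2.1).

    x0         : candidate initial state x(t_0)
    xb         : background state at t_0
    Binv       : inverse background error covariance B^{-1}
    model_step : function advancing the state from one observation time
                 to the next (the exact nonlinear model)
    obs        : list of (y_l, H_l, Rinv_l) triples for l = 0, ..., n, where
                 H_l is a function mapping a model state to observation space
    """
    dx = x0 - xb
    J = 0.5 * dx @ Binv @ dx                 # background term
    x = x0
    for l, (y, H, Rinv) in enumerate(obs):
        if l > 0:
            x = model_step(x)                # stay on the exact model trajectory
        d = y - H(x)                         # innovation at time t_l
        J += 0.5 * d @ Rinv @ d              # observation term
    return J
```

A minimizer repeatedly evaluates this cost and its gradient with respect to $x_0$; in practice the gradient is supplied by the adjoint model rather than by numerical differentiation.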
The cost $J$ is only a function of the initial model state, $x(t_0)$. Therefore, once we determine the $x(t_0)$ that minimizes the total cost $J$, the integrated state $x(t_n)$ is the 4D-VAR analysis at time $t_n$. For this study, we obtain the minimum using a BFGS algorithm adapted from Numerical Recipes in Fortran (Press et al., 1992, pg. 418). This algorithm requires the gradient of the cost function, which we compute using the adjoint technique presented in Lawson et al. (1995). For large systems, Hessian preconditioning is often employed to ensure that the minimization algorithm converges quickly on an accurate state. In this study, we judged preconditioning to be unnecessary because here the BFGS algorithm converges to the minimum state quickly (generally after about 20 iterations).

2.2.2 4D-LETKF

EnKF replaces the time-independent background error covariance matrix $B$ with a time-dependent sample covariance matrix $P^b = (k-1)^{-1} X^b (X^b)^\top$, where $X^b$ is an $m \times k$ matrix of background ensemble perturbations. That is,

$$X^b = \left[\, x^{b(1)} - \bar{x}^b \;\middle|\; x^{b(2)} - \bar{x}^b \;\middle|\; \cdots \;\middle|\; x^{b(k)} - \bar{x}^b \,\right], \qquad (2.2)$$

where $x^{b(i)}$ denotes the $i$th ensemble member and $\bar{x}^b$ denotes the ensemble mean. In 4D-LETKF, the deviation of a model solution $x(t)$ from the background mean state $\bar{x}^b(t)$ is approximated between analysis times by a linear combination of the background ensemble perturbations:

$$x(t) \approx \bar{x}^b(t) + X^b(t) w, \qquad (2.3)$$

where $w \in \mathbb{R}^k$ is a time-independent weight vector. As in Hunt et al. (2004), the analysis determines which weight vector makes this linear combination "best fit" the observations over the analysis time window, in the sense of minimizing a cost function like (2.1). The projection of $x(t)$ to the observation space at time $t_l$ is approximated by

$$H_l(x(t_l)) \approx H_l(\bar{x}^b(t_l) + X^b(t_l) w) \approx H_l(\bar{x}^b(t_l)) + Y^b_l w. \qquad (2.4)$$

The $i$th column vector of the $s_l \times k$ matrix $Y^b_l$ is defined to be $H_l(x^{b(i)}(t_l)) - \overline{H_l(x^{b(i)}(t_l))}$, where $\overline{H_l(x^{b(i)}(t_l))}$ represents the ensemble mean of the projection of the background state on observation space. Replacing $B$ by $P^b(t_0)$ and $x^b$ by $\bar{x}^b(t_0)$ and substituting approximations (2.3) and (2.4) into the cost function (2.1) yields the modified cost function

$$\tilde{J}(w) = \frac{1}{2} (k-1) w^\top w + \frac{1}{2} \sum_{l=1}^{n} \left[ y^o_l - H_l(\bar{x}^b(t_l)) - Y^b_l w \right]^\top R_l^{-1} \left[ y^o_l - H_l(\bar{x}^b(t_l)) - Y^b_l w \right]. \qquad (2.5)$$

Here, the $m$-dimensional minimization problem is reduced to a $k$-dimensional problem, reducing the cost of implementation if the ensemble size $k$ is less than the number of model variables $m$. The minimum of (2.5) occurs at

$$w^a = \tilde{P}^a \left( \sum_{l=1}^{n} Y^{b\top}_l R_l^{-1} \left( y^o_l - H_l(\bar{x}^b(t_l)) \right) \right), \qquad (2.6)$$

where $\tilde{P}^a$ is the $k \times k$ matrix given by

$$\tilde{P}^a = \left( (k-1) I + \sum_{l=1}^{n} Y^{b\top}_l R_l^{-1} Y^b_l \right)^{-1}. \qquad (2.7)$$

These equations correspond to the Kalman filter analysis mean and covariance equations; here the background mean and covariance for $w$ are $0$ and $(k-1)^{-1} I$, respectively. Now, the model state corresponding to (2.6) is the mean analysis state:

$$\bar{x}^a = \bar{x}^b + X^b w^a. \qquad (2.8)$$

The analysis ensemble is generated as follows:

$$x^{a(i)} = \bar{x}^a + X^b W^{(i)}, \qquad (2.9)$$

where $W^{(i)}$ is the $i$th column of the matrix $W = \left[ (k-1) \tilde{P}^a \right]^{1/2}$.
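The analysis step (2.6)–(2.9) is compact enough to sketch directly in NumPy. The following illustration is ours (not code from the study); the function name and the symmetric choice of the square root $W$ are assumptions made for the sketch.

```python
import numpy as np

def fourd_letkf_weights(Yb_list, d_list, Rinv_list, k):
    """Compute the 4D-LETKF weight vector and square root, Eqs. (2.6)-(2.9).

    Yb_list   : list of s_l x k matrices Y^b_l of background perturbations
                in observation space, one entry per observation time t_l
    d_list    : list of innovations y^o_l - H_l(mean background at t_l)
    Rinv_list : list of inverse observation error covariances R_l^{-1}
    """
    Pa_inv = (k - 1) * np.eye(k)
    rhs = np.zeros(k)
    for Yb, d, Rinv in zip(Yb_list, d_list, Rinv_list):
        Pa_inv += Yb.T @ Rinv @ Yb              # accumulate Eq. (2.7), inverted
        rhs += Yb.T @ Rinv @ d
    Pa_tilde = np.linalg.inv(Pa_inv)
    wa = Pa_tilde @ rhs                         # mean weight vector, Eq. (2.6)
    # symmetric square root of (k-1) * Pa_tilde for the ensemble update (2.9)
    evals, evecs = np.linalg.eigh((k - 1) * Pa_tilde)
    W = evecs @ np.diag(np.sqrt(np.maximum(evals, 0.0))) @ evecs.T
    return wa, W

# The i-th analysis member is xbar^b + X^b @ (wa + W[:, i]), per Eqs. (2.8)-(2.9).
```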
The forecasts from these analysis ensemble states are then used for the next analysis as the background ensemble states.

We remark that a cost function similar to (2.5), but without the linear approximation (2.4) to the observation operator, is used in the Maximum Likelihood Ensemble Filter (Zupanski, 2005). The analysis (2.6)–(2.9) is equivalent to the Ensemble Transform Kalman Filter (Bishop et al., 2001) with the Centered Spherical Simplex Ensemble (Wang et al., 2004). It is also equivalent, though formally less similar, to the analysis described in Hunt et al. (2004) and Ott et al. (2004). In this paper, the analysis is performed locally as in Ott et al. (2004), as described in the next section.

2.3 Results

We test both 4D-VAR and 4D-LETKF on Lorenz-96, a toy model with variable $x$ at $m$ equally spaced points around a circle of constant latitude. The $j$th component is propagated in time following the differential equation

$$\frac{dx_j}{dt} = \frac{1}{120} \left[ (x_{j+1} - x_{j-2}) x_{j-1} - x_j + F \right], \qquad (2.10)$$

where $j = 1, \ldots, m$ represents the spatial coordinate ("longitude"). Following Lorenz (1996), like the study of Ott et al. (2004), we choose the external forcing to be $F = 8$ and the number of spatial elements to be $m = 40$. We have inserted the factor $1/120$ so that, according to Lorenz's estimate, the time scale of the dynamics roughly matches that of a global weather model, with $t$ measured in hours. We use a fourth-order Runge-Kutta scheme for time integration of (2.10) with timestep $\Delta t = 1.5$ hours; a sketch of this integration appears below.
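As an illustration (ours, not the original experiment code), the model (2.10) and the fourth-order Runge-Kutta integration can be written in a few lines of NumPy; the variable names and the spin-up loop are assumptions of the sketch.

```python
import numpy as np

F, m, dt = 8.0, 40, 1.5                 # forcing, grid points, timestep (hours)

def lorenz96(x):
    """Right-hand side of Eq. (2.10), including the 1/120 time rescaling."""
    return ((np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F) / 120.0

def rk4_step(x, dt):
    """One fourth-order Runge-Kutta step for Eq. (2.10)."""
    k1 = lorenz96(x)
    k2 = lorenz96(x + 0.5 * dt * k1)
    k3 = lorenz96(x + 0.5 * dt * k2)
    k4 = lorenz96(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Generate a 'true' trajectory from an arbitrary perturbed initial condition.
x = F * np.ones(m)
x[0] += 0.01                            # break the symmetric equilibrium
for _ in range(4000):                   # spin up onto the attractor
    x = rk4_step(x, dt)
```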
We perform all simulations by assuming a perfect model scenario. That is, a long integration from an arbitrary initial condition is assumed to be the "true" state. We create the observations, $y^o$, by adding uncorrelated random noise with standard Gaussian distribution (mean 0, variance 1) to the true state. To simulate the asynchronous observations, we make 10 uniformly distributed observations at every timestep (1.5 hours). We rotate the observation locations so that for every 6-hour period, we make one observation available at each model grid point. To ensure consistency between the 4D-VAR and 4D-LETKF experiments, the assimilation experiments use the same truth and observations.

At each analysis time we compute the analysis error as the root-mean-square (RMS) difference between the true state and the analysis ensemble mean for 4D-LETKF (or simply the analysis state for 4D-VAR). Furthermore, we also compute the RMS difference between the mean forecast from the analysis ensemble for 4D-LETKF (or the forecast from 4D-VAR) and the true state at 6-hour intervals up to a 5-day forecast. We then average the analysis errors and the forecast errors over time by taking the RMS over $T/(1.5n)$ analysis cycles, where the analysis is done every $n$ timesteps ($1.5n$ hours) and $T$ is the total length of the simulation. In our simulations, we choose $T = 120{,}000$ hours (approximately 13.5 years). We vary the value of $n$ to examine how the analysis error depends on the time between analyses for 4D-LETKF and on the analysis time window for 4D-VAR.

For the 4D-VAR experiments in this perfect model scenario, we obtain the constant background error covariance matrix $B$ for each analysis window iteratively. We initially run 4D-VAR for $T/(1.5n)$ analysis cycles using an arbitrary background covariance matrix $B_0$ and compute the covariance $B_1$ of the differences between the true and analysis states at all of the analysis times. Next, we run 4D-VAR using $B_1$ as the background error covariance matrix and again compute the covariance $B_2$ of the differences between the truth and background. We repeat this process until the average analysis error does not change significantly. To ensure optimality, we then replace the error covariance matrix found by the iterative algorithm, $B$, with $\alpha B$ and tune $\alpha$ to empirically minimize the analysis RMS error. For all analysis windows in this study, $\alpha$ was found to be close to one. This covariance matrix is similar to that used for the constant error covariance for OI by Wang et al. (2006). In this scenario, we found that the analysis errors obtained from this method were similar to the analysis errors obtained when using a covariance matrix generated with the NMC method (Parrish and Derber, 1992).

For 4D-LETKF, we obtain the analysis ensemble at each grid point by computing the 4D-LETKF equations using only the observations within a local region. That is, for each grid point we make a separate computation of (2.6)–(2.9) using only the rows and columns of $Y_l$ and $R_l$ corresponding to the observations in its region. Following Ott et al. (2004), the local region for a grid point is centered at that grid point and contains a total of 13 grid points. We present results produced from an ensemble of 15 members. Similar to Ott et al. (2004), we found that additional ensemble members did not significantly benefit the analysis when using this localization. We apply multiplicative variance inflation, as in Whitaker and Hamill (2002), to compensate for the effects of model nonlinearity and limited ensemble size. Multiplicative inflation replaces the background covariance matrix $P^b$ with $(1+r)P^b$ for some $r > 0$; we do this in LETKF by replacing $(k-1)I$ with $(k-1)I/(1+r)$ in (2.7). We tune $r$ to minimize the analysis RMS error. For these studies, the optimal inflation factor $r$ increased with the length of the analysis window.

We show the average analysis error as a function of the analysis time window in Figure 2.1 for 4D-LETKF (solid) and 4D-VAR (dashed). For 4D-LETKF, which should not use the same observation in more than one analysis, the analysis window must correspond to the time between analyses, but for 4D-VAR the time between analyses can be chosen independently of the analysis window. We see that the average analysis error of 4D-LETKF grows with the time between analyses, and the average analysis error of 4D-VAR decreases with the length of the analysis window. The 4D-VAR scheme remains stable for a larger range of analysis windows than 4D-LETKF. The mean analysis RMS error of 4D-LETKF appears to saturate at approximately 0.23 for analysis windows between 0.25 and 1 days, while for 4D-VAR the mean analysis error appears to approach a similar value for analysis windows between 4 and 4.5 days.

For the model we consider here it is computationally feasible to use a large enough ensemble that no localization is necessary. In Figure 2.1 we also display results for 4D-LETKF (dot-dashed) with 50 ensemble members and no localization; that is, the analysis at each grid point uses all of the observations. In this case, the average analysis error is 5–10% better than for the smaller (15-member) ensemble with localization.

To further compare the analyses of 4D-VAR and 4D-LETKF, we compare their average forecast errors as a function of forecast time in Figure 2.2. The average forecast error at initial time 0 is the average analysis error.
For 4D-VAR we forecast from the 4-day window analysis (dashed), while for 4D-LETKF we run the forecast from the 1-day window analysis. For each method, the analysis window we use yields near optimal results (for that method) according to Figure 2.1. Because 4D-LETKF provides initial conditions for an ensemble forecast, we consider both the mean of an ensemble forecast (solid) and a forecast from the mean analysis state (dot-dashed) to compare with the 4D-VAR forecast. We observe that the 4D-VAR and 4D-LETKF forecasts from the analysis mean have comparable mean RMS errors. The advantage of the ensemble forecast becomes apparent after two and a half days. (We expect that an ensemble forecast initialized with appropriate perturbations from the 4D-VAR analysis would yield a similar advantage.)

2.4 Summary

In assimilating the asynchronous observations considered here, 4D-VAR and 4D-LETKF yield comparable average analysis and forecast errors when 4D-VAR is performed with a long enough analysis time window and when 4D-LETKF is performed sufficiently frequently. In an operational setting, the time between analyses (and thus the time window for 4D-LETKF) may not be adjustable, while the 4D-VAR analysis window can be chosen as large as computational constraints allow. For the scenario of this paper, our results indicate that if the desired time between analyses is 1 day or less, the mean analysis from 4D-LETKF with 15 ensemble members is of similar quality to the 4D-VAR analysis with a 4-day time window.

Our results for a 50-member ensemble, shown in Figure 2.1, indicate that the loss of accuracy in 4D-LETKF due to a limited ensemble size does not depend significantly on the time between analyses. Since this large ensemble allows for full rank covariances and a global analysis, we presume that the loss of accuracy as the time between analyses increases is due primarily to model nonlinearity in the following sense: the longer the time between analyses, the greater the background ensemble spread becomes during the analysis time window, and, therefore, the less accurate it is to approximate model trajectories as linear combinations of ensemble trajectories. While 4D-VAR is not strongly affected by nonlinearity, in practice model error will affect both methods. Its effect will increase the errors as the analysis time window grows.

The introduction of model error could further distinguish the two 4D data assimilation schemes. 4D-VAR generally accounts for model error by using the "weak constraint" formulation, which adds additional terms to the cost function. In an Ensemble Kalman Filter, additional amounts or different types of variance inflation are generally used to counteract model error. While this is the simplest approach to take in 4D-LETKF, it is also possible to minimize a modified cost function in the space spanned by the ensemble trajectories.

As implemented in this paper, both 4D-VAR and 4D-LETKF assimilate the asynchronous observations at comparable computational cost. However, the implementation of 4D-LETKF can be dramatically sped up by computing the analysis for each grid point in parallel (Szunyogh et al., 2005). Furthermore, like other Ensemble Kalman Filters, the 4D-LETKF data assimilation scheme is model independent and thus can easily be adapted to new and evolving models without the human cost involved in determining the adjoint of the model for 4D-VAR.
2.5 Figures

[Figure 2.1 plots average analysis RMS error against analysis window (days).]

Figure 2.1: The mean analysis error as a function of analysis window for 4D-VAR (dashed). The mean analysis error is shown as a function of time between analyses for 4D-LETKF with 15 ensemble members and localization (solid) and 50 ensemble members and no localization (dot-dashed). The 15 (50) ensemble member 4D-LETKF experiments used inflation factors of 10% (8%) for the 0.5-day, $n = 8$, analysis window, 23% (14%) for the 1-day, $n = 16$, analysis window, and 65% (24%) for the 1.75-day, $n = 28$, analysis window. The 4D-LETKF analysis diverged for all inflation factors below 100% for analysis windows longer than 1.75 days and 2 days in the 15 and 50 ensemble member experiments, respectively.

[Figure 2.2 plots average RMS error of the 5-day forecast against forecast time (days).]

Figure 2.2: Mean forecast errors as a function of forecast time for 4D-VAR with a 4-day analysis window (dashed) and for 4D-LETKF with a 1-day analysis window using an ensemble forecast (solid) and a forecast from the mean analysis state (dot-dashed). The mean analysis error of each scheme is at time 0.

Chapter 3

Assimilating Nonlocal Observations with a Local Ensemble Kalman Filter

3.1 Introduction

Localization is essential to many ensemble-based Kalman filter data assimilation schemes (Houtekamer and Mitchell, 1998; Keppenne, 1999; Anderson, 2001; Hamill et al., 2001; Houtekamer and Mitchell, 2001; Whitaker and Hamill, 2002; Ott et al., 2004). When implementing an ensemble-based scheme globally, spurious correlations in the background ensemble typically arise between distant atmospheric events. These correlations are due to the practical necessity of using a limited number of ensemble members. Local implementations of these ensemble-based schemes eliminate such spurious long-range correlations. Essentially, localization reduces the required ensemble size because it reduces the dimensionality of the estimated state (Szunyogh et al., 2005, 2007).

Local ensemble-based methods typically allow the analysis at a given grid point to depend only on observations within a surrounding local region. The observations in this local region are assumed to be strongly correlated with the model state at the analysis location. Assimilation experiments suggest that choosing a region containing only a subset of the horizontal model grid points and vertical model levels benefits the ensemble-based Kalman filter analysis (e.g., Hamill et al., 2001; Szunyogh et al., 2005; Houtekamer et al., 2005; Whitaker et al., 2007).

However, a challenge arises when assimilating satellite observations in this local framework. Many satellite channels provide radiance observations sensitive to the dynamics over a broad layer of the atmosphere (e.g., Liou, 2002). Consequently, the resulting observations are correlated to the atmospheric state over several model levels, and selecting which satellite observations to use to perform a given local analysis is not straightforward.

Houtekamer et al. (2005) and Houtekamer and Mitchell (2005) address this problem for Advanced Microwave Sounding Unit-A (AMSU-A) radiance observations by treating each observation as if it were a local observation made at the pressure level for which the corresponding satellite channel is most sensitive. With this vertical coordinate, they then assimilate those observations within the local region.
While Houtekamer and Mitchell (2005) demonstrated that with this selection strategy the AMSU-A observations have benefited the analysis, it is unknown to what extent the localization prevents the analysis from extracting information from these observations. In particular, an observation assigned to a particular model level may still depend strongly on the dynamics at more distant levels. Therefore, the local analysis at those distant levels may benefit from assimilating that observation. However, that observation may not be selected to update that level because its assigned model level is too distant.

Another potential approach is to assimilate retrievals instead of the radiance observations. We note, however, that satellite retrievals suffer from a complication similar to that described for radiances. Retrievals invert sets of nonlocal satellite observations (e.g., radiances) to obtain local observations of the analysis (model) variables. The errors of the resulting point observations become correlated over broad layers of the atmosphere. That is, some of the off-diagonal elements of the observational error covariance matrix become nonzero. Thus selecting only those retrievals located within the local region may neglect correlations between observations.

This study examines methods for assimilating observations sensitive to broad layers of the atmosphere with ensemble-based schemes. In particular, we advocate and test the following methodology. For remotely sensed observations, which are related to the dynamics at multiple levels of the atmosphere, we apply the observation operator globally before performing localization. Because the observation operator is applied to the global model state before any localization is performed, there is flexibility in choosing which observations are assimilated in each local analysis. For radiance observations, we can perform the analysis at a given grid point by assimilating any observations related to the dynamics in the associated local region. For retrieval observations, we incorporate a non-diagonal observation error covariance matrix in the assimilation and select those observations located in the local region and any other retrieval observations with errors correlated to those in the local region; a sketch of this selection rule appears below.
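The following few lines of NumPy are our illustration of that selection rule (not code from the study): given the indices of the retrievals inside a local region and the full observation error covariance matrix, they return the local retrievals together with every retrieval whose error correlation with a local retrieval exceeds a threshold. The function name and the threshold value are assumptions of the sketch.

```python
import numpy as np

def select_retrievals(local_idx, R, threshold=0.2):
    """Select local retrievals plus retrievals with correlated errors.

    local_idx : indices of retrievals located inside the local region
    R         : full (non-diagonal) observation error covariance matrix
    threshold : minimum absolute error correlation for inclusion
    """
    sd = np.sqrt(np.diag(R))
    corr = R / np.outer(sd, sd)                  # observation error correlations
    # a retrieval qualifies if it is strongly correlated with any local one
    correlated = np.abs(corr[local_idx, :]).max(axis=0) > threshold
    return np.union1d(local_idx, np.nonzero(correlated)[0])
```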
These results are obtained for simplified simulated satellite observations using a particular ensemble Kalman filter scheme. However, we expect the results to hold more generally when assimilating real satellite observations with any data assimilation scheme requiring localization.

3.2 Background

3.2.1 Ensemble Kalman Filter

Kalman filter based data assimilation schemes obtain a best guess (or "analysis") for the state of the atmosphere by combining a forecast started in the past (called the "first guess" or "background") and observations. The basic formulation is as follows (Kalman, 1960; Kalman and Bucy, 1961):

P^a = (I - KH) P^b, (3.1)
x^a = x^b + K(y - h(x^b)), (3.2)
K = P^b H^T (H P^b H^T + R)^{-1}. (3.3)

Here, x^a and x^b are column vectors containing the analysis and background states, respectively. These states have corresponding error covariance matrices P^a and P^b. The observations, y, are assumed to be given by y = h(x^t) + ε, where x^t denotes the unknown true state and ε represents the measurement error. The corresponding observation error covariance matrix is given by R = ⟨εε^T⟩. The angle brackets denote an ensemble average over realizations of the random noise vector ε, and the superscript T denotes the matrix transpose. Note that the observation operator h converts from model to observation space. This observation operator is linearized around the background mean to obtain a matrix H for use in the Kalman filter equations. If the observation operator and model equations are linear, and if ε is a zero mean Gaussian random vector, then the Kalman filter analysis is the best linear unbiased estimate of the system state and its error covariance (Kalman, 1960; Kalman and Bucy, 1961).

After each analysis, one must propagate the analysis state and its covariance forward in time to obtain the background state and covariance for the next analysis. Due to limitations of computational power, the size of the covariance matrices makes it essentially impossible to directly evolve these covariances in the context of operational data assimilation for numerical weather prediction. To address this problem, ensemble based Kalman filter schemes evolve the atmospheric model state starting from a set of k perturbed initial conditions, creating an ensemble of k forecasts. Each forecast is denoted by x^{b(i)}, where i = 1, 2, ..., k. The background state is given by the ensemble mean, \bar{x}^b = \overline{x^{b(i)}}, where the over-bar represents the ensemble mean. The covariance is now estimated using the sample covariance

P^b = (k - 1)^{-1} X^b (X^b)^T = \frac{1}{k-1} \sum_{i=1}^{k} (x^{b(i)} - \bar{x}^b)(x^{b(i)} - \bar{x}^b)^T, (3.4)

where the ith column of the matrix X^b is given by x^{b(i)} - \bar{x}^b. In contrast to some other approaches, this formulation does not require linearization of the model. Furthermore, P^b has rank at most k - 1. Thus, if k is chosen small enough, the number of degrees of freedom in the Kalman filter equations can be greatly reduced. However, while this truncation reduces the size of the matrix, it can also introduce severe sampling fluctuations for realistically affordable ensemble sizes k. As discussed in Section 3.2.2, this latter problem can be efficiently addressed by "localization".
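To make Equation (3.4) concrete, the following minimal NumPy sketch (our own illustration, not part of the original text; the array layout and function name are assumptions) forms the ensemble mean, the perturbation matrix X^b, and the sample covariance P^b from an ensemble stored column-wise:

import numpy as np

def background_stats(ensemble):
    """Ensemble mean, perturbation matrix X^b, and sample covariance P^b.

    ensemble: (n, k) array whose i-th column is the background state x^{b(i)}.
    Implements Equation (3.4): P^b = (k - 1)^{-1} X^b (X^b)^T.
    """
    n, k = ensemble.shape
    xb_mean = ensemble.mean(axis=1)        # ensemble mean \bar{x}^b
    Xb = ensemble - xb_mean[:, None]       # i-th column: x^{b(i)} - \bar{x}^b
    Pb = Xb @ Xb.T / (k - 1)               # sample covariance
    return xb_mean, Xb, Pb

# Tiny usage example: a random 5-variable, 4-member ensemble.
rng = np.random.default_rng(0)
xb_mean, Xb, Pb = background_stats(rng.standard_normal((5, 4)))
assert np.linalg.matrix_rank(Pb) <= 3      # P^b has rank at most k - 1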
Equation (3.4) is substituted into the Kalman filter equations (3.1)-(3.3) to give:

P^a = (k - 1)^{-1} (X^b - K Y^b)(X^b)^T, (3.5)
\bar{x}^a = \bar{x}^b + K(y - \overline{h(x^b)}), (3.6)
K = (k - 1)^{-1} X^b (Y^b)^T [(k - 1)^{-1} Y^b (Y^b)^T + R]^{-1}, (3.7)

where Y^b is the matrix of ensemble perturbations in observation space. That is, Y^b takes the place of H X^b by letting the ith column of Y^b be

Y^{b(i)} = h(x^{b(i)}) - \overline{h(x^b)}, (3.8)

where \overline{h(x^b)} represents the ensemble mean of the background ensemble in observation space. This substitution avoids the cost involved in finding the linearized observation operator required by H X^b.

Equations (3.5)-(3.7) form the basis for many ensemble-based Kalman filter algorithms (Evensen, 1994; Burgers et al., 1998; Houtekamer and Mitchell, 1998; Anderson, 2001; Bishop et al., 2001; Whitaker and Hamill, 2002; Ott et al., 2004). However, as posed here these equations are costly to solve because the matrix inverted in Equation (3.7) has as many dimensions as the number of observations, which is typically as large as O(10^6). Several methods (i.e., Houtekamer and Mitchell, 2001; Anderson, 2003; Whitaker and Hamill, 2002) reduce the cost of an ensemble-based scheme by assimilating observations sequentially, so that each evaluation of the right hand side of Equation (3.7) involves only the inverse of a scalar. Alternately, several schemes are made efficient by choosing a basis in which they need only invert a matrix of size roughly k × k (i.e., Bishop et al., 2001; Ott et al., 2004; Hunt et al., 2007). In this study, we use an efficient modification of the scheme of Ott et al. (2004), formulated in Hunt et al. (2007), which has been called the Local Ensemble Transform Kalman Filter (LETKF). Following the notation of the latter paper, Equations (3.5)-(3.7) become

\tilde{P}^a = [(k - 1) I + (Y^b)^T R^{-1} Y^b]^{-1}, (3.9)
P^a = X^b \tilde{P}^a (X^b)^T, (3.10)
\bar{x}^a = \bar{x}^b + X^b \tilde{P}^a (Y^b)^T R^{-1} (y - \overline{h(x^b)}). (3.11)

Here, \tilde{P}^a is the representation of P^a in the space spanned by the ensemble perturbations.

All ensemble-based Kalman filter methods must also obtain an analysis ensemble. The LETKF algorithm obtains this analysis for each ensemble member by letting

x^{a(i)} = \bar{x}^a + X^{a(i)}, (3.12)

where X^{a(i)} is the ith column of the matrix

X^a = X^b ((k - 1) \tilde{P}^a)^{1/2}, (3.13)

where by M^{1/2} we mean the symmetric square root of a symmetric matrix M. Notice that by Equation (3.10),

P^a = (k - 1)^{-1} X^a (X^a)^T. (3.14)
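The LETKF update of Equations (3.9)-(3.13) is compact enough to sketch directly. The following NumPy fragment (a minimal sketch under our own conventions, not the operational implementation; it assumes R is invertible and omits localization and covariance inflation) computes the analysis mean and analysis ensemble from the background quantities:

import numpy as np

def letkf_update(Xb, xb_mean, Yb, yb_mean, y, R):
    """One LETKF analysis step, Equations (3.9)-(3.13).

    Xb: (n, k) background perturbations; xb_mean: (n,) background mean.
    Yb: (m, k) perturbations in observation space; yb_mean: (m,) mean of h(x^{b(i)}).
    y: (m,) observations; R: (m, m) observation error covariance.
    In the full scheme this is solved independently for each local region,
    using only the observations selected for that region.
    """
    k = Xb.shape[1]
    Rinv_Yb = np.linalg.solve(R, Yb)
    # Equation (3.9): \tilde{P}^a = [(k-1)I + (Y^b)^T R^{-1} Y^b]^{-1}
    Pa_tilde = np.linalg.inv((k - 1) * np.eye(k) + Yb.T @ Rinv_Yb)
    # Equation (3.11): analysis mean
    xa_mean = xb_mean + Xb @ (Pa_tilde @ (Rinv_Yb.T @ (y - yb_mean)))
    # Equation (3.13): symmetric square root of (k-1)\tilde{P}^a via eigendecomposition
    evals, evecs = np.linalg.eigh((k - 1) * Pa_tilde)
    Xa = Xb @ (evecs @ np.diag(np.sqrt(evals)) @ evecs.T)
    # Equation (3.12): analysis ensemble members
    return xa_mean, xa_mean[:, None] + Xa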
3.2.2 Localization

In a local region, the ensemble size may be sufficient to represent a large portion of the state uncertainty (Szunyogh et al., 2005). Therefore, ensemble-based Kalman filter schemes typically apply some form of localization, whose effect is to make the analysis at a given location depend only on the observations within an appropriate region in model space. A feature differentiating the ensemble-based data assimilation schemes proposed in the literature is their method of localization.

Many ensemble-based Kalman filter techniques assimilate observations sequentially and perform localization by limiting the influence of an observation on the analysis to a surrounding local region (Anderson, 2001; Hamill et al., 2001; Houtekamer and Mitchell, 2001; Whitaker and Hamill, 2002). In practice, this is accomplished by tapering the entries in the background error covariance matrix to zero outside the local region of influence. We refer to this localization as covariance localization and describe its implementation in the Appendix (Section 3.6). The sequential assimilation of observations neglects correlations between observational errors which, if present, are represented by nonzero off-diagonal entries in the error covariance matrix R. If these off-diagonal entries are significant, they can be taken into account by simultaneously assimilating observations in correlated batches (Houtekamer and Mitchell, 2001). Alternatively, R can be diagonalized by a coordinate transformation in observation space, and the transformed observations can be assimilated sequentially (Parrish and Cohn, 1985; Anderson, 2003). As we discuss in the Appendix (Section 3.7), diagonalizing R can be problematic when observation error correlations are nonlocal, because the transformed observations are not associated with a specific location and thus their influence is hard to localize.

Alternately, Ott et al. (2004) employ localization in model grid space. In this scheme, a local region is defined for each model grid point. The local region is centered at that grid point and can, for example, be a box with length 2l + 1 model grid points in both horizontal coordinates and depth 2v + 1 model levels. The parameters l and v should be tuned in a given application to optimize performance. The model state at the center grid point is updated using all the observations within the local box. With this model grid localization, the analysis at each grid point is independent, and observations are assimilated simultaneously. These properties enable parallel implementations of grid point localization to be realized naturally (Szunyogh et al., 2005, 2007). Because of this computational advantage, we use grid point localization in this paper's numerical experiments. However, we expect the results obtained to hold regardless of which localization method is employed.

3.2.3 Assimilating Nonlocal Observations

In this section, we discuss options for selecting which nonlocal observations to use in the local analysis for a given model grid point. Because we may select observations that depend on model variables outside the local region defined for uncorrelated point observations, we apply the observation operator to the global model state for each ensemble member. We include only those rows of Y^b and \overline{h(x^b)} corresponding to the selected observations, truncate the observation error covariance matrix accordingly, and solve the resulting ensemble-based Equations (3.9)-(3.11) exactly.

Selecting which satellite radiances to assimilate is complicated by the fact that they do not have a single well-defined vertical location. We note that the value of the weighting function at a particular model grid point indicates the sensitivity of the observation to the state at that grid point (Liou, 2002). Houtekamer et al. (2005) and Houtekamer and Mitchell (2005) choose to assign radiance observations to the model level at which the magnitude of the weighting function is largest. They use this location to select observations within the local region in model space. Throughout this paper, we refer to this method for selecting radiance observations as "maximum-based selection". We represent this scheme graphically in Figure 3.1(a).
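For concreteness, maximum-based selection can be sketched as follows (our own illustration; the array layout and names are assumptions). Each channel is assigned the model level at which the magnitude of its weighting function is largest, and the channel is selected for a local analysis only if that level lies in the local region:

import numpy as np

def maximum_based_selection(W, local_levels):
    """Channels chosen by maximum-based selection.

    W: (n_channels, n_levels) weighting-function values for each channel.
    local_levels: iterable of model-level indices inside the local region.
    """
    assigned = np.argmax(np.abs(W), axis=1)      # nominal level per channel
    local = set(local_levels)
    return [c for c, lev in enumerate(assigned) if lev in local]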
An alternative to maximum-based selection is to examine the weighting function of each nonlocal observation and assimilate the observation if a "significant" weight is assigned to any model state vector component within the local region; otherwise, the observation is not used in the assimilation. We refer to this observation selection as "cutoff-based selection" and represent it graphically in Figure 3.1(b). Maximum-based selection may be regarded as a particular case of cutoff-based selection in which the only significant weight is the largest weight. The Appendix (Section 3.6) applies cutoff-based selection to covariance localization schemes. In Section 3.3, we run a series of numerical experiments to compare the maximum-based and cutoff-based observation selection schemes. We note that we avoid the additional complication of the nonlinearity of the satellite observation operator in this study by letting the observation operator be a linear weighting function.

As mentioned earlier, satellite retrievals are nonlocal in that their errors can be strongly correlated across significant distances (e.g., Kalnay, 2002). These correlations are manifested in nonzero values of the off-diagonal terms of the observation error covariance matrix, R. The magnitude of the correlation, R_{ij} / \sqrt{R_{ii} R_{jj}}, between the errors of two observations, i and j, can be used to select which correlated observations to assimilate. That is, we assimilate all the retrieval observations located within the local region, together with all other retrieval observations for which the magnitude of the correlation of their errors with the errors of any observation in the local region is above a certain threshold. We retain the entries in the observation error covariance matrix corresponding to the selected retrievals and use the resulting error covariance matrix as R in the analysis (Equations (3.9)-(3.11)).

In the numerical experiments in Section 3.4, we tune the threshold value used to select retrieval observations. For comparison, we also assimilate only observations inside the local region, neglecting the correlations with more distant observations. In this case, we scale the entries of the observation error covariance by a constant value and tune that constant to obtain the analysis with the smallest error.
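The two selection rules of this section can be sketched side by side. The following fragment (again our own illustration, with assumed names and array layouts) implements cutoff-based selection for radiance-like channels and the correlation-threshold selection for retrievals, returning for the retrievals the correspondingly truncated error covariance:

import numpy as np

def cutoff_based_selection(W, local_levels, cutoff):
    """Channels whose weighting function exceeds `cutoff` inside the local region."""
    local = np.asarray(local_levels)
    return [c for c in range(W.shape[0]) if np.any(np.abs(W[c, local]) >= cutoff)]

def correlated_retrieval_selection(R, local_obs, threshold):
    """Retrievals in the local region plus those with strongly correlated errors.

    R: (m, m) observation error covariance; local_obs: indices inside the region.
    Returns the selected indices and the truncated covariance to use in the analysis.
    """
    d = np.sqrt(np.diag(R))
    corr = R / np.outer(d, d)                    # correlations R_ij / sqrt(R_ii R_jj)
    local = np.asarray(local_obs)
    selected = set(local_obs)
    for j in range(R.shape[0]):
        if np.any(np.abs(corr[local, j]) >= threshold):
            selected.add(j)
    idx = sorted(selected)
    return idx, R[np.ix_(idx, idx)]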
3.2.4 SPEEDY Model

The primitive-equations-based SPEEDY (short for Simplified Parameterizations, primitivE-Equation DYnamics) model is a global atmospheric circulation model (Molteni, 2003). The model inputs are zonal wind, u, meridional wind, v, absolute temperature, T, specific humidity, q, and surface pressure, p_s, at a resolution of 96 zonal grid points, 48 meridional grid points, and seven sigma levels (0.950, 0.835, 0.685, 0.510, 0.340, 0.200, 0.080). To solve the equations, the input variables are converted to the vertical component of vorticity, the horizontal divergence, T, q, and log(p_s) at seven sigma levels. The primitive equations are formulated in terms of these variables and integrated in spectral space. Triangular truncation is performed at wavenumber 30 (T30). The resulting forecasts are converted back into the model input variables.

As detailed in Molteni (2003), the SPEEDY model uses the 1981-90 reanalysis from the European Centre for Medium-Range Weather Forecasts (ECMWF) to obtain its boundary conditions at the bottom of the atmosphere for fields including topographical height, land-sea mask, sea ice fraction, sea surface temperature (SST), snow depth, bare-surface albedo, surface temperature and moisture in the top soil layer, and land surface covered by vegetation. The SPEEDY model utilizes simplifications of the basic physical parameterization schemes present in many general circulation models (GCMs) for processes such as convection, large-scale condensation, clouds, short-wave radiation, long-wave radiation, surface fluxes, and vertical diffusion. We employ the SPEEDY model for the numerical experiments in this study because it retains many of the characteristics of operational GCMs, but at orders of magnitude lower computational cost.

3.3 Experimental Design

We explore the methods proposed in Section 3.2.3 for selecting nonlocal observations by carrying out perfect model experiments with the SPEEDY model. The known "true" state of the atmosphere is a 180 time step (roughly one-and-a-half-month) integration of the SPEEDY model. Following Miyoshi (2005), the initial condition for the truth results from a year-long run of the SPEEDY model from its default rest state on January 01, 1981. For consistency, we assimilate the same observations of this truth in all of the numerical experiments. To determine the quality of the assimilations, we convert the analysis and true states to pressure coordinates, at the pressure levels (925, 850, 700, 500, 300, 200, 100 hPa). We then compute the global root mean square (RMS) error between the true and analysis states at each pressure level. The average RMS error refers to the average of the RMS error over the last month of the assimilation cycle.

Every six hours, we create simulated local and nonlocal remotely sensed observations. We obtain local "point" observations by adding zero mean Gaussian-distributed noise to the modified truth. The noise has standard deviation 1 m/s for u and v, 1 K for T, 0.0001 kg/kg for q, and 1 hPa for p_s. Following Miyoshi (2005), we simulate a rawinsonde network by retaining only those point observations which are closest to the real rawinsonde locations at the 12Z synoptic time. For example, the corresponding observation distribution for p_s is shown in Figure 3.2.

We obtain simulated remotely sensed radiances by computing weighted averages of T. For a given horizontal location (i,j), the radiance-like observations are of the form y(i,j) = H x^t(i,j) + ε. The vector x^t(i,j) contains the "true" temperature at all levels at the horizontal grid point (i,j). The noise, ε, is a zero mean Gaussian-distributed random vector with standard deviation 2 K. We choose a larger standard deviation for satellite observations than for gridpoint observations to simulate the relative difficulties associated with assimilating real satellite observations. The observation resulting from applying a single row of H to the model state x^t(i,j) represents a single satellite "channel". That is, for channel n, the simulated radiance observation is

y(i,j,n) = \sum_{m=1}^{7} H_{nm} x^t(i,j,m) + ε, (3.15)

where x^t(i,j,m) is the "true" temperature at the horizontal location (i,j) and model level m, and y(i,j,n) is the corresponding radiance observation for channel n.
For this study, we have as many radiance observations as model levels (i.e., n runs from 1 to 7). We take H_{nm} = 2^{-|n-m|-1} if |n-m| ≤ 3, and 0 otherwise, so that

H =
[ 1/2   1/4   1/8   1/16  0     0     0
  1/4   1/2   1/4   1/8   1/16  0     0
  1/8   1/4   1/2   1/4   1/8   1/16  0
  1/16  1/8   1/4   1/2   1/4   1/8   1/16
  0     1/16  1/8   1/4   1/2   1/4   1/8
  0     0     1/16  1/8   1/4   1/2   1/4
  0     0     0     1/16  1/8   1/4   1/2 ]. (3.16)

We retain the simulated radiances at every horizontal model grid point (i,j) to simulate the relative density of satellite observations.

In this preliminary study, we wish to explore the methods proposed in Section 3.2.3 in the case of retrieval observations with known non-diagonal error covariance matrices. Therefore, we consider idealized retrievals for which we can quantify the full observation error covariance matrix, without the complications that typically arise from horizontal correlations in retrievals or correlations between the retrieval and background errors. Because our observation operator H is invertible, we do not need to use any background information in our retrieval scheme, so the resulting retrievals can be assimilated correctly with an appropriate non-diagonal error covariance matrix (Joiner and da Silva, 1998). Accordingly, we create idealized retrieval observations y_{ret}(i,j) from our simulated radiance observations y(i,j) by solving at each horizontal grid point the equation y(i,j) = H y_{ret}(i,j). The observation error covariance corresponding to the resulting retrievals, R_r, is thus R_r = H^{-1} R (H^{-1})^T (e.g., Joiner and da Silva, 1998). Here, R is the observation error covariance matrix of the radiance observations, which in this case is four times the identity. The resulting observation error covariance matrix for the retrievals at a fixed horizontal point is shown in Table 3.1. Retrieval observation errors at two different horizontal grid points are uncorrelated. However, those at the same horizontal location but at different model levels are correlated. The magnitude of the correlation between retrieval observation errors at a single horizontal point is shown in Table 3.2.

The significant anti-correlation between errors at adjacent levels can be reduced by blending the observational information with a background state based on the climatology of the model, similarly to operational practice. However, as we discussed above, this introduces other error correlations that are harder to account for in data assimilation. Indeed, we find that "regularizing" the retrievals in this manner increases the analysis errors, despite the reduced vertical correlations. We discuss these results further in Section 3.5.

The ensemble-based Kalman filter experiments on the SPEEDY model adapt the LETKF code of Miyoshi (2005) and Harlim and Hunt (2007) to assimilate nonlocal observations. Based on Miyoshi (2005), all the experiments presented use 20 ensemble members and a local region containing 7 meridional and 7 zonal grid points. We determine the optimal vertical depth of the local regions from assimilation experiments using the local "point" observations alone. In these studies, we fix the depth of the local region and assimilate all point observations within that region. We then use the local region resulting in the smallest analysis error as the basis for applying the observation selection schemes when assimilating the simulated nonlocal radiance and retrieval observations.
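As a check on this construction, the following sketch (ours; the random temperature profile is purely illustrative) builds the H of Equation (3.16), simulates noisy radiances for one column, inverts them to retrievals, and forms the retrieval error covariance R_r = H^{-1} R (H^{-1})^T:

import numpy as np

# Build the 7x7 weighting matrix of Equation (3.16): H_nm = 2^(-|n-m|-1) for |n-m| <= 3.
n_levels = 7
idx = np.arange(n_levels)
dist = np.abs(idx[:, None] - idx[None, :])
H = np.where(dist <= 3, 2.0 ** (-dist - 1.0), 0.0)

# Simulate radiances from a "true" temperature column and invert them to retrievals.
rng = np.random.default_rng(1)
x_true = 250.0 + 10.0 * rng.standard_normal(n_levels)   # illustrative profile (K)
y = H @ x_true + 2.0 * rng.standard_normal(n_levels)    # channel noise std = 2 K
y_ret = np.linalg.solve(H, y)                           # retrievals: solve y = H y_ret

# Retrieval error covariance R_r = H^{-1} R (H^{-1})^T, with R = 4I for the radiances.
R = 4.0 * np.eye(n_levels)
Hinv = np.linalg.inv(H)
R_r = Hinv @ R @ Hinv.T
print(np.round(R_r, 1))   # compare with Table 3.1 (e.g., the (1,1) entry is about 36.7)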
3.4 Results

3.4.1 Grid Point Observations

We first assimilate the local "rawinsonde" observations to determine the optimal depth of the local regions. We consider a variety of depths for the local regions and assimilate all the local observations within these regions. In Figure 3.3, we compare the analysis RMS error of zonal wind (panel (a)) and temperature (panel (b)) for local regions one level deep (solid) with those for a scheme which uses local regions one level deep at the surface and top of the model and local regions three levels deep elsewhere (dashed). The analysis RMS error is smallest when the local regions are only one level deep. Using local regions containing more than three levels further degraded the analysis (results not shown). For the ensemble size of 20 and horizontal localization of 7 × 7 grid points, we find that these results hold over all analysis times and for all other model variables (results not shown).

3.4.2 Radiance Observations

Figure 3.4 shows the time averaged RMS error as a function of vertical level for a mixed observing network of rawinsonde and radiance observations and for an observing network of rawinsonde observations alone (thick solid). We compare the cutoff-based and maximum-based selection schemes for the radiance observations. Specifically, in the cutoff-based selection scheme, we assimilate those channels of the radiance observations for which the observation operator has a nonzero weight in the local region (thin solid line), a weight of at least 0.125 (dash-dotted), and a weight of at least 0.25 (dashed). In the maximum-based selection scheme, we assimilate those channels whose maximal weight, 0.5, lies within the local region (dotted). Regardless of the observation selection scheme used to assimilate the radiances, adding them to the observing network reduces the analysis RMS error for both the zonal wind and temperature.

The RMS error for the temperature, the analysis variable most directly related to the radiances, does not vary dramatically with the number of observations selected for assimilation. However, the RMS error of the zonal wind is larger for the maximum-based selection scheme than for the cutoff-based selection schemes. For the cutoff-based selection scheme, the zonal wind RMS error generally decreases when additional radiances with smaller magnitudes of the weighting function in the local region are added to the assimilated data set. However, there is a slight disadvantage to assimilating radiance observations from all channels with nonzero weight in the local region. The analysis RMS error of the remaining variables behaves similarly to that of the zonal wind (results not shown). Furthermore, it is possible that the maximum-based observation selection scheme would perform better if it were assimilating more satellite radiances. However, we find that extending the depth of the local regions and applying maximum-based selection to the extended local region degrades the analysis (results not shown).

3.4.3 Retrieval Observations

We also update 1-level deep local regions by assimilating simulated rawinsonde observations and retrievals. In this case, we vary the number of retrieval observations assimilated in any local region and truncate the observation error covariance matrix, R, to reflect the choice of observations.
Specifically, we assimilate retrieval observations only within the local region (dotted), and additionally those observations whose errors are correlated to errors of observations in the local region with a correlation coefficient of magnitude at least 0.25 (dashed), 0.15 (dot-dashed), and 0.05 (thin solid line). The resulting average analysis RMS error at each level is shown in Figure 3.5. The thick solid line shows the RMS error from assimilating the rawinsondes alone. Assimilating those retrievals located in the local region has a positive impact on the analysis. However, we find that increasing the number of retrieval observations assimilated and incorporating their error covariance matrix further improves the analysis for temperature, zonal wind, and the other model variables.

For these simulated retrievals, Table 3.2 shows that the correlation coefficient between neighboring observation errors is roughly -0.6, indicating that they are strongly anti-correlated. By assimilating observations only within the local region, we neglect these strong (anti-)correlations between observation errors and, therefore, do not obtain an optimal analysis. On the other hand, assimilating retrievals whose errors are correlated and incorporating a non-diagonal error covariance matrix can account for these strong (anti-)correlations accurately.

When assimilating real retrieval observations, the elements of the matrix R are unknown. Often, R is assumed to be a diagonal matrix whose entries are tuned to optimize the analysis. Because our local region is only one level deep, when we assimilate only those retrievals located strictly within the local region, we in fact use only the diagonal entries of R. Therefore, we tried scaling these diagonal entries by factors both larger and smaller than one, but found that the analysis was not significantly improved (results not shown).

On the other hand, when we assimilate correlated retrievals from outside the local region, the results in Figure 3.5 do use nonzero off-diagonal error covariances. We repeated these experiments setting the off-diagonal entries in R to zero, and found that the analysis errors became larger but were still significantly smaller than the errors obtained using only retrievals in the local region (results not shown). Thus, even when the correlations between retrieval errors cannot be quantified, including retrievals whose errors are correlated to those in the local region may improve the analysis.

In our experiments, the observation operator and the corresponding error covariance are known perfectly for both the simulated radiances and retrievals. Therefore, the simulated radiances and retrievals contain the same information and should provide similar analyses when the same amount of that information is used. To verify whether this is the case for our data assimilation system, we compare the analysis RMS error for the best results obtained from assimilating the simulated rawinsonde and radiance observations (using cutoff-based selection to assimilate those radiances with a weight of at least 0.125 in the local region, solid line) with those obtained from assimilating the simulated rawinsonde and retrieval observations (selecting observations in the local region and those retrieval observations with a correlation coefficient of magnitude at least 0.15, dashed line). Figure 3.6 shows the RMS error as a function of vertical level. In both fields, the RMS error is similar at all levels.
Similar results hold for the other analysis variables (results not shown). The comparable accuracy of the analyses from assimilating radiances and retrievals suggests that in each case we are using the observation information consistently when we extend the local region for the satellite observations and incorporate the observation error covariance matrix for the retrievals.

3.5 Conclusions

In this paper, we compare different vertical localization approaches when using an ensemble Kalman filter to assimilate a combination of simulated rawinsonde and satellite observations for the SPEEDY model. In Section 3.4 we compare the RMS analysis error for each approach. We have also computed, but do not include here, forecast errors, which yield a similar comparison.

Using a 20 member ensemble, we find that the analysis error associated with assimilating simulated rawinsonde observations is smallest when using 1-level deep local regions. The dramatic increase in the analysis error with increasing depth of the local regions reflects the coarse vertical resolution of the seven-level SPEEDY model. Increasing the depth of the local regions here by even one grid point increases their extent by hundreds of hPa. In more realistic weather models, increasing the depth of the local regions does not cause the local regions to span such a large extent of the atmosphere. Therefore, for more realistic weather models, slightly increasing the depth of the local regions has only a slight impact on the quality of the analysis. However, dramatically increasing the depth of these local regions degrades the analysis in perfect model experiments with the NCEP GFS model (Szunyogh et al., 2005). Houtekamer et al. (2005), Whitaker et al. (2007), and Szunyogh et al. (2007) also find that significant vertical localization is necessary to obtain the best analysis from rawinsonde observations.

The numerical experiments described in Section 3.4 suggest that when using vertical localization, a properly chosen cutoff-based selection of which radiance-like observations to assimilate is better than maximum-based selection. With cutoff-based selection, we update the model state at a given location by assimilating radiances for which the value of the weighting function is "sufficiently" large in the local region from which rawinsonde observations are assimilated, rather than using only the radiances for which the maximum of the weighting function is in the local region, as in maximum-based selection (see Section 3.2.3). In our experiments with simulated rawinsonde and radiance observations, we obtain the smallest analysis error using cutoff-based selection and a 1-level deep local region. We find that maximum-based selection results in greater analysis error, whether we retain the 1-level deep local region or enlarge it. The advantage of cutoff-based selection is most significant for analysis variables other than the temperature. These variables depend on the observed radiances only through their correlations with the temperature.

We infer that for purposes of vertical localization, radiance-like observations should be considered wherever they depend significantly on the model state rather than only where they depend maximally on the model state. The numerical experiments here represent a simplified case, in which the observation operator is a known linear weighting function. Radiance observation operators are typically complicated nonlinear functions, of which the weighting function is only a component.
Nonetheless, the weighting function can still be extracted, and it indicates the sensitivity of the observation to the model variables at a given location. One can therefore use cutoff-based selection to choose appropriate radiances for a given local region when using real data.

We note that maximum-based selection is equivalent to cutoff-based selection with a large cutoff value. Therefore, we expect the analyses from the two schemes to differ most when the cutoff value is small enough, and the weighting function broad enough, to incorporate significantly more observations in the analysis than maximum-based selection would (Figure 3.1). Accordingly, we expect cutoff-based selection to benefit the analysis for radiances with broad weighting functions. In particular, we expect the benefit of using cutoff-based selection to be greatest when the vertical distance over which the weighting function exceeds the cutoff is comparable to or larger than the vertical localization distance. On the other hand, if the weighting function is sharply peaked, both the maximum-based and cutoff-based selection schemes will select similar observations for assimilation and, therefore, provide analyses of comparable quality, regardless of the depth of the local regions.

Our numerical experiments suggest that one should also assimilate retrievals from a region that is larger than the local region used for the assimilation of point observations. We find that the number of observations selected for assimilation influences the analysis of the model variables more strongly for our simulated retrievals than for our simulated radiances. Using a non-diagonal error covariance matrix and assimilating observations with errors strongly correlated to those in the local region improves the analysis in all variables. The analyses obtained by assimilating the retrieval observations in this way have average RMS errors similar to those obtained from the assimilation of the radiances at the optimal selection of the analysis parameters.

As we discussed in Section 3.3, the simple retrieval algorithm used in this study exaggerates the correlations between the errors of the retrieval observations over what they would be with a typical retrieval algorithm. Nonetheless, the errors of the resulting retrievals would still have significant correlations. Accordingly, we expect that the analysis will benefit from incorporating a non-diagonal observation error covariance matrix and including more retrievals than those in the local region for point observations, although the benefit may be less significant than for the simplified retrievals used in the numerical experiments here.

A further complication with retrieval observations arises because the correlations between their errors are often unknown. In practice, they are generally assumed to be uncorrelated, and the diagonal entries of the observation error covariance matrix R are tuned to provide the optimal analysis. However, for the simplified retrieval observations used here, we observe that simply scaling the diagonal entries of R only slightly improves the analysis in some of the model variables. We gain the most significant advantage from assimilating those observations with strongly correlated errors simultaneously, using a non-diagonal R.
Therefore, when retrieval error correlations are unknown, including additional retrieval observations suspected of having errors correlated to those in the local region, and incorporating these correlations in the estimate of the observation error covariance matrix, may still improve the analysis. Similar techniques could be applied to account for known horizontal correlations in retrieval errors.

As we discussed in Section 3.3, many retrieval algorithms introduce correlations between the retrieval observation errors and errors in the background state used in the data assimilation. Therefore, directly assimilating these retrievals may not provide an optimal analysis, as we observed for simulated regularized retrievals. Several techniques for reducing the unwanted observation-background error correlations are described in the literature (e.g., Joiner and da Silva, 1998; Rodgers, 2000), either by modifying the retrieval observations themselves or by introducing a nonlocal observation operator. In the latter case, a cutoff-based selection similar to the one proposed for radiance observations should be applied. Future studies should consider these techniques for more realistic retrieval observations than the idealized retrievals used here.

In summary, spatial localization is essential to many ensemble-based data assimilation schemes. Typically, the local regions used for ensemble-based Kalman filter schemes on operational models contain more levels than those used for the SPEEDY model. However, these regions are still small enough that relevant satellite observations can depend on model variables outside the local region. The numerical experiments in Section 3.4 and the discussion of Section 3.2 suggest that one should localize the impact of observations differently depending on the observation type. Nonlocal observations do not necessarily need to be used in the analysis for all model variables on which they depend, but they need to be used in the analyses at the locations of all model variables to which they are sufficiently strongly related, either through the observation operator (as for satellite radiances) or through correlated errors (as for retrievals). The distance over which nonlocal observations influence the analysis is a parameter that can be tuned independently of the localization distance for point observations. In this paper, we have expressed this parameter as a cutoff for the radiance weighting function or for the correlation between retrieval errors, but if these quantities are not known, one can tune the localization distance directly.

3.6 Appendix: Covariance Localization

An alternative to the grid point localization scheme used in this paper is covariance localization (Anderson, 2003; Bishop et al., 2001; Hamill et al., 2001; Houtekamer and Mitchell, 2001; Whitaker and Hamill, 2002). This localization scheme limits the distance over which each observation affects the analysis by tapering the entries of the background covariance matrix to zero beyond a fixed horizontal distance L and vertical distance V. Each entry of the background covariance matrix P^b relates two model variables, and here distance is measured between the locations of these model variables. In a sequential assimilation scheme, the term P^b in Equation (3.3) then causes entries in the Kalman gain matrix K to be zero beyond the horizontal distance L and vertical distance V from the location of the observation(s) being assimilated.
Houtekamer et al. (2005) implement this approach by modifying Equation (3.7) such that

K = (k - 1)^{-1} [ρ_V ∘ ρ_H ∘ (X^b (Y^b)^T)] [(k - 1)^{-1} ρ_V ∘ ρ_H ∘ (Y^b (Y^b)^T) + R]^{-1}. (3.17)

Here, ρ_H acts on a covariance matrix by multiplying each entry by a function that depends on horizontal distance, decreasing from one at a distance of 0 to zero at a distance of L or greater (this function is often chosen to be approximately Gaussian, as in Gaspari and Cohn (1999)). Similarly, ρ_V multiplies by a function of vertical distance that is zero beyond distance V. In this implementation, distances are measured between model variable locations and observation locations in X^b (Y^b)^T, and between observation locations in Y^b (Y^b)^T.

With covariance localization, one can allow nonlocal observations to have a larger region of influence than point observations as follows. To use cutoff-based selection, as described in Section 3.2.3, for vertical localization, one should first determine an appropriate vertical localization distance V for point observations. For nonlocal observations, increase V to Ṽ, where Ṽ ≥ V is the distance over which the cutoff (for the weighting function in the case of satellite radiances, or for the correlation in the case of retrievals) is exceeded. Then, define a modified vertical localization operator ρ̃_V that tapers covariances to zero beyond the vertical distance Ṽ by scaling vertical distances by a factor of V/Ṽ. Each nonlocal observation must also be assigned a nominal location; for satellite radiances, one can use the maximum of the weighting function. Finally, replace ρ_V by ρ̃_V in Equation (3.17) when assimilating nonlocal observations. This will be simplest in a sequential scheme, assimilating nonlocal observations separately from point observations.

3.7 Appendix: Diagonalization

In the context of a sequential ensemble-based scheme, Anderson (2003) proposes to account for spatial correlations of observation errors by diagonalizing the observation error covariance matrix. Because the observation error covariance matrix R is symmetric, it can be diagonalized by an orthogonal transformation, R = S Λ S^T. The columns of S contain the orthonormal eigenvectors of R, and Λ has the corresponding eigenvalues along its diagonal. Recall that observations are given by y = h(x^t) + ε, where ⟨ε ε^T⟩ = R. Instead of assimilating the original retrieval, Anderson (2003) creates a new observation y′ = S^T y = S^T h(x^t) + S^T ε. For this observation, R′ = ⟨S^T ε ε^T S⟩ = S^T R S = Λ, which is diagonal. The resulting observations have uncorrelated errors. Their observation operator can be taken to be h′ = S^T h. This operator contains all the sensitivity to nonlocal behavior. Therefore, we could select which observations to assimilate to update a given local region by examining the magnitude of the linearized observation operator. We could use this magnitude as the basis of maximum-based or cutoff-based observation selection.
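A minimal sketch of this transformation (our own; the function name is an assumption) diagonalizes R and returns the transformed observations together with their now-diagonal error covariance:

import numpy as np

def decorrelate_observations(y, R):
    """Transform observations so their errors are uncorrelated (Anderson, 2003).

    Diagonalizes R = S Lambda S^T and returns y' = S^T y along with the
    diagonal error covariance Lambda. Note that the transformed observations
    are no longer associated with a single spatial location.
    """
    evals, S = np.linalg.eigh(R)        # columns of S: orthonormal eigenvectors of R
    y_prime = S.T @ y                   # transformed observations y' = S^T y
    R_prime = np.diag(evals)            # R' = S^T R S = Lambda (diagonal)
    return y_prime, R_prime, S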
However, applying the eigenvalue decomposition makes it problematic to distinguish the observations that are strongly correlated to the dynamics within a local region. To better understand this, we recall that for observations which have only slightly correlated errors, the corresponding off-diagonal entries of the observation error covariance matrix, R, are small. For example, in the numerical experiments in Section 3.4, the R matrix for retrieval observations at a given horizontal grid point is given in Table 3.1. Its structure suggests that an accurate analysis would result from assimilating only those observations with errors that are most strongly correlated to the errors of the observations located within the local region. However, the corresponding eigenvectors typically do not maintain this local structure. We have found that most entries of these eigenvectors have approximately the same magnitude. For example, for the experiments in Section 3.3, the matrix whose columns correspond to the eigenvectors of R is

S =
[ 0.24  0.44  0.53  0.48  0.33 -0.28  0.21
  0.36  0.47  0.24 -0.12 -0.39  0.52 -0.40
  0.44  0.30 -0.25 -0.51 -0.20 -0.39  0.45
  0.48  0.00 -0.44  0.00  0.63  0.00 -0.43
  0.44 -0.30 -0.25  0.51 -0.20  0.39  0.45
  0.36 -0.47  0.24  0.12 -0.39 -0.52 -0.40
  0.24 -0.44  0.53 -0.48  0.33  0.28  0.21 ]. (3.18)

Because the magnitudes of all these entries are comparable, there would be no means to distinguish which of the observations y′ one should assimilate.

Nonetheless, some ensemble-based Kalman filter algorithms require diagonalization to assimilate observations with correlated errors. In these cases, observations can still be selected based on their correlation coefficients, as described in Section 3.2.3. The R matrix could then be truncated to reflect this choice of observations. Applying the diagonalization to the truncated R matrix would ensure that only those observations with errors most strongly correlated to those of the observations in the local region are assimilated to update the state in that region.

3.8 Figures

[Figure 3.1: two schematic panels, (a) Maximum-Based Selection and (b) Cutoff-Based Selection.]

Figure 3.1: A schematic of the observation selection schemes used for nonlocal, radiance-like observations. The curve represents the weighting function at each model level. The rectangles represent several local regions around these model levels. In practice, these regions overlap, though they are not depicted as such in this figure. The shaded rectangles represent local regions that assimilate this observation.

[Figure 3.2: map of simulated rawinsonde locations.]

Figure 3.2: For the SPEEDY model experiments in this paper, we simulate rawinsonde observations at the model grid points closest to the real rawinsonde locations at the 12Z synoptic time, as developed by Miyoshi (2005). Here, we show the locations of the model grid points of the rawinsonde stations for the p_s observations. Similar locations are used for the other variables, and the observation density varies with height as for real rawinsonde observations.

[Figure 3.3: two panels of analysis RMS error vs. pressure level (hPa); (a) zonal wind (m/s), (b) temperature (K).]

Figure 3.3: The average analysis RMS error as a function of vertical model level from assimilating simulated rawinsonde observations, for local regions at most 1 level deep (solid) and for local regions 3 levels deep except at the model surface and top, where they are 1 level deep (dashed).
[Figure 3.4: two panels of analysis RMS error vs. pressure level (hPa); (a) zonal wind (m/s), (b) temperature (K).]

Figure 3.4: The average analysis RMS error as a function of vertical model level from assimilating simulated rawinsonde observations and satellite radiances. We select rawinsonde observations within a local region 1 level deep (thick solid), and we also assimilate a radiance observation if the value of the h operator in the local region is greater than or equal to 0.5 (dotted), 0.25 (dashed), 0.125 (dash-dotted), or 0.0625 (thin solid).

[Figure 3.5: two panels of analysis RMS error vs. pressure level (hPa); (a) zonal wind (m/s), (b) temperature (K).]

Figure 3.5: The average analysis RMS error as a function of vertical model level from assimilating simulated rawinsonde observations alone (thick solid) and rawinsondes with satellite retrievals. We assimilate retrievals within the local region (dotted), and also those retrieval observations outside the local region with a correlation coefficient of magnitude at least 0.25 (dashed), 0.15 (dash-dotted), or 0.05 (thin solid).

[Figure 3.6: two panels of analysis RMS error vs. pressure level (hPa); (a) zonal wind (m/s), (b) temperature (K).]

Figure 3.6: We compare the best result from assimilating the simulated rawinsonde and satellite radiance observations (solid) with that from assimilating the simulated rawinsonde and satellite retrieval observations (dashed) by plotting the average analysis RMS error as a function of vertical model level.

3.9 Tables

Model Level      1      2      3      4      5      6      7
1             36.7  -33.3    8.2   -4.6    9.4   -6.1    1.3
2            -33.3   61.0  -37.5   11.0  -10.8   13.1   -6.1
3              8.2  -37.5   60.5  -36.9   11.5  -10.8    9.4
4             -4.6   11.0  -36.9   59.3  -36.9   11.0   -4.6
5              9.4  -10.8   11.5  -36.9   60.5  -37.5    8.2
6             -6.1   13.1  -10.8   11.0  -37.5   61.0  -33.3
7              1.3   -6.1    9.4   -4.6    8.2  -33.3   36.7

Table 3.1: The observation error covariance for the simulated retrieval observations at a fixed horizontal location.

Model Level      1      2      3      4      5      6      7
1             1.00  -0.70   0.17  -0.10   0.20  -0.13   0.04
2            -0.70   1.00  -0.62   0.18  -0.18   0.21  -0.13
3             0.17  -0.62   1.00  -0.62   0.19  -0.18   0.20
4            -0.10   0.18  -0.62   1.00  -0.62   0.18  -0.10
5             0.20  -0.18   0.19  -0.62   1.00  -0.62   0.17
6            -0.13   0.21  -0.18   0.18  -0.62   1.00  -0.70
7             0.04  -0.13   0.20  -0.10   0.17  -0.70   1.00

Table 3.2: The correlation between the errors of observations at a single horizontal point and model levels k and l is given by the entry of this table corresponding to those model levels.

Chapter 4

Satellite Observation Bias Correction with an Ensemble Kalman Filter

4.1 Introduction

Satellite observations provide information essential to numerical weather prediction, particularly in traditionally data poor regions like the Southern Hemisphere (e.g., Kalnay, 2002). However, satellite radiance observations are subject to biases due to errors in the instrument calibration and in the observation operator that maps the model variables to observation variables (e.g., Eyre, 1992). These biases must be corrected in order to best utilize the information from radiance observations. In this study, we focus on correcting biases from errors in the observation operator.
When assimilating radiance observations directly, many data assimilation schemes rely on offline corrections of these radiance observations (Eyre, 1992; Harris and Kelly, 2001). Offline bias correction schemes, however, have some important limitations. For example, their parameters have to be tuned manually, and they must be re-tuned to account for new instruments, new forward operators, and biases which evolve as the instrument ages (Dee, 2005). Therefore, some variational data assimilation schemes correct for radiance observation biases as part of the assimilation procedure (Derber and Wu, 1998; Dee, 2005) using state space augmentation (Friedland, 1969). These techniques append bias parameters to the model state vector and update these parameters as part of the data assimilation process. Typically, the uncertainty in the bias parameters is assumed to be uncorrelated to the uncertainty of the forecast model state (Derber and Wu, 1998; Dee, 2005).

While ensemble schemes have been found to assimilate radiance observations efficiently (Houtekamer et al., 2005; Houtekamer and Mitchell, 2005), their ability to likewise account for radiance biases online has not been demonstrated. Baek et al. (2006), Keppenne et al. (2005), and Zupanski and Zupanski (2006) apply state space augmentation to estimate low dimensional model biases with an ensemble Kalman filter. The approaches of Baek et al. (2006) and Zupanski and Zupanski (2006) also take into account the correlation between uncertainties in the model bias and uncertainties in the forecast due to uncertainties in the initial conditions and the chaotic nature of the forecast models.

In this paper, we use state-space augmentation for ensemble-based assimilation schemes to estimate radiance observation biases online. We apply this set-up to assimilate simulated unbiased rawinsonde observations and simulated biased AIRS radiances employing, as an example, a specific ensemble Kalman filter data assimilation scheme and a specific atmospheric model. In particular, for our data assimilation scheme we employ the algorithmically efficient Local Ensemble Transform Kalman Filter (LETKF) of Hunt et al. (2007), which is based on the localization scheme of Ott et al. (2004), and for our atmospheric model we use the "Simplified Parameterizations, primitivE-Equation DYnamics" (SPEEDY) model of Molteni (2003). To represent the fact that the true form of the satellite radiance biases is unknown in reality, we create the simulated AIRS radiances with one form of bias and assume that the bias has another form during the assimilation. In both cases, we use the prototype Community Radiative Transfer Model (pCRTM) as the observation operator for the AIRS radiances. We simulate biased AIRS radiances by modifying the pCRTM in a nonlinear manner, as suggested in Watts and McNally (2004), while our online bias correction estimates the coefficients of a linear combination of predictive quantities proposed by Cameron (2003) and Harris et al. (2004). With this bias correction scheme, we significantly reduce the observation and analysis RMS errors associated with assimilating these simulated biased AIRS observations.

In Section 2, we describe the typical forms of the bias assumed for satellite radiances and how they are usually corrected. We further describe the LETKF data assimilation scheme and the state-space augmentation bias correction technique for LETKF in this section.
We describe the setup for the perfect model experiments we perform with the SPEEDY model in Section 3. The results of these experiments are described in Section 4, and conclusions are provided in Section 5.

4.2 Satellite Radiance Biases

Satellite radiance biases arise from measurement biases and from errors in the radiative transfer model, which transforms model states into radiance space. Therefore, a radiance observation, y, is modeled to be of the form

y = h̃(x^t, β) + ε. (4.1)

Here, x^t is the true state of the atmosphere, ε is the mean-zero random component of the radiance error, and h̃ is a radiative transfer operator, modified to incorporate bias parameters β. Because it includes the bias parameters, h̃ will ideally correct for the biases associated with errors in the satellite calibration or in the unmodified radiative transfer model, h. The bias parameters, β, differ for each channel of each instrument (Eyre, 1992; Derber and Wu, 1998; Harris and Kelly, 2001).

Typically, the radiance bias is assumed to have two sources, scan-angle bias and air-mass bias. The scan-angle bias depends on latitude and scan angle, while the air-mass bias depends on the atmospheric state. We only consider the air-mass bias in this paper, but note that the techniques described here could be applied to estimate the scan-angle bias as well. Several operational centers, including the National Centers for Environmental Prediction (NCEP) and the European Centre for Medium-Range Weather Forecasts (ECMWF), assume that the air-mass bias is a linear combination of a small set of model state variables, called predictors. That is,

h̃(x, β) = h(x) + \sum_{i=1}^{N} β_i p_i(x), (4.2)

where h(x) is the radiative transfer model derived from physical principles and each p_i is one of N predictors (Eyre, 1992; Derber and Wu, 1998; Harris and Kelly, 2001). Each coefficient β_i is assumed to be a global constant that depends on the satellite instrument and channel. In this study, we use a subset of the predictors found to correct AIRS biases (Cameron, 2003; Harris et al., 2004). Specifically, we choose the following predictors:

• p_1: Constant term (i.e., p_1 = 1).
• p_2: 200 - 50 hPa thickness.
• p_3: 850 - 300 hPa thickness.
• p_4: Surface skin temperature (i.e., model surface temperature).
• p_5: The modeled radiance itself (i.e., p_5 = h(x)).

Throughout the rest of this paper we will refer to each predictor by its index. In the numerical experiments with the SPEEDY model presented in this paper, the thicknesses used for p_2 and p_3 are found by computing the difference between the geopotential heights at the appropriate levels. The SPEEDY model obtains the surface skin temperature used for p_4 from the sea-surface temperature over ocean and, over land, by modifying the land-surface temperature to incorporate the forecasted surface pressure (Molteni, 2003).

Watts and McNally (2004) propose an alternate form for the radiance biases. They choose a physical modification of the radiative transfer model. Normally, the radiative transfer model is given by

h(x) = ε(0) τ(0) B(T(0)) + \int_0^∞ B(T(z)) \frac{∂τ(z)}{∂z} dz, (4.3)

where z = 0 is at the surface, ε(0) is the emissivity of the surface, T(z) is the temperature profile associated with the state x at the latitude and longitude of the satellite observation, and B(T(z)) is the Planck function.
The term ∂τ/∂z is called the weighting function because, when combined with the Planck function evaluated at a given height, it characterizes the contribution to the radiance from that height. The function τ(z) represents the transmittance from height z and is given by

τ(z) = exp( - \int_z^∞ κ(z′) ρ(z′) dz′ ). (4.4)

Here, κ(z) is the absorption coefficient at height z and ρ(z) is the density at that height. Watts and McNally (2004) assume that the greatest source of error in the radiative transfer model arises from poor knowledge of κ(z), and they assume that κ(z) has a constant fractional error, γ. That is, they make the replacement

κ → γκ, (4.5)

so that

τ(z) → τ(z)^γ. (4.6)

To correct biases from other sources, they add a global constant, δ, to the modified radiative transfer model. Again, γ and δ are assumed to be global constants which depend only on the satellite and the channel. For most AIRS long-wave temperature channels, γ ≈ 1.05 (Watts and McNally, 2004).

We use the bias form proposed by Watts and McNally (2004), with γ = 1.05 and δ = 0, to simulate biased radiance observations. We use the more standard bias correction form of Equations (4.1) and (4.2) to estimate these biases. The goal of the present paper is not to address which functional form of the bias estimate should be used to correct real radiance biases, but rather to show that we can accurately estimate bias correction parameters with an ensemble Kalman filter.

4.2.1 Online Bias Correction

Bias parameters for radiance observations can be estimated online as part of the data assimilation by using state space augmentation (e.g., Derber and Wu, 1998; Dee, 2005). In data assimilation schemes using such an approach, the analysis updates an augmented state vector,

z = [ x
      β ], (4.7)

based on the observed information, where x is the model state and β is the vector of bias parameters for all satellite instruments and channels. That is, the analysis seeks the augmented state vector, z^a, which best fits its background state (or first guess, z^b) and the available observations, y.

Because systematic errors in satellites change slowly in time, the bias parameters are typically assumed to remain constant between successive analysis times. That is, the evolution of the bias vectors is modeled by persistence, so that

β^b_t = β^a_{t-1}, (4.8)

where β^a_{t-1} is the updated bias vector at the previous analysis time and β^b_t is the background bias vector at the current analysis time.

The state space augmentation approach also requires an estimate for the background error covariance of the augmented state vector, z. For simplicity, many 3D-VAR and 4D-VAR implementations assume that the error of each bias parameter is uncorrelated to the errors of the other bias parameters, to the errors of the bias parameters for other satellite channels and instruments, and to the errors of the model state (Derber and Wu, 1998; Dee, 2005). Such an assumption is not made in the ensemble-based Kalman filter scheme we describe in the next section.

4.2.2 Online Bias Correction with an Ensemble Kalman Filter

Our ensemble Kalman filter bias correction scheme uses an ensemble of augmented state vectors to estimate a flow-dependent background error covariance, P^b (Baek et al., 2006). In the following formulation, considering that by Equation (4.7) each augmented state vector z contains the model state x and the bias parameters β, the modified observation operator is given by h̃(x, β) = h̃(z).
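As an illustration of the predictor-based correction of Equation (4.2), the following sketch (entirely hypothetical: the stand-in radiative transfer model and toy state are ours, and only p_1, p_4, and p_5 are mimicked) wraps an observation operator h with a linear combination of predictors:

import numpy as np

def bias_corrected_operator(h, predictors, beta):
    """Bias-corrected observation operator of Equation (4.2).

    h: callable mapping a model state x to a radiance h(x).
    predictors: list of callables p_i(x); beta: array of coefficients beta_i.
    Returns a callable with h_tilde(x) = h(x) + sum_i beta_i * p_i(x).
    """
    def h_tilde(x):
        return h(x) + sum(b * p(x) for b, p in zip(beta, predictors))
    return h_tilde

# Toy example: a stand-in "radiative transfer model" acting on a dict state.
# The real p_2 and p_3 would be layer thicknesses from geopotential heights.
h = lambda x: 0.3 * x["T_skin"] + 220.0
predictors = [lambda x: 1.0,          # p_1: constant term
              lambda x: x["T_skin"],  # p_4: surface skin temperature
              h]                      # p_5: the modeled radiance itself
h_tilde = bias_corrected_operator(h, predictors, np.array([0.5, 0.01, 0.02]))
print(h_tilde({"T_skin": 288.0}))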
4.2.2 Online Bias Correction with an Ensemble Kalman Filter

Our ensemble Kalman filter bias correction scheme uses an ensemble of augmented state vectors to estimate a flow-dependent background error covariance, P^b (Baek et al., 2006). In the following formulation, since by Equation (4.7) each augmented state vector z contains the model state x and the bias parameters \beta, the modified observation operator can be written \tilde{h}(x, \beta) = \tilde{h}(z). An ensemble of k forecasts of these augmented state vectors is obtained from model runs starting from a set of k initial conditions, where z^{b(i)} represents the forecast corresponding to the ith ensemble member (i = 1, ..., k). Here, forecasts of the atmospheric-state portion x of z are obtained by integrating the atmospheric model, while forecasts of the bias-vector portion \beta of z are obtained from persistence (Equation (4.8)). The background error covariance can be obtained from the sample covariance of the ensemble members,

    P^b = (k-1)^{-1} Z^b (Z^b)^\top.    (4.9)

Here, \top denotes the matrix transpose, and Z^b is the matrix of ensemble perturbations for the augmented state,

    Z^b = \left[\, z^{b(1)} - \bar{z}^b \;\; z^{b(2)} - \bar{z}^b \;\; \cdots \;\; z^{b(k)} - \bar{z}^b \,\right],    (4.10)

where the "background state" \bar{z}^b is the ensemble mean,

    \bar{z}^b = k^{-1} \sum_{i=1}^{k} z^{b(i)}.    (4.11)

Ensemble schemes generally use the Kalman filter equations to seek the linear combination of the ensemble members that best fits the available observations. Applying this methodology to the ensemble of augmented state vectors, ensemble schemes can simultaneously update the atmospheric state and the bias parameters. In what follows, we formulate the resulting analysis equations using the LETKF (Hunt et al., 2007), but note that a similar formulation could be applied to any ensemble Kalman filter scheme. We choose the LETKF in this paper because, like its predecessor the LEKF (Ott et al., 2004), it has been found to be an accurate and efficient data assimilation scheme for operational weather models (Szunyogh et al., 2005, 2007; Whitaker et al., 2007).

Applying the LETKF to the ensemble of augmented state vectors, the analysis equations become

    \bar{z}^a = \bar{z}^b + Z^b \tilde{P}^a (Y^b)^\top R^{-1} \left( y - \overline{\tilde{h}(z^b)} \right),    (4.12)
    P^a = Z^b \tilde{P}^a (Z^b)^\top,    (4.13)
    \tilde{P}^a = \left[ (k-1)I + (Y^b)^\top R^{-1} Y^b \right]^{-1}.    (4.14)

Here, \tilde{h} maps the augmented background state into observation space for all the observations to be assimilated, and the jth entry of the vector \overline{\tilde{h}(z^b)} is the mean of the background ensemble in observation space for the jth observation. If the jth observation is believed to be biased, then the corresponding row of the operator \tilde{h} is a function of the model state and the bias parameters; otherwise, the corresponding row of \tilde{h} is a function of the model state alone. This operator is also used to define the matrix of ensemble perturbations in observation space, Y^b, whose ith column is given by

    Y^b_i = \tilde{h}(z^{b(i)}) - \overline{\tilde{h}(z^b)}.    (4.15)

The information provided by the analysis error covariance matrix, P^a, is used to generate an analysis ensemble consistent with the resulting analysis state. The LETKF uses Equation (4.13) to determine the analysis of the augmented state for the ith ensemble member,

    z^{a(i)} = \bar{z}^a + Z^a_i,    (4.16)

where Z^a_i is the ith column of Z^a = Z^b \left[ (k-1)\tilde{P}^a \right]^{1/2}.
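The following sketch illustrates how Equations (4.12)–(4.16) act on an ensemble of augmented state vectors. It is a straightforward transcription of the formulas, not the LETKF code used in our experiments (which, among other things, solves these equations locally and in parallel):

```python
import numpy as np

def letkf_update(Zb_ens, Yb_ens, y_obs, R):
    """One LETKF analysis step on an ensemble of augmented states.

    Zb_ens : (n, k) background ensemble of augmented states z = [x; beta]
    Yb_ens : (m, k) background ensemble mapped to observation space, h~(z^{b(i)})
    y_obs  : (m,)   observations
    R      : (m, m) observation error covariance
    """
    n, k = Zb_ens.shape
    zb_mean = Zb_ens.mean(axis=1)
    yb_mean = Yb_ens.mean(axis=1)
    Zb = Zb_ens - zb_mean[:, None]        # perturbations, Eq. (4.10)
    Yb = Yb_ens - yb_mean[:, None]        # obs-space perturbations, Eq. (4.15)

    Rinv = np.linalg.inv(R)
    # \tilde P^a = [(k-1) I + (Y^b)^T R^{-1} Y^b]^{-1}, Eq. (4.14)
    Pa_tilde = np.linalg.inv((k - 1) * np.eye(k) + Yb.T @ Rinv @ Yb)
    # analysis mean, Eq. (4.12)
    za_mean = zb_mean + Zb @ Pa_tilde @ Yb.T @ Rinv @ (y_obs - yb_mean)
    # analysis perturbations Z^a = Z^b [(k-1) \tilde P^a]^{1/2},
    # using the symmetric square root from an eigendecomposition
    evals, evecs = np.linalg.eigh((k - 1) * Pa_tilde)
    Wa = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    Za = Zb @ Wa
    return za_mean[:, None] + Za          # analysis ensemble, Eq. (4.16)
```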
Computational restrictions often limit the number of ensemble members that can be used to obtain the analysis in Equations (4.12)–(4.14). Therefore, the resulting ensemble of augmented state vectors may not be sufficient to capture the global dynamics. However, a limited number of ensemble members is presumed sufficient to capture the dynamics within local regions (Patil et al., 2001; Ott et al., 2004; Oczkowski et al., 2005; Szunyogh et al., 2005; Kuhl et al., 2007; Kalnay et al., 2006). In order to use a limited number of ensemble members, many ensemble schemes, including the LETKF, use localization (Anderson, 2001; Hamill et al., 2001; Houtekamer and Mitchell, 2001; Whitaker and Hamill, 2002; Ott et al., 2004; Hunt et al., 2007). As in the Local Ensemble Kalman Filter (LEKF; Ott et al., 2004), in the LETKF the analysis at each grid point is obtained by solving Equations (4.12)–(4.14) exactly, using only those observations in a local region centered at that grid point (Ott et al., 2004; Hunt et al., 2007). Here, these regions contain 2l + 1 grid points in each of the two horizontal directions and 2v + 1 grid points in the vertical direction, for some choice of l and v. This type of localization technique (Ott et al., 2004) allows fast parallel implementation (Szunyogh et al., 2005, 2007). Because satellite observations depend on deep layers of the atmosphere, they cannot be selected for assimilation in the local regions in this way. Based on Fertig et al. (2007a), we use a cutoff-based localization for the satellite observations: we use a satellite observation in the analysis at a grid point if its weighting function exceeds a cutoff value somewhere within the local region around that grid point.

Although the LETKF analysis of the model variables utilizes localization, we assume that the bias parameters are global constants that depend on the instrument, channel, and ensemble member. We take the following steps to obtain a global analysis for these parameters while still obtaining a local analysis for the model state. For each grid point, we obtain a local augmented state vector by appending to the model state vector those bias parameters corresponding to biased observations that lie within the local region. We then update the local augmented state vector for that grid point as described above. Finally, we replace the estimated bias parameter at every grid point by a weighted global average of the analysis estimates for that bias parameter. The weighted average that we use is a surface-area average that weights the grid points by the inverse of the corresponding analysis bias variance. That is, the global analysis for the ith ensemble member of a component \beta of the bias state vector is given by

    \beta^{a(i)} = \frac{ \sum_j \cos(\theta_j)\, \beta^{a(i)}_j / \sigma_j^2 }{ \sum_j \cos(\theta_j) / \sigma_j^2 },    (4.17)

where j indexes all the model grid points, \theta_j is the latitude at grid point j, and \beta^{a(i)}_j is the analysis for the ith ensemble member at grid point j. The variance of the analysis bias ensemble at grid point j is represented by \sigma_j^2 and is given by

    \sigma_j^2 = (k-1)^{-1} \sum_{l=1}^{k} \left( \beta^{a(l)}_j - \bar{\beta}^a_j \right)^2,    (4.18)

where \bar{\beta}^a_j is the ensemble mean of the bias analysis at grid point j. (In the numerical experiments in this paper, we do not observe a significant difference in the analysis when using simpler weightings for the average, e.g., weighting by the number of observations in each local region; results not shown.) If instead a parameter varies zonally, as for scan-angle bias, this average could be taken over all model grid points at a single latitude or range of latitudes. Because this averaging is employed, the dimensionality of the augmented system is not greatly increased by including the bias parameters. Therefore, we do not require many additional ensemble members to estimate the observation biases, in contrast to Baek et al. (2006), where a pointwise estimate of the model bias term required a relatively large ensemble.
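As an illustration, the weighted global average of Equations (4.17) and (4.18) might be computed as follows; the array layout is an assumption made for this sketch:

```python
import numpy as np

def global_bias_average(beta_aj, lats_deg):
    """Variance-weighted global average of the local bias analyses,
    Eqs. (4.17)-(4.18).

    beta_aj  : (k, J) local analyses of one bias parameter for k ensemble
               members at J grid points
    lats_deg : (J,)   latitude of each grid point in degrees
    returns  : (k,)   one global value per ensemble member
    """
    var_j = beta_aj.var(axis=0, ddof=1)          # sigma_j^2, Eq. (4.18)
    w = np.cos(np.deg2rad(lats_deg)) / var_j     # cos(theta_j) / sigma_j^2
    return (beta_aj * w).sum(axis=1) / w.sum()   # Eq. (4.17)
```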
4.3 Experimental Design

4.3.1 SPEEDY Model

The numerical experiments in this paper are performed using the Simplified Parameterizations, primitivE-Equation DYnamics (SPEEDY) model (Molteni, 2003). Because the SPEEDY model evolves the primitive equations, it reflects the dynamics of a full general circulation model. However, its computational cost is significantly lower than that of operational models, and it has been widely used for climate simulations (Molteni, 2003; Miyoshi, 2005). This computational efficiency is obtained by performing triangular truncation at wavenumber 30 (T30) and by using simplified parameterizations for several physical processes, including convection, large-scale condensation, cloud physics, radiation, surface fluxes, and vertical diffusion. The boundary conditions at the bottom of the atmosphere, for variables including the sea-surface temperature and the land-surface temperature, are obtained from the 1981–1990 ECMWF reanalysis (Molteni, 2003).

The SPEEDY model obtains its forecasts by solving the primitive equations for vorticity, divergence, absolute temperature, T, specific humidity, q, and the logarithm of the surface pressure, log(p_s). The resulting values are converted to output variables of zonal wind, u, meridional wind, v, absolute temperature, T, specific humidity, q, and surface pressure, p_s, at 96 zonal grid points, 48 meridional grid points, and seven sigma levels (0.950, 0.835, 0.685, 0.510, 0.340, 0.200, 0.080) or seven pressure levels (925, 850, 700, 500, 300, 200, 100 hPa). Miyoshi (2005) modified the SPEEDY model code for the purpose of data assimilation, using values of u, v, T, q, and p_s at the seven sigma levels as initial conditions for each six-hour forecast of the SPEEDY model.

4.3.2 Community Radiative Transfer Model

In this paper, we use the Prototype Community Radiative Transfer Model (pCRTM; Han et al., 2005) to generate the simulated satellite observations and for the corresponding observation operator. This model was developed by the Joint Center for Satellite Data Assimilation (JCSDA) to provide the community with a unified code to simulate satellite radiances. As part of this mission, the frameworks of the pCRTM and of its more advanced counterpart, the CRTM, are both general enough to be easily adapted to simulate radiances from most satellite instruments and from most atmospheric model states. Furthermore, the more advanced CRTM is currently used by the operational NCEP Global Forecast System (GFS) and the Gridpoint Statistical Interpolation (GSI) assimilation schemes. In this paper, we use the pCRTM to simulate satellite radiances to take advantage of its relative simplicity.

The pCRTM determines the radiance and brightness temperature from an input profile of pressure, temperature, water vapor, and ozone. Here, the profile values for pressure, temperature, and water vapor are obtained from the SPEEDY model. For simplicity, in this study we assume that the ozone profile has a constant value of 0.06720 ppmv. Furthermore, the numerical experiments in this paper are not concerned with scan-angle bias, since we assume that the satellite observes the atmosphere directly below its location.

The pCRTM determines the radiance by solving a discrete form of Equation (4.3),

    h(x) = R_{surface} + \sum_{i=1}^{top} B(T_i)\, \Delta\tau_i,    (4.19)

where R_{surface} is the contribution to the radiance from the surface and T_i is the temperature at model level i. The values of \tau at each level of the profile are derived from information about the spectral and transmittance coefficients for the desired satellite observation. \Delta\tau_i is the difference between the \tau values at levels i and i + 1 in the profile, for all levels below the top of the profile; at the top of the profile, \Delta\tau is determined by enforcing \sum_{i=1}^{top} \Delta\tau_i = 1. The pCRTM finds the brightness temperature by seeking the temperature at which a black body would emit the computed radiance.

In the discrete form of the radiative transfer model (Equation (4.19)), \Delta\tau_i indicates the influence of the ith layer of the profile on the radiance. Therefore, we use this value of \Delta\tau_i to select observations with the cutoff-based selection described in Section 4.2.2. In this study, we find that selecting observations with

    \Delta\tau_i \geq 0.4 \times \left( \max_{i=1,...,top} \Delta\tau_i \right)    (4.20)

somewhere within the local region provides the smallest analysis rms error. Accordingly, we use this cutoff in all the numerical experiments in this paper.
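A minimal sketch of the discrete radiance sum of Equation (4.19) and of the cutoff-based channel selection of Equation (4.20); the Planck function B and the transmittance differences \Delta\tau are taken as given inputs, since in practice they come from the pCRTM coefficients:

```python
import numpy as np

def radiance(B, T_levels, dtau, R_surface):
    """Discrete radiative transfer of Eq. (4.19):
    h(x) = R_surface + sum_{i=1}^{top} B(T_i) * dtau_i.

    B         : Planck function of temperature for this channel
    T_levels  : temperatures T_i at the profile levels
    dtau      : layer transmittance differences (summing to 1)
    R_surface : surface contribution to the radiance
    """
    return R_surface + sum(B(T) * dt for T, dt in zip(T_levels, dtau))

def channel_selected(dtau_profile, local_levels, frac=0.4):
    """Cutoff-based selection of a radiance observation, Eq. (4.20): use the
    channel at a grid point if its weighting function exceeds
    frac * max(dtau_profile) at some level inside the local region."""
    cutoff = frac * np.max(dtau_profile)
    return bool(np.any(np.asarray(dtau_profile)[list(local_levels)] >= cutoff))
```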
4.3.3 Truth and Observations

We perform the numerical experiments in this paper for the perfect model scenario with the SPEEDY model. Following Miyoshi (2005), the "true" state of the atmosphere is obtained by running the SPEEDY model for a year from its default rest state on January 01, 1981. We then obtain the true atmospheric trajectory by integrating the SPEEDY model from the resulting atmospheric state at January 01, 1982 for 240 timesteps (roughly two months). The quality of the analyses is assessed by comparing the resulting analysis to this true state. Specifically, we compute the global root mean square (RMS) error between the true and analysis states in pressure coordinates. The average RMS error is the average of these errors over the last month of the assimilation cycle.

We create two sets of simulated observations from the known "true" state of the atmosphere: a set of rawinsonde observations and a set of radiance observations. The simulated rawinsonde observations are created by adding zero-mean Gaussian noise to the true state of the atmosphere at those points of the SPEEDY model grid that are closest to the real rawinsonde observations at the 12Z synoptic time (Miyoshi, 2005). We use a standard deviation of 1 m/s for u and v, 1 K for T, 0.0001 kg/kg for q, and 1 hPa for p_s.

Similarly, we simulate 15 long-wave temperature radiance channels from the AIRS 281 subset (wave numbers 680.491, 691.391, 694.4, 696.052, 697.99, 699.102, 700.218, 701.618, 703.87, 706.991, 709.566, 712.739, 715.941, 723.029, and 724.824 cm^{-1}) at each horizontal model grid point. We note that these channels represent a third of the long-wave temperature soundings from the M-11 hardware module in the 324 subset of AIRS channels for Numerical Weather Prediction. Figure 4.1 shows the weighting function (i.e., the values of \Delta\tau for each model layer) for representative channels.

These radiance observations are generated by applying the pCRTM to obtain brightness temperature observations from the true atmospheric state. Zero-mean Gaussian noise with standard deviation 0.5 K is added to the resulting brightness temperature observations. The bias in these observations is simulated by modifying \alpha(z) in the pCRTM as suggested by Watts and McNally (2004) and described in Section 4.2. Because the bias is presumed unknown by the analysis scheme, the unmodified pCRTM is used for the observation operator during the data assimilation.
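The simulated biased observations can be sketched as follows, assuming the modified-pCRTM brightness temperatures of the true state are available as an array; the random number generator and its seed are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def modified_transmittance(tau, gamma=1.05):
    """Watts and McNally (2004) bias: a constant fractional error gamma in the
    absorption coefficient scales the layer transmittances as tau -> tau**gamma
    (Eqs. (4.5)-(4.6); here delta = 0)."""
    return tau ** gamma

def simulate_biased_obs(brightness_T_modified, noise_sd=0.5):
    """Biased observations (Section 4.3.3): brightness temperatures from the
    modified pCRTM applied to the true state, plus zero-mean Gaussian noise
    with a 0.5 K standard deviation."""
    noise = rng.normal(0.0, noise_sd, size=np.shape(brightness_T_modified))
    return brightness_T_modified + noise
```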
However, for our simulated radiances the true bias in the observations can be calculated by applying both the modified and the unmodified pCRTM to the true state. The difference between these two values of the brightness temperature represents the true satellite bias. The extent to which the estimated bias parameters reduce this known bias provides an additional means to verify the data assimilation scheme. Furthermore, the correlation between the true bias and the predictors indicates which predictors can be expected to make the largest correction to the bias and, therefore, provides an additional measure to assess the performance of the bias correction scheme.

4.3.4 LETKF and Bias Correction

We adapt the LETKF code developed originally by Miyoshi (2005) and Harlim and Hunt (2007) for the assimilation of rawinsonde observations so that it assimilates both the rawinsonde and the biased radiance observations with the SPEEDY model. In all these numerical experiments, we use 40 ensemble members. We find that local regions one level deep (v = 0) with a diameter of 7 horizontal grid points (l = 3) provide the best analysis for rawinsonde observations alone, while smaller local regions 3 grid points in diameter (l = 1) and one level deep provide the best analysis when we assimilate both the rawinsonde and the simulated satellite observations. To avoid having the spread of the ensemble collapse, the numerical experiments apply multiplicative variance inflation to the background ensemble members, both for the dynamical and for the bias variables (e.g., Anderson, 2001). Furthermore, the inflation serves as a forgetting factor, so that the bias parameters can account for slowly evolving biases even though persistence is used as their forecast model (Hunt et al., 2007). As will be described further in the next section (Section 4.4), the dynamical variables are always inflated by 6%, while the bias variables are inflated by either 6% or 12%.

We verify the results of the LETKF analysis by computing the root mean square (rms) distance between the analysis and the truth (the rms error). Similarly, we verify the bias correction parameters by applying the ensemble mean of the estimated bias parameters to the true values of the predictors to correct the set of biased satellite observations with no noise. We refer to the rms error of these corrected satellite observations relative to the unbiased satellite observations as the observation rms error. The time-averaged results are shown for the last month of the assimilation experiments (February 1982) to exclude the spin-up period.

In this study, we subtract the mean over time and space from each of the non-constant predictors before using them for the online bias correction. Thus, the bias estimated from the constant predictor (p_1) describes the overall mean bias, and the bias estimated from the other predictors represents variations from that mean.

We create the initial bias ensemble by randomly sampling from a Gaussian distribution with mean zero and a standard deviation based on the desired initial spread of the ensemble. Over the two-month period, we found that the analysis and the estimated bias parameters can be sensitive to the initial spread of the bias parameters. Specifically, the bias parameters took longer to converge when their initial spread was large, and the convergence was slower for experiments using three or more predictors (results not shown). However, the spread must start out large enough for the ensemble to capture the bias uncertainty. We therefore tune the initial spread of the bias parameters to minimize the analysis errors and observation biases over the entire two-month period. Based on this tuning, we choose the initial standard deviation of the bias parameters to be 1 K for \beta_1, 0.02 K/m for \beta_2 and \beta_3, and 0.004 K/K for \beta_4 and \beta_5. We expect that if reasonable initial spreads are used, the impact of this choice will become negligible for longer analysis times.
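A sketch of the initial bias ensemble and of the multiplicative variance inflation used here. Whether an inflation of 6% multiplies the background covariance by 1.06 (perturbations by its square root) or multiplies the perturbations themselves is a convention choice; the sketch adopts the former:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 40                                   # ensemble size used in this study

# initial spreads from Section 4.3.4: 1 K for beta_1, 0.02 K/m for beta_2
# and beta_3, and 0.004 K/K for beta_4 and beta_5
init_sd = np.array([1.0, 0.02, 0.02, 0.004, 0.004])
beta_ens = rng.normal(0.0, init_sd, size=(k, 5))   # zero-mean Gaussian sample

def inflate(ens, percent):
    """Multiplicative variance inflation about the ensemble mean; e.g.,
    percent = 6 or 12 as in the experiments. Inflating the covariance by
    (1 + percent/100) scales the perturbations by its square root."""
    mean = ens.mean(axis=0)
    return mean + np.sqrt(1.0 + percent / 100.0) * (ens - mean)
```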
4.4 Results

To verify the significance of the satellite observation bias created in the form of Watts and McNally (2004), we plot the true bias (as defined in Section 4.3.3) of the simulated observations at the first analysis time (06Z January 01, 1982) in Figure 4.2 for the channel at 696.052 cm^{-1} (panel (a)) and the channel at 723.029 cm^{-1} (panel (b)). We observe that the bias is significant and location dependent. In particular, the bias has the smallest magnitude near the poles, and its magnitude increases to almost 1 K near the tropics. Though Figure 4.2 shows the bias for a particular time and particular channels, it qualitatively represents the typical true bias at all times and for all simulated channels.

The true satellite bias can be used to infer which predictors we can expect to correct the bias most efficiently. Though the correlations between the predictors and the true bias cannot be computed in reality, they enable us to further evaluate the performance of the bias correction scheme. We plot the correlation between each of the predictors described in Section 4.2 and the true bias of the simulated satellite observations in Figure 4.3. We observe that the true bias is most strongly correlated with predictors p_2 and p_3 for all satellite channels at wave numbers below 712.739 cm^{-1}. On the other hand, predictors p_4 and p_5 are more strongly correlated with the bias for wave numbers above 715.941 cm^{-1}. However, none of the predictors are well correlated with the bias for the channels at wave numbers 700.218 and 701.618 cm^{-1}. This poor correlation is observed because the bias of these channels is substantially smaller than that of the others (a fact we will explore later in this section). Overall, predictors p_3 and p_4 show the strongest correlations with the true bias. We also find these predictors to be more useful than p_2 and p_5 in reducing the analysis error. For completeness, we will show results using all five predictors at the end of this section; here, we discuss results using only predictors p_1 (constant), p_3 (850–300 hPa thickness), and p_4 (skin temperature).

In Figure 4.4, we plot the global average of the 500 hPa temperature analysis rms error over time when inflating all background variables by 6%. For reference, we plot the analysis rms error when assimilating rawinsonde observations alone (cyan), rawinsonde and biased satellite observations using the unmodified pCRTM (gray), and rawinsonde and biased satellite observations using the modified pCRTM (black). From these reference runs, it is apparent that the biased satellite observations have a significant negative impact on the analysis. Nonetheless, were these observations unbiased, they could dramatically improve the analysis over that obtained from assimilating the rawinsonde observations alone.
The analysis rms error for the bias correction is shown in Figure 4.4 for the following combinations of predictors: p_1 (green); p_1 and p_3 (blue); p_1 and p_4 (red); and p_1, p_3, and p_4 (purple). All of these bias correction experiments improve the analysis rms error significantly. The skin temperature, p_4, has the most significant impact on the 500 hPa analysis. When using p_1, p_3, and p_4, the 500 hPa temperature analysis rms error reduces more slowly to the value obtained from using p_1 and p_4 alone. The results are similar for the surface pressure (results not shown). The bias correction also clearly has an impact on the 500 hPa zonal wind, meridional wind, and humidity analyses (results not shown), though there appears to be a less significant difference among the different predictors for these variables. These results are summarized for all model levels in Table 4.1, which shows that predictors p_1 and p_4 improve the analysis rms error most at the lower model levels, while p_1 and p_3 improve the analysis most at the higher model levels.

In order for the online bias correction to continually update the bias estimate, the spread (standard deviation) of the ensemble of bias parameters must remain significantly larger than zero. Therefore, Figure 4.5 plots the background spread over time for each of the predictors in the run with p_1, p_3, and p_4 (panels (a), (b), and (c), respectively) for a representative subset of the corrected channels. The spread of the bias parameters for channel 723.029 cm^{-1} remains nearly constant after the spin-up time. However, for the majority of the other channels, the spread of the bias parameters decreases dramatically as the analysis cycle proceeds. A similar decrease in the bias spread is observed in the bias correction experiments with all other combinations of the predictors (results not shown). For channels like 723.029 cm^{-1}, where the spread remains nearly constant, we observe that the weighting function is above the cutoff value at several model levels (results not shown). Therefore, the analysis of the bias parameters corresponding to these channels combines information from more local regions, keeping the spread relatively large.

Multiplicative variance inflation often prevents the background ensemble from collapsing and serves as a forgetting factor for past observations and dynamics. However, increasing the variance inflation of all background variables degrades the quality of the analysis and does not prevent the spread of the bias parameters from collapsing (results not shown). Instead, we compensate for the decreasing spread of the ensemble of bias parameters by inflating these variables by 12% while retaining the 6% variance inflation for the other analysis variables. In this case, the analysis rms error over time, shown at 500 hPa in Figure 4.6, is similar to that obtained using a 6% variance inflation for all variables (Figure 4.4); here, the analysis error associated with using p_1, p_3, and p_4 reaches its optimal value more quickly. The similarity between the analysis errors using the different inflation factors holds at all model levels (Table 4.1 for 6% inflation of all variables and Table 4.2 for 12% inflation of the bias variables). Furthermore, Figure 4.7 reveals that, with this inflation, the spread of the bias parameters in the run with p_1, p_3, and p_4 (panels (a), (b), and (c), respectively) plateaus at a more reasonable value for all channels. The results are similar for the other bias correction experiments (results not shown).
We also explore the impact of the bias correction on the global, time-averaged temperature analysis RMS error at all seven SPEEDY model levels in Figure 4.8. The color scheme in this figure is the same as that used in Figure 4.4. In this figure, we see that the p_4 term provides the greatest improvement to the analysis at low levels, below 400 hPa. On the other hand, p_3 provides the greatest improvement to the analysis above 300 hPa. The combination of predictors p_1, p_3, and p_4 provides the analysis with the smallest error at almost all levels. Below 500 hPa, this analysis error is almost as low as that obtained from assimilating the satellite observations with the true observation operator. We further summarize the global, time-averaged temperature rms error for all the predictors proposed in Section 4.2 in Table 4.2. As mentioned earlier, we find that using p_2 or p_5 also decreases the analysis rms error, but not as significantly as using p_3 or p_4. Also, including p_2 and p_5 in the bias correction with p_1, p_3, and p_4 does not improve the analysis rms error.

Ideally, the bias parameters predicted by the data assimilation improve the analysis rms errors because they reduce the bias in the observations. Figure 4.9 plots the global, time-averaged observation rms error for each choice of predictors for each satellite channel. We observe that p_3 improves the observation bias most at low wave numbers, while p_4 improves the observation bias at higher wave numbers. This result is consistent with the structure of the correlation between the true observation bias and these predictors shown in Figure 4.3. Furthermore, using predictors p_1, p_3, and p_4 together significantly reduces the observation rms error for most channels. We observe that the bias correction scheme is unable to correct the already small biases of the channels at wave numbers 700.218 and 701.618 cm^{-1}. Closer inspection reveals that the weighting function for these channels is such that they rarely meet the localization criterion necessary for them to be used in the analysis. Therefore, these observations have little impact on the analysis of the atmospheric variables or of the bias parameters. Accordingly, the estimated bias parameters do not improve the relatively small biases of these channels.

The global, time-averaged observation rms error is summarized for selected channels and all choices of predictors in Table 4.3. As for the analysis errors, we see that none of the results using p_2 or p_5 improve significantly on the results using p_3 and/or p_4. We also see that, for each channel, the predictors that are best correlated with the true bias (see Figure 4.3) yield the greatest improvement in the observation rms error.

We observe in Figure 4.9 that, for the lower wave number channels, using certain combinations of the predictors increases the rms error of the observations above its value with no bias correction. However, the bias correction still significantly improves the analysis rms error throughout the atmosphere (Figures 4.6 and 4.8). For example, while using p_1 (constant) improves the analysis at all but the highest model level, this predictor only improves the observation rms error for channels with wave numbers above 701.618 cm^{-1} (Figure 4.9). To explore the impact of the increased observation rms error further, we tried using p_1 to correct only the bias of the channels above 701.618 cm^{-1}.
We plot the resulting observation and analysis rms errors in panels (a) and (b) of Figure 4.10, respectively. The observation rms error from using p_1 to correct the bias only for channels above wave number 701.618 cm^{-1} (dotted) mirrors that obtained from using p_1 for all channels (dashed) above 701.618 cm^{-1}, and it mirrors the true observation bias (solid in panel (a)) for the rest of the channels. In spite of this apparent improvement in the overall observation rms error, using p_1 to correct the bias only for channels above 701.618 cm^{-1} degrades the temperature analysis rms error at the six lower model levels.

4.5 Conclusions

Our results show that ensemble data assimilation schemes, such as the LETKF, can estimate several observation bias parameters as part of the assimilation procedure. If they are correlated with the true observation bias, the estimated bias parameters significantly reduce the analysis error and the observation bias. In some cases, the observation biases may be increased in order to reduce the analysis error. Furthermore, making the inflation parameter larger for the bias parameters than for the model variables prevents the spread of the bias correction from collapsing. In this way, the online bias estimation can respond to slow changes in the bias parameters.

Computing a global average of the local estimates of the bias parameters ensures accuracy, efficiency, and flexibility. For one, obtaining a global estimate of the bias parameters ensures that they do not capture the random errors of a small set of observations in place of the bias. Furthermore, global averaging effectively reduces the dimensionality of the bias problem, so that no additional ensemble members are required to estimate the bias parameters. Finally, averaging provides flexibility because it could be applied to estimate the bias parameters over any sufficiently large region. In this way, the bias correction scheme could obtain bias estimates over latitude strips as easily as it obtains them globally in this paper.

Though we applied the observation bias correction scheme to a particular formulation of the ensemble Kalman filter in this paper, we expect that the results would hold for any ensemble scheme. Furthermore, the scheme need not be limited to estimating bias parameters that combine with predictors to determine the bias; for example, it could as easily be applied to estimate the bias parameters of Watts and McNally (2004). As with other online observation bias correction schemes, bias correction in ensemble Kalman filter schemes requires that there be only a limited number of bias parameters, to avoid overfitting. With this caveat, there is great flexibility in estimating observation biases with an ensemble Kalman filter.

4.6 Figures

Figure 4.1: The weighting function (profile of \Delta\tau at 180 W, 35 N) for representative channels: 680.491 cm^{-1} (black), 697.99 cm^{-1} (red; similar shape to channels 691.391–699.102 cm^{-1}), 700.218 cm^{-1} (green; similar to channel 701.618 cm^{-1}), 706.991 cm^{-1} (dark blue; similar to channels 703.87–709.566 cm^{-1}), 712.739 cm^{-1} (light blue), and 723.029 cm^{-1} (magenta; similar to channels 715.941–724.824 cm^{-1}).

Figure 4.2: The true bias for two representative channels: 696.052 cm^{-1} in panel (a) and 723.029 cm^{-1} in panel (b). The values indicated on the plot are in Kelvin.

Figure 4.3: The correlation between the true bias and each of the predictors for each satellite channel.
(p_2 is dashed; p_3 is solid; p_4 is dash-dotted; and p_5 is dotted.)

Figure 4.4: The 500 hPa temperature analysis rms error at each analysis time. The following color scheme is used: rawinsonde observations alone, cyan; rawinsonde and biased satellite observations using the unmodified pCRTM, gray; rawinsonde and biased satellite observations using the modified pCRTM, black; predictor p_1, green; predictors p_1 and p_3, blue; predictors p_1 and p_4, red; and predictors p_1, p_3, and p_4, purple. Here, both the analysis and the bias variables are inflated by 6%.

Figure 4.5: The spread of the background bias parameters for the run with predictors p_1, p_3, and p_4, shown in panels (a) to (c), respectively. Each line represents a different satellite channel: solid, 680.491 cm^{-1}; dotted, 696.052 cm^{-1}; dash-dotted, 709.566 cm^{-1}; dashed, 723.029 cm^{-1}.

Figure 4.6: As for Figure 4.4, but with the analysis variables inflated by 6% and the bias parameters inflated by 12%.

Figure 4.7: As for Figure 4.5, but with the analysis variables inflated by 6% and the bias parameters inflated by 12%.

Figure 4.8: The global, February average of the temperature analysis rms error at each model level. The color scheme is the same as in Figure 4.4. Here, the analysis variables are inflated by 6% while the bias parameters are inflated by 12%.

Figure 4.9: The global, February average of the observation rms error for each satellite channel. The observation rms error obtained when using no predictors is solid; for p_1, dashed; for p_1 and p_3, thickly dotted with open circles; for p_1 and p_4, dash-dotted; and for p_1, p_3, and p_4, lightly dotted.

Figure 4.10: The results of using p_1 to correct only the channels above 701.618 cm^{-1} (dotted) compared to using p_1 for all channels (dashed). Panel (a) shows the RMS error of the observations, with the true bias plotted (solid) for reference. Panel (b) shows the temperature analysis rms error, with the results obtained using the true modified observation operator plotted (solid).

4.7 Tables

Table 4.1: The global, February average of the temperature analysis RMS error (K) at each model level for selected combinations of predictors. Here, 6% variance inflation is applied to the analysis variables and 6% variance inflation is applied to the bias parameters.

                                      Model level
    Experiment                    1     2     3     4     5     6     7
    modified pCRTM (unbiased)   0.25  0.25  0.25  0.31  0.25  0.32  0.19
    p1                          0.40  0.44  0.46  0.53  0.30  0.79  0.79
    p1, p3                      0.33  0.34  0.37  0.42  0.28  0.46  0.25
    p1, p4                      0.27  0.28  0.28  0.35  0.29  0.72  0.58
    p1, p3, p4                  0.29  0.29  0.31  0.38  0.30  0.50  0.28

Table 4.2: As for Table 4.1, but with 6% variance inflation applied to the analysis variables and 12% variance inflation applied to the bias parameters.

                                      Model level
    Experiment                    1     2     3     4     5     6     7
    modified pCRTM (unbiased)   0.25  0.25  0.25  0.31  0.25  0.32  0.19
    p1                          0.38  0.41  0.44  0.53  0.31  0.77  0.75
    p1, p2                      0.37  0.40  0.44  0.51  0.34  0.60  0.33
    p1, p3                      0.33  0.34  0.37  0.42  0.32  0.57  0.28
    p1, p4                      0.27  0.28  0.29  0.36  0.31  0.71  0.49
    p1, p5                      0.29  0.30  0.32  0.38  0.32  0.77  0.60
    p1, p3, p4                  0.26  0.27  0.28  0.36  0.30  0.53  0.25
    p1, p2, p3, p4, p5          0.26  0.27  0.28  0.39  0.33  0.56  0.34
Table 4.3: The global, February average of the observation RMS error (K) for representative channels when applying the bias correction parameters estimated using 6% variance inflation for the analysis variables and 12% for the bias parameters.

                                        Channel (cm^{-1})
    Experiment                  680.491  696.052  709.566  723.029
    unmodified pCRTM (biased)     0.35     0.50     0.96     0.83
    p1                            0.37     0.51     0.32     0.23
    p1, p2                        0.20     0.21     0.16     0.21
    p1, p3                        0.21     0.20     0.12     0.14
    p1, p4                        0.32     0.38     0.20     0.05
    p1, p5                        0.37     0.47     0.27     0.09
    p1, p3, p4                    0.13     0.13     0.09     0.01
    p1, p2, p3, p4, p5            0.19     0.12     0.14     0.02

Bibliography

Anderson, J. 2001. An ensemble adjustment Kalman filter for data assimilation. Monthly Weather Review, 129, 2884–2903.

Anderson, J. 2003. A local least squares framework for ensemble filtering. Monthly Weather Review, 131, 634–642.

Baek, S.-J., Hunt, B., Kalnay, E., Ott, E., and Szunyogh, I. 2006. Local ensemble Kalman filtering in the presence of model bias. Tellus, 58A, 293–306.

Bishop, C., Etherton, B., and Majumdar, S. 2001. Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Monthly Weather Review, 129, 420–436.

Burgers, G., van Leeuwen, P., and Evensen, G. 1998. Analysis scheme in the ensemble Kalman filter. Monthly Weather Review, 126, 1719–1724.

Cameron, J. 2003. The effectiveness of the AIRS bias correction for various air-mass predictor combinations. Technical report, Met Office NWP.

Courtier, P., Thépaut, J.-N., and Hollingsworth, A. 1994. A strategy for operational implementation of 4D-VAR, using an incremental approach. Quarterly Journal of the Royal Meteorological Society, 120, 1367–1387.

Dee, D. 2005. Bias and data assimilation. Quarterly Journal of the Royal Meteorological Society, 131, 3323–3343.

Derber, J. and Wu, W.-S. 1998. The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. Monthly Weather Review, 126, 2287–2299.

Evensen, G. 1994. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. Journal of Geophysical Research, 99, 10143–10162.

Evensen, G. and van Leeuwen, P. 2000. An ensemble Kalman smoother for nonlinear dynamics. Monthly Weather Review, 128, 1852–1867.

Eyre, J. 1992. A bias correction scheme for simulated TOVS brightness temperatures. Technical Memorandum 186, ECMWF, Reading, UK.

Fertig, E., Hunt, B., Ott, E., and Szunyogh, I. 2007a. Assimilating nonlocal observations with a local ensemble Kalman filter. Tellus A (Submitted).

Fertig, E., Harlim, J., and Hunt, B. 2007b. A comparative study of 4D-VAR and a 4D ensemble Kalman filter: Perfect model simulations with Lorenz-96. Tellus, 59A, 96–101.

Friedland, B. 1969. Treatment of bias in recursive filtering. IEEE Transactions on Automatic Control, 14, 359–367.

Gaspari, G. and Cohn, S. 1999. Construction of correlation functions in two and three dimensions. Quarterly Journal of the Royal Meteorological Society, 125, 723–757.

Hamill, T., Whitaker, J., and Snyder, C. 2001. Distance-dependent filtering of background covariance estimates in an ensemble Kalman filter. Monthly Weather Review, 129, 2776–2790.

Han, Y., van Delst, P., Liu, Q., Weng, F., Yan, B., and Derber, J. 2005. User's Guide to the JCSDA Community Radiative Transfer Model (Beta Version). Joint Center for Satellite Data Assimilation, Camp Springs, Maryland.

Harlim, J. and Hunt, B. 2007. A non-Gaussian ensemble filter for assimilating infrequent noisy observations. Tellus, 59A, 225–237.

Harris, B. and Kelly, G. 2001.
A satellite radiance-bias correction scheme for data assimilation. Quarterly Journal of the Royal Meteorological Society, 127, 1453–1468.

Harris, B., Cameron, J., Collard, A., and Saunders, R. 2004. Effect of air-mass predictor choice on the AIRS bias correction at the Met Office. Technical report, Thirteenth International TOVS Study Conference.

Houtekamer, P. and Mitchell, H. 1998. Data assimilation using an ensemble Kalman filter technique. Monthly Weather Review, 126, 796–811.

Houtekamer, P. and Mitchell, H. 2001. A sequential ensemble Kalman filter for atmospheric data assimilation. Monthly Weather Review, 129, 123–137.

Houtekamer, P. and Mitchell, H. 2005. Ensemble Kalman filtering. Quarterly Journal of the Royal Meteorological Society, 131, 3269–3289.

Houtekamer, P., Mitchell, H., Pellerin, G., Buehner, M., Charron, M., et al. 2005. Atmospheric data assimilation with the ensemble Kalman filter: Results with real observations. Monthly Weather Review, 133, 604–620.

Hunt, B., Kalnay, E., Kostelich, E., Ott, E., Patil, D., Sauer, T., Szunyogh, I., Yorke, J., and Zimin, A. 2004. Four-dimensional ensemble Kalman filtering. Tellus, 56A, 273–277.

Hunt, B., Kostelich, E., and Szunyogh, I. 2007. Efficient data assimilation for spatiotemporal chaos: a local ensemble transform Kalman filter. Physica D (In Press).

Joiner, J. and da Silva, A. 1998. Efficient methods to assimilate remotely sensed data based on information content. Quarterly Journal of the Royal Meteorological Society, 124(549), 1669–1694.

Kalman, R. 1960. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82D, 35–45.

Kalman, R. and Bucy, R. 1961. New results in linear filtering and prediction theory. Journal of Basic Engineering, 83D, 95–107.

Kalnay, E. 2002. Atmospheric Modeling, Data Assimilation, and Predictability. Cambridge University Press.

Kalnay, E., Hunt, B., Ott, E., and Szunyogh, I. 2006. Ensemble forecasting and data assimilation: two problems with the same solution? In T. Palmer and R. Hagedorn, editors, Predictability of Weather and Climate. Cambridge University Press.

Keppenne, C. 1999. Data assimilation into a primitive-equation model with a parallel ensemble Kalman filter. Monthly Weather Review, 128, 1971–1981.

Keppenne, C., Rienecker, M., Kurowski, N., and Adamec, D. 2005. Ensemble Kalman filter assimilation of temperature and altimeter data with bias correction and application to seasonal prediction. Nonlinear Processes in Geophysics, 12, 491–503.

Kuhl, D., Szunyogh, I., Kostelich, E. J., Patil, D. J., Gyarmati, G., Oczkowski, M., Hunt, B. R., Kalnay, E., Ott, E., and Yorke, J. A. 2007. Assessing predictability with a local ensemble Kalman filter. Journal of the Atmospheric Sciences (In Press).

Lawson, L., Spitz, Y., Hofmann, E., and Long, R. 1995. A data assimilation technique applied to a predator-prey model. Bulletin of Mathematical Biology, 57(4), 593–617.

Le Dimet, F.-X. and Talagrand, O. 1986. Variational algorithms for analysis and assimilation of meteorological observations: Theoretical aspects. Tellus, 38A, 97–110.

Liou, K. 2002. An Introduction to Atmospheric Radiation. Academic Press, New York, second edition.

Liu, J., Fertig, E., Kalnay, E., Hunt, B., Kostelich, E., Szunyogh, I., and Todling, R. 2007.

Lorenc, A. 2003. The potential of the ensemble Kalman filter for NWP: a comparison with 4D-Var. Quarterly Journal of the Royal Meteorological Society, 129, 3183–3203.

Lorenz, E. 1996. Predictability: a problem partly solved.
In Proceedings of the Seminar on Predictability, held at ECMWF on 4–8 September 1995.

Miyoshi, T. 2005. Ensemble Kalman filter experiments with a primitive-equation global model. Ph.D. thesis, University of Maryland.

Molteni, F. 2003. Atmospheric simulations using a GCM with simplified physics parameterizations. I: Model climatology and variability in multi-decadal experiments. Climate Dynamics, 20, 175–191.

Oczkowski, M., Szunyogh, I., and Patil, D. J. 2005. Mechanisms for the development of locally low-dimensional atmospheric dynamics. Journal of the Atmospheric Sciences, 62, 1135–1156.

Ott, E., Hunt, B., Szunyogh, I., Zimin, A., Kostelich, E., Corazza, M., Kalnay, E., and Yorke, J. 2004. A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

Parrish, D. and Cohn, S. 1985. A Kalman filter for a two-dimensional shallow-water model: Formulation and preconditioning experiments. Office Note 304, National Meteorological Center, Washington, DC.

Parrish, D. and Derber, J. 1992. The National Meteorological Center's spectral statistical-interpolation analysis system. Monthly Weather Review, 120(8), 1747–1763.

Patil, D., Hunt, B. R., Kalnay, E., Yorke, J. A., and Ott, E. 2001. Local low dimensionality of atmospheric dynamics. Physical Review Letters, 86(26), 5878–5881.

Press, W., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. 1992. Numerical Recipes in Fortran. Cambridge University Press.

Rabier, F., Thépaut, J.-N., and Courtier, P. 1998. Extended assimilation and forecast experiments with a four-dimensional variational assimilation system. Quarterly Journal of the Royal Meteorological Society, 124, 1861–1887.

Rabier, F., Järvinen, H., Mahfouf, J.-F., and Simmons, A. 2000. The ECMWF operational implementation of four-dimensional variational assimilation: Experimental results with simplified physics. Quarterly Journal of the Royal Meteorological Society, 126, 1148–1170.

Rodgers, C. 2000. Inverse Methods for Atmospheric Sounding: Theory and Practice. World Scientific Publishing.

Szunyogh, I., Kostelich, E., Gyarmati, G., Patil, D., Hunt, B., Kalnay, E., Ott, E., and Yorke, J. 2005. Assessing a local ensemble Kalman filter: Perfect model experiments with the National Centers for Environmental Prediction global model. Tellus, 57A, 528–545.

Szunyogh, I., Kostelich, E., Gyarmati, G., Kalnay, E., Hunt, B., Ott, E., and Yorke, J. A. 2007. A local ensemble transform Kalman filter data assimilation system for the NCEP global model. Tellus (Submitted).

Thépaut, J.-N., Hoffman, R., and Courtier, P. 1993. Interactions of dynamics and observations in a four-dimensional variational assimilation. Monthly Weather Review, 121, 3393–3414.

Tippett, M., Anderson, J., Bishop, C., Hamill, T., and Whitaker, J. 2003. Ensemble square-root filters. Monthly Weather Review, 131, 1485–1490.

Wang, X., Bishop, C., and Julier, S. 2004. Which is better, an ensemble of positive-negative pairs or a centered spherical simplex ensemble? Monthly Weather Review, 132(7), 1590–1605.

Wang, X., Hamill, T., Whitaker, J., and Bishop, C. 2006. A comparison of hybrid ensemble transform Kalman filter-OI and ensemble square-root filter analysis schemes. Monthly Weather Review.

Watts, P. and McNally, A. 2004. Identification and correction of radiative transfer modelling errors for atmospheric sounders: AIRS and AMSU-A. In ECMWF Workshop on Assimilation of High Spectral Resolution Sounders in NWP.

Whitaker, J. and Hamill, T. 2002. Ensemble data assimilation without perturbed observations.
Monthly Weather Review, 130, 1913–1924.

Whitaker, J., Compo, G., Wei, X., and Hamill, T. 2004. Reanalysis without radiosondes using ensemble data assimilation. Monthly Weather Review, 132, 1190–1200.

Whitaker, J., Hamill, T., and Wei, X. 2007. Ensemble data assimilation with the NCEP Global Forecast System. Monthly Weather Review (Submitted).

Zupanski, D. and Zupanski, M. 2006. Model error estimation employing an ensemble data assimilation approach. Monthly Weather Review, 134, 1337–1354.

Zupanski, M. 2005. Maximum likelihood ensemble filter: Theoretical aspects. Monthly Weather Review, 133, 1710–1726.