ABSTRACT

Title of dissertation: IMPROVING U.S. EXTREME PRECIPITATION PREDICTION AND PROCESS UNDERSTANDING USING A MESOSCALE CLIMATE MODEL MULTI-PHYSICS ENSEMBLE APPROACH

Chao Sun, Doctor of Philosophy, 2019

Dissertation directed by: Professor Xin-Zhong Liang, Department of Atmospheric and Oceanic Science

Despite many recent improvements, climate models continue to poorly simulate extreme precipitation. I attempted to improve prediction of extreme precipitation, focusing on daily 95th percentile (P95) events, and to better understand the source of model biases in three ways: 1) determine which physics processes P95 is most sensitive to and which parameterization schemes best represent these processes; 2) understand the underlying mechanisms through which these processes impact P95; and 3) maximize advantages from the ensemble of the best performing models.

First, to determine the sensitive processes affecting P95, I tested a 25-member ensemble of different physics configurations in the regional Climate-Weather Research and Forecasting model (CWRF) for 36-yr historical U.S. simulations. Of these, P95 simulation was most sensitive to cumulus parameterization. Overall, the ensemble cumulus parameterization best represented P95 seasonal mean spatial patterns and interannual variations, while one traditional cumulus scheme generally overestimated P95 and the other three severely underestimated P95, especially over the Gulf States (GS) and the Central-Midwest States (CM) in convection-dominated seasons.

Second, I built structural equation models (SEMs) to identify the underlying processes through which cumulus parameterization affects precipitation. I discovered five distinct physical mechanisms, each involving unique interplays among water and energy supplies and surface and cloud forcings. The relative importance of these factors varied significantly by season and region.
For example, water supply is the dominant factor for P95 in CM, but its effect reversed from positive in summer to negative in winter due to changes in the prevailing precipitation system. In contrast, the predominant factors affecting P95 in GS were cloud forcing in summer, but surface forcing in winter. Since the choice of cumulus parameterization affected how water and energy supplies acted through surface and cloud forcings, it determined CWRF's ability to simulate extreme precipitation.

Third, I improved P95 prediction by developing an optimized multi-model ensemble based on the Bayesian Model Averaging (BMA) approach. BMA is a model-selection method that weights ensemble members to create an optimal composite. However, many BMA methods rely on maximum likelihood estimation and thus may be flawed when the true solution is not among the ensemble, as is the case in extreme precipitation. To resolve this issue, I adapted three BMA variations to fit the needs of extreme precipitation problems. These methods significantly improved performance compared to both the ensemble mean and the single best model, and provided a more reliable confidence interval.

My work shows that to improve extreme precipitation simulation, a better understanding of physics processes, especially cumulus processes, is critical. For this, I applied the SEM framework, for the first time in the climate community, to uncover the underlying physical mechanisms essential to regional extreme precipitation predictions. Furthermore, I adapted new BMA methods into extreme precipitation ensembles to maximize the benefits from the most physically advanced models. These advances may help improve the prediction of extreme precipitation occurrences and future changes, one of the most difficult modeling challenges and one with huge socioeconomic significance.

IMPROVING U.S.
EXTREME PRECIPITATION PREDICTION AND PROCESS UNDERSTANDING USING A MESOSCALE CLIMATE MODEL MULTI-PHYSICS ENSEMBLE APPROACH

by Chao Sun

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy
2019

Advisory Committee:
Professor Xin-Zhong Liang, Chair/Advisor
Professor Da-Lin Zhang
Professor Michael N. Evans
Professor Russell R. Dickerson
Professor Raghu Murtugudde
Professor Wei-Kuo Tao

© Copyright by Chao Sun 2019

Acknowledgments

First and foremost, I would like to thank my beloved advisor, Professor Xin-Zhong Liang, for giving me an invaluable opportunity to work on challenging and exciting projects over the past years. He has always made himself available for help and advice. He even helped with my project at 3 a.m. and was ready to pick up my call whenever he was awake. Once I heard his wife urge him several times to eat, but he still worked with me on my project. He is a terrible workaholic; sometimes I really wonder whether he has time to sleep. His attitude towards research is ultra-strict, which showed me the right way to become a scholar. He edited and commented on almost every word in my papers, and provided enormously detailed suggestions on how to write them. More importantly, whenever I was stuck in my project, he could always come up with a genius solution. Furthermore, he also took great care of my life, supporting me not only with a scholarship but also with his wisdom and experience. Because of the protection of his big umbrella, I could focus on my studies peacefully for the past years. Usually, he hosted several parties a year at his home. He and his family members prepared the most delicious food for us, which is really unforgettable. Sometimes, he even brought fresh vegetables from his garden and let us enjoy them together. It has been a great pleasure to work with him and learn from such an extraordinary scientist.
I sincerely thank Jennifer Kennedy for careful editing. She is the most thoughtful and patient person I have ever met. She spent tremendous time and effort in helping my writing and provided me many practical suggestions.

I owe my deepest thanks to my family. Words cannot express the gratitude I owe them.

The CWRF simulations and analyses were conducted on supercomputers, including the University of Illinois' Blue Waters, the Maryland Advanced Research Computing Center's Bluecrab, the Computational and Information Systems Lab of the National Center for Atmospheric Research, and the National Energy Research Scientific Computing Center of the U.S. Department of Energy. I thank Kenneth Kunkel for providing the COOP station data. The radiation data were obtained from the SRB/GEWEX product, courtesy of the Langley Research Center Atmospheric Science Data Center, USGS Earth Resources Observation and Science Center. The research was supported by the U.S. National Science Foundation Innovations at the Nexus of Food, Energy and Water Systems under Grant EAR-1639327, the U.S. Department of Agriculture UV-B Monitoring and Research Program at Colorado State University under the National Institute of Food and Agriculture Grant 2015-34263-24070, and the U.S. Environmental Protection Agency Science to Achieve Results under Assistance Agreement No. RD83587601. The views expressed in this document are solely those of the author and do not necessarily reflect those of the funding agencies.

Table of Contents

Acknowledgements ii
Table of Contents iv
List of Tables vi
List of Figures vii
List of Abbreviations xi

1 Introduction 1
1.1 Background and Motivation 1
1.2 Improving extreme precipitation simulation through a multiphysics approach 3
1.3 Improving extreme precipitation simulation by better physical understanding
5
1.4 Improving extreme precipitation simulation by optimal weightings 7
1.5 Organization 9

2 Sensitivity to physics parameterizations 10
2.1 Introduction 10
2.2 Model description, experiment design, observations, and extreme indices 14
2.3 General performance of seasonal mean and extreme precipitation 18
2.4 Physics sensitivity of regional extreme precipitation simulation 23
2.5 Summary and discussion 39

3 Dependence on cumulus parameterization and underlying mechanism 46
3.1 Introduction 46
3.2 Model description, experiment design, causal ingredients, and observations 50
3.3 Extreme precipitation dependence on cumulus parameterization 56
3.4 Correlation analysis of regional extreme precipitation biases 62
3.5 Process understanding of P95 biases by structural equation modeling 74
3.6 Summary and conclusions 88

4 Improvement by Markov Chain Monte Carlo based Bayesian model average 93
4.1 Introduction 93
4.2 Model description, observations 96
4.3 Definitions of extreme precipitation, OptiRankDSCV, MODE tool 99
4.4 Performance analysis of individual ensemble members 104
4.5 Bayesian model average methods 118
4.6 Ensemble performance analysis 120
4.7 Summary and conclusions
132

5 Future work: projections of future extreme precipitation changes and impacts 137

Bibliography 140

List of Tables

2.1 Physical processes, parameterizations, and their references. 13
2.2 Ensemble experiment physics configurations. 15
3.1 Cumulus scheme closures, triggers, entrainments, and their references. 51
3.2 Observations, their available years, and references. 55
4.1 CPN experiment model configurations, parameterizations, and their references. 98
4.2 Extreme indicators, their definitions and units. 100

List of Figures

2.1 Geographic distributions of 1980-2015 mean seasonal precipitation amount [mm day⁻¹] observed (OBS), assimilated (ERI), and simulated by the CWRF control ECP for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). 18
2.2 Same as Fig. 2.1 except for the number of rainy days (NRD). 20
2.3 Same as Fig. 2.1 except for the daily 95th percentile precipitation (P95) [mm day⁻¹]. 21
2.4 Boundary specification of the two key regions where ERI severely underestimated extreme precipitation: the Gulf States (GS) in spring and the Central to Midwest States (CM) in summer. 24
2.5 Comparison among ERI and all CWRF physics configurations in simulating 1980-2015 mean seasonal P95 [mm day⁻¹] biases (from observations) averaged over the GS (left) and CM (right) for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). They are separated by color into type I (blue) and type II (red) members depending on their cumulus schemes.
26
2.6 Geographic distributions of 1980-2015 mean daily 95th percentile precipitation (P95) biases from observations [mm day⁻¹] for winter (DJF), spring (MAM), summer (JJA), and autumn (SON) as assimilated (ERI) and simulated by five CWRF members varying only the cumulus scheme (ECP, NKF, TDK, NSAS, BMJ). 29
2.7 Same as Fig. 2.6 except for the number of rainy days (NRD) biases. 31
2.8 Same as Fig. 2.6 except for the daily rainfall intensity (DRI) biases [mm day⁻¹]. 32
2.9 Taylor diagram of pattern statistics comparing the overall performance among ERI and all CWRF physics configurations in simulating 1980-2015 mean seasonal P95 geographic distributions over the GS (left) and CM (right) regions for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). Shown are the pattern correlation (azimuthal) and normalized standard deviation (radius) compared with observations. The black dot (OBS) marks the perfect score with a unit correlation and deviation. Off the chart are outliers performing poorly, with correlations and deviations indicated in the parentheses. 34
2.10 The equitable threat score (ETS) that measures the overall skill dependence on rainfall intensity for ERI and all CWRF physics configurations in simulating 1980-2015 mean seasonal P95 geographic distributions over the GS (left) and CM (right) regions for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). The x-axis depicts the P95 thresholds at a 1.0 [mm day⁻¹] bin interval, while the y-axis scores the ETS values. 36
2.11 Summer mean vertical potential temperature (θ, star) and water vapor (qv, curve) tendency profiles among the five cumulus schemes (color) for 2004 as averaged over all the grids having rainfall greater than 50 [mm day⁻¹] within the CM (left) and GS (right) regions.
Also labeled at the altitude of the profile peak is the number that depicts the tendency's vertical integral for the respective scheme coded with the same color. 42
3.1 The teleconnection patterns of the CM and GS regional mean precipitation interannual variations correlated with 850 hPa meridional wind distributions for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). Outlined separately for CM and GS is the core correlation area common to all seasons, where the V850 index is defined. 53
3.2 Seasonal P95 interannual variations during 1980-2015 averaged over the CM (left) and GS (right) regions for spring (MAM), summer (JJA), autumn (SON), and winter (DJF) [from top down] as observed (OBS), assimilated by NR2 and ERI, and simulated by CWRF using five cumulus schemes (ECP, NKF, TDK, NSAS, BMJ). Also shown (bottom) are the corresponding temporal correlations (COR, scaled upward on the left) and root mean square errors (RMSE, scaled downward on the right) with respect to observations during the whole period. 57
3.3 Seasonal mean RCT distributions averaged in the CM (left) and GS (right) regions according to total precipitation intensity bins at an interval of 5 [mm day⁻¹] for spring (MAM), summer (JJA), autumn (SON), and winter (DJF) [from top down] as assimilated by NR2 and ERI, and simulated by five CWRF cumulus schemes (ECP, NKF, TDK, NSAS, BMJ). Marked with circles are the corresponding regional average 1980-2015 mean seasonal P95 values, while the respective observed values are depicted by the vertical lines.
61
3.4 Composite P95 bias (blue) and departure (red) correlations with those fields that had observational data (DRI, NRD, SWD, RET, OLR, CRE, CWP, T2m) as well as P95 departure correlations with rainfall components (PL, PC) in the CM (left) and GS (right) regions for spring (MAM), summer (JJA), autumn (SON), and winter (DJF) [from top down]. A star mark indicates the correlation is statistically significant. They are labeled with a number equal to the correlation coefficient times 100 if statistically significant. 65
3.5 Spring composite departure correlations across P95 and all its ingredient fields in the CM (upper triangle) and GS (lower triangle). They are each coded with a color and, if statistically significant, also labeled with a number equal to the correlation coefficient times 100. The diagonal represents the P95 correlation with each field named. 66
3.6 Same as Fig. 3.5 except for summer. 67
3.7 Same as Fig. 3.5 except for autumn. 68
3.8 Same as Fig. 3.5 except for winter. 69
3.9 The conceptual design of the experimental SEM for extreme precipitation (EP). The center oval represents the predictand (EP), while each outer oval defines one latent variable (LV) with a list in a brace of the designated manifest variables (MV) as the predictor candidates. There are four LVs, hypothetically representing energy supply (ES), water supply (WS), surface forcing (SF), and cloud forcing (CF). Each effect from one LV to another or to EP is expressed by an arrow for its direction and a coefficient along the line for its strength. The SEM consists of regression equations from these MVs through LVs to EP, as depicted at the lower right corner. 76
3.10 Spring finalist SEMs for CM (upper) and GS (lower).
Each SEM panel includes its structure (left) with the active manifest and latent variables and their directional effect coefficients, its performance scores (upper right corner), and the relative importance of each latent variable's direct, indirect, and total effects (bottom). 80
3.11 Same as Fig. 3.10 except for summer. 81
3.12 Same as Fig. 3.10 except for autumn. 82
3.13 Same as Fig. 3.10 except for winter. 83
4.1 Six subregions in the continental United States. 101
4.2 Winter overall performance of 1989-2009 mean extreme indicators measured by multi-score in all six subregions. Color represents scores scaled by their range in each row. Horizontally, the locations of models represent the optimal rank aggregated from all 144 indicators, with the leftmost being the best performing. 105
4.3 Same as Fig. 4.2 except for spring. 107
4.4 Same as Fig. 4.2 except for summer. 109
4.5 Same as Fig. 4.2 except for autumn. 111
4.6 Geographic distributions of 1980-2009 mean winter P95 amount [mm day⁻¹] (color) and results of MODE (grey) for observed (OBS), assimilated (ERI), and simulated by all CPN members. A grey area represents the "interest" area identified by the MODE tool, with the total interest score shown in the lower-left corner. The black lines represent the convex hull identified by MODE, which was used to calculate shape features. 113
4.7 Same as Fig. 4.6 except for spring. 114
4.8 Same as Fig. 4.6 except for summer. 115
4.9 Same as Fig. 4.6 except for autumn.
117
4.10 Geographic distributions of 2004-2009 (cross-validation) mean seasonal P95 amount [mm day⁻¹] observed (OBS), assimilated (ERI), simulated by CWRF, composite ensemble mean (EnsMean), post-processed by the bootstrapping Akaike method (B-Akaike), post-processed by the Akaike method (Akaike), and post-processed by the stacking method (stacking) for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). 121
4.11 Geographic distributions of 2004-2009 (cross-validation) winter P95 confidence interval (CI) amount [mm day⁻¹], estimated by bootstrapping observational data (OBS), by bootstrapping ensembles (boot), by applying the bootstrapping Akaike method to ensembles (B-Akaike), by applying the Akaike method to ensembles (Akaike), and by applying the stacking method to ensembles (stacking). Left lower: 25% CI; left upper: 75% CI. 125
4.12 Same as Fig. 4.11 except for spring. 127
4.13 Same as Fig. 4.11 except for summer. 128
4.14 Same as Fig. 4.11 except for autumn. 130
4.15 Geographic distributions of 2004-2009 (cross-validation) IGN of BMA methods minus IGN of raw composite ensembles: bootstrapping Akaike method (B-Akaike), Akaike method (Akaike), and stacking method (stacking). 131
5.1 Geographic distributions of projected changes in extreme precipitation for different return levels.
138
5.2 Geographic distributions of projected changes in the number of rainy days, with confidence interval estimated by the bootstrapping method. 138

List of Abbreviations

ACM Asymmetric Convective Model
AIC Akaike Information Criteria
AMS American Meteorological Society
AOGCM Atmosphere-Ocean General Circulation Models
AVHRR Advanced Very High-Resolution Radiometer
BMA Bayesian Model Averaging
BMJ Betts-Miller-Janjic cumulus scheme
CAM Community Atmosphere Model
CAPE Convective Available Potential Energy
CCCMA Canadian Centre for Climate Modelling and Analysis
CDD Consecutive Dry Days
CDF Cumulative Distribution Function
CF Cloud Forcing
CFI Comparative Fit Index
CI Confidence Interval
CIN Convective Inhibition
CLASS Canadian Land Surface Scheme
CM Central to Midwest
CONUS Contiguous United States
CORDEX Coordinated Regional Climate Downscaling Experiment
CPN CWRF Plus NA-CORDEX
CRCM Canadian Regional Climate Model
CRE Cloud Radiative Effect
CSSP Conjunctive Surface Subsurface Process Model
CWP Cloud Water Path
CWRF Climate Weather Research and Forecasting Model
DGM Data Generating Model
DREAM DiffeRential Evolution Adaptive Metropolis
DRI Daily Rain Intensity
DSCV Distance (RMSE), Similarity (correlation), Consistency (LEPS), and Variability (MVI)
ECP Ensemble Cumulus Parameterization
EM Expectation-Maximization
ERI ERA-Interim
ES Energy Supply
ET Evaporation Transpiration
ETS Equitable Threat Score
FCH Fraction of Cloud (high)
FCL Fraction of Cloud (low)
FLG Fu-Liou-Gu radiation transfer scheme
GEWEX Global Energy and Water Exchanges
GS Gulf States
GSFC NASA Goddard Space Flight Center
HIRMAM High Resolution Atmospheric Model
HPD Highest Posterior Density
IGN Ignorance skill score
LCL Lifting Condensation Level
LEPS Linear Error in Probability Space
LFC Level of Free Convection
LOO Leave-One-Out
LV Latent Variables
MC Moisture Convergence
MCMC Markov Chain Monte Carlo
MLE Maximum Likelihood Estimation
MODE Method for Object-Based Diagnostic Evaluation
MOR Morrison
et al. two-moment microphysics scheme
MV Manifest Variables
MVI Model Variability Index
MYNN Mellor-Yamada PBL scheme modified by Nakanishi-Niino
NARR North American Regional Reanalysis
NASA National Aeronautics and Space Administration
NCA National Climate Assessment
NCAR National Center for Atmospheric Research
NCEP National Centers for Environmental Prediction
NKF New Kain-Fritsch cumulus parameterization
NOAH NCAR-NCEP unified land surface model
NRD/RD Number of Rainy Days
NSAS New Simplified Arakawa-Schubert scheme
NSE Net Surface Energy
NUTS No U-turn Sampler algorithm
OBS Observation
OLR Outgoing Longwave Radiation
P95 95th percentile precipitation
PBLH Planetary Boundary Layer Height
PC Convective precipitation
PL Explicit precipitation
RB Relative Biases
RCA Rossby Centre regional atmospheric climate model
RCD Regional Climate Downscaling
RCT Convective to total precipitation ratio
RET reflecting more solar radiation
RI Relative Importance
RIV Ratios of Interannual Variability
RMSE/rmse Root Mean Square Error
RRTMG Rapid Radiative Transfer Model for GCM Applications
SEM Structural Equation Modeling
SF Surface Forcing
SH Sensible Heat
SRB Surface Radiation Budget
SWD Short Wave Downwelling
TAO Tao et al. microphysics scheme
TDK Tiedtke cumulus scheme
TPW Total Precipitable Water
UW University of Washington
WS Water Supply

Chapter 1: Introduction

1.1 Background and Motivation

Extreme precipitation has profoundly harmed U.S. social and economic life. From 1980 to 2019, $1.3 trillion has been lost due to extreme precipitation-related events (?), which have imposed a substantial burden on our society. ? showed that compared to other weather disasters, extreme precipitation dominated economic losses (accounting for more than 70%). Extreme precipitation events affect every aspect of life. They pose a direct threat in urban areas with flood damage (e.g., the $22 million loss in the July 2016 Ellicott City flood, the $1 billion loss in the 2010 Tennessee flood) (?).
They increase the risks of landslides and mudslides (?). ? demonstrated that excessive rain can also damage buildings, increasing infrastructure risks. ? showed that extreme precipitation disrupts transportation and increases the risk of motor vehicle collisions. Extreme precipitation events can also lead to health risks through waterborne diseases (?). Not only that, but from 1980 to 2016, crop production lost $10 billion due to excessive rains (?). Overall, extreme precipitation exerts significant pressure on the government and insurance system, decreasing the stability of society.

Records show that losses from many types of extreme precipitation are rising consistently (?). For example, over three decades (1940-1990), losses from hurricanes increased eightfold, from $5 billion to $40 billion (?). During the same period, damage from floods rose from $1 billion to $6 billion (?). Meanwhile, hailstorms occurred more frequently, with losses of more than $300 million (?).

Future projections show that the frequency and intensity of extreme precipitation events are increasing across the United States, which will lead to higher risks of property loss and more significant impacts on citizens' daily lives (????). Furthermore, within the warming climate, threats from extreme precipitation to food security are becoming more and more severe (???).

The goal of this study is to improve the prediction and process understanding of extreme precipitation in the United States. To do this, I adopted a mesoscale climate model multi-physics ensemble approach. To improve prediction, I first identified major problems in current extreme precipitation simulations, and the key regions in which those problems occur. This analysis also provided a performance baseline for the following studies.
Focusing on the most problematic key regions in extreme simulation, I performed multi-physics sensitivity experiments and found that mesoscale climate modeling of extreme precipitation is highly sensitive to the choice of cumulus scheme. Second, I focused on explaining the underlying physical mechanisms of why some Climate-Weather Research and Forecasting model (CWRF) schemes failed to capture extreme precipitation. I built a causality model to find the best-fitting configuration from all possible configurations of structural equation models. I used this analysis model to explore the causes of problematic extreme precipitation simulation from a systematic point of view, including identifying the interactions between multiple crucial processes. Both the multi-physics sensitivity experiments and the causal analysis showed that there are substantial differences in simulation performance and physics representation of extreme precipitation simulations among schemes. Therefore, weighting all schemes equally, regardless of performance, does not produce the optimal ensemble outcome. To further boost the performance of the ensemble system, Chapter Four is dedicated to applying the Bayesian model averaging (BMA) method to derive optimal weights for each member. Using the selected physics configurations (particularly the ECP scheme) with the BMA method can improve future projections of extreme precipitation. This improved understanding of the underlying mechanisms of extreme precipitation may help guide future model development.

1.2 Improving extreme precipitation simulation through a multiphysics approach

Many studies have demonstrated the difficulties of extreme precipitation simulation, and the fact that the majority of numerical models still underperform and have sizable uncertainty (???). Meanwhile, which processes extreme precipitation is most sensitive to is still a subject of debate. ?
concluded that microphysics is the most critical process, specifically the ice phase process. However, even though ice phase processes are non-negligible, given the complexity of numerical models (and especially convection-cloud-radiation interactions) (?), they are not sufficient to resolve the underestimation in other situations. ? demonstrated the importance of cumulus schemes in extreme precipitation simulations, while other researchers have shown the impact of resolution (????).

Consequently, Chapter Two aims to identify the processes to which extreme precipitation is most sensitive and to provide a reference point for the current status of extreme precipitation simulation, specifically for U.S. mesoscale simulations. To achieve this goal, I conducted experiments using 25 different physics configurations covering eight major processes, and thoroughly analyzed their performance. The analysis identified not only problems in extreme simulation in current climate models, but also the processes to which extreme precipitation simulation was most sensitive. The results showed that for mesoscale extreme precipitation simulations on a 30-km grid, extreme precipitation was most sensitive to cumulus parameterization schemes. Overall, CWRF with ensemble cumulus parameterization (ECP) significantly outperformed the other models and even outperformed the driving reanalysis. Given the significant improvements in the best-performing configuration, this study provides a useful optimized configuration of physics schemes for future extreme precipitation projections. I also found that the most sensitive process in mesoscale extreme precipitation simulation was the cumulus process.
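As a concrete illustration of the P95 indicator evaluated throughout these experiments, the sketch below computes a daily 95th-percentile precipitation value from a synthetic series. The gamma-distributed data and the 1 mm day⁻¹ wet-day threshold are illustrative assumptions only, not necessarily the dissertation's exact conventions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 36-yr daily precipitation series [mm/day]; a stand-in for
# gridded CWRF output or station observations (illustrative only).
daily_precip = rng.gamma(shape=0.4, scale=8.0, size=36 * 365)

def p95(precip, wet_threshold=1.0):
    """Daily 95th-percentile precipitation, here taken over rainy days
    (>= wet_threshold mm/day) -- one common convention for this index."""
    wet = precip[precip >= wet_threshold]
    return float(np.percentile(wet, 95))

print(f"P95 = {p95(daily_precip):.1f} mm/day")
```

In practice the index is computed per grid cell and per season before comparing simulated and observed fields.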
This analysis is highly valuable from a practical perspective: given the complexity of the climate modeling system, efficient model development requires identifying and focusing on key processes.

1.3 Improving extreme precipitation simulation by better physical understanding

While capturing the observed characteristics of extreme precipitation is ambitious, the second goal, understanding the underlying physical mechanisms explaining model failure or success, is even more challenging. The previous analysis highlighted the importance of cumulus schemes. Given the importance of cumulus activity to precipitation, multiple potential hypotheses on why some cumulus schemes are insufficient to produce intense extreme precipitation have been put forward. ? emphasized the importance of convective to total precipitation ratios. In our experiments, however, we did not find a strong correlation between the convective ratio and extreme precipitation. Some studies have emphasized the importance of better simulating the intensity of cumulus activity: for example, ? proposed the use of an explicit convection solution, and ? suggested developing better convective closures and triggers. However, from the vertical tendency profiles of the cumulus schemes, I found that schemes that simulated strong cumulus activity did not always produce sufficient P95 intensity. Furthermore, my experiments found no clear connection between triggers and performance. Meanwhile, ? highlighted that convective available potential energy (CAPE) is crucial in extreme precipitation simulations, while ? found that the highly uncertain term "time scale for CAPE consumption rate" plays a central role in extreme precipitation simulations. These studies indicate that the connection between cumulus activity and energy supply is important.
Therefore, to uncover why some schemes underestimate extreme precipitation, the analysis needs to include surface forcing, cloud forcing, energy supply, water supply, and their interactions. To comprehensively analyze the interactions between the different forcings and supplies, I built an analysis model that finds all potential configurations of the structural equation model (SEM) (?). The SEM can detect potential causal relationships within a web of knowledge, while also accounting for collinearity between factors. More importantly, the SEM can quantitatively measure the effects of each component. This ability is especially beneficial given the highly complex and interconnected relationships among processes in the earth system. For this analysis of biases, I included the 22 primary dynamic and thermodynamic fields most relevant to extreme precipitation. Through the SEM, I discovered five distinct physical mechanisms underlying the key biases in regional extreme precipitation, each involving interplays among water and energy supplies and surface and cloud forcings, with varying degrees of relative importance. For extreme precipitation, complex interactions underlay the four major processes (two forcings and two supplies). The dominant processes changed with the transition of prevailing precipitation systems (convective precipitation to stratiform precipitation). The choice of cumulus parameterization affected how water and energy supplies acted through surface and cloud forcings and thus determined CWRF's ability to simulate extreme U.S. precipitation. This analysis confirmed that the improvement of ECP was due to a better representation of the physics mechanisms rather than a result of overfitting parameters. Thus, we can have more confidence in the ECP scheme in future projections. Specifically, when the ECP results disagree with those of other models in future projections, my findings indicate that the ECP outputs may be more trustworthy.
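The core idea of such a path analysis can be illustrated with a toy example. This is a minimal sketch, not the dissertation's actual SEM: it uses synthetic data, only two hypothetical drivers ("water supply" and "cloud forcing"), and ordinary least squares on standardized variables to estimate direct, indirect, and total path effects.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# synthetic standardized drivers (illustrative only)
water = rng.standard_normal(n)                        # water supply
cloud = 0.6 * water + 0.8 * rng.standard_normal(n)    # cloud forcing, partly driven by water
p95   = 0.5 * water + 0.4 * cloud + 0.7 * rng.standard_normal(n)

def std_path(y, X):
    """Standardized regression (path) coefficients via least squares."""
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta

b_cw = std_path(cloud, water[:, None])[0]                 # water -> cloud
b_pw, b_pc = std_path(p95, np.column_stack([water, cloud]))
direct   = b_pw                # water -> P95
indirect = b_cw * b_pc         # water -> cloud -> P95
total    = direct + indirect   # equals the simple standardized slope (the correlation)
```

The decomposition is exact for least squares: the total standardized effect of a driver equals its simple correlation with the outcome, split into direct and mediated paths.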
1.4 Improving extreme precipitation simulation by optimal weightings

Chapters Two and Three focus on the performance of individual schemes. However, given the regime dependence of parameters, no single scheme can universally represent the real climate, let alone extreme precipitation (???). In particular, the success of ECP also validated the benefits of combining ensemble members. Hence, Chapter Four combines ensemble results from different schemes to further boost extreme precipitation skill. Two methods were used to combine outcomes from the different ensemble members. The first method was the arithmetic mean of the ensemble members (i.e., the "composite ensemble"). This composite ensemble has the benefit of computational efficiency and stability and is theoretically straightforward to explain. However, ? demonstrated that equal weights in the ensemble calculation cannot provide optimal ensemble mean outcomes for schemes with substantial differences in performance. Both the previous sensitivity experiments and the causal analyses demonstrated that individual schemes perform differently in extreme precipitation simulations. Chapter Four thus applied non-equal weighting methods. Among these, Bayesian Model Averaging (BMA) is generally considered the gold standard for making out-of-sample predictions (?). ? proposed a widely adopted expectation-maximization (EM) algorithm-based BMA method, which has successfully improved the performance of precipitation simulations (?). Therefore, this study tested the EM-based BMA method in an extreme precipitation ensemble. The EM-based method also served as a baseline against which to compare other methods, including an Akaike information criterion-based BMA method (AIC, ?), which is computationally more efficient. Moreover, this study also compared a bootstrapping version of Akaike weights, which considered the uncertainty associated with the training data.
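To make the weighting schemes concrete, here is a minimal sketch of Akaike weights and their bootstrap variant. The synthetic data, the Gaussian error model for each member's AIC, and the sample sizes are illustrative assumptions, not the dissertation's actual setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def akaike_weights(aic):
    """Akaike weights: w_i proportional to exp(-0.5 * delta_AIC_i)."""
    d = aic - aic.min()
    w = np.exp(-0.5 * d)
    return w / w.sum()

# illustrative training data: three members with increasing error levels
obs = rng.gamma(2.0, 10.0, size=300)                          # synthetic P95 series
members = obs[None, :] + rng.normal(0.0, [[3.0], [5.0], [8.0]], size=(3, 300))

def member_aic(sim, obs, k=1):
    """AIC of a member under a Gaussian error model (k fitted parameters)."""
    sigma2 = (sim - obs).var()
    n = obs.size
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return 2 * k - 2 * loglik

aic = np.array([member_aic(m, obs) for m in members])
w = akaike_weights(aic)

# bootstrap variant: resample the training days to reflect data uncertainty
boot = np.zeros(3)
for _ in range(200):
    idx = rng.integers(0, obs.size, obs.size)
    a = np.array([member_aic(m[idx], obs[idx]) for m in members])
    boot += akaike_weights(a)
boot /= 200
```

With a long training record the plain Akaike weights collapse onto the best member; the bootstrap average spreads weight according to how often each member wins under resampling.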
Besides the above three BMA methods, Chapter Four also tested a stacking method. This final method was tested because BMA methods rest on a strong hypothesis: that the true data-generating model is among the candidate models. However, given the highly nonlinear complexity of extreme precipitation, the true model, the earth system itself, is not among the ensemble members (?). ? proposed the stacking method specifically to resolve this theoretical flaw. ? implemented the EM-based BMA method with linear bias correction. The linear bias correction ignores model uncertainty and leads to "over-confident" prediction, in contrast to probabilistic bias correction, which provides uncertainty information. Such uncertainty information is highly valuable in climate studies and projections. Hence, I also implemented a Markov Chain Monte Carlo (MCMC) based probabilistic bias correction in this study. The MCMC methods also enabled a more flexible model design (?), which allowed the use of extreme value distributions as the prior distribution. Therefore, Chapter Four explores the potential benefits of using different methods to combine members (three variations of BMA, and stacking) as well as different bias correction methods (linear bias correction and the MCMC method) in extreme precipitation ensemble simulations. The results demonstrated that both the BMA and stacking methods improved the general performance of extreme precipitation simulation, not only in spatial pattern distribution but also in interannual variability. This improvement highlights the advantage of using more sophisticated methods for ensemble combination. Meanwhile, there was no significant difference among the different combination or bias correction methods. Consequently, the more computationally efficient EM algorithm with linear bias correction is preferred.
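Stacking can be sketched as a constrained least-squares problem: find non-negative member weights, summing to one, that minimize the squared error of the blended prediction against training observations. The sketch below is an illustration with synthetic data, using a heavily weighted penalty row to enforce the sum-to-one constraint; it is not the specific stacking formulation used in Chapter Four.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(2)
obs = rng.gamma(2.0, 10.0, size=400)              # synthetic "observed" P95 series
# three imperfect ensemble members (illustrative biases and noise levels)
A = np.column_stack([0.7 * obs + rng.normal(0, 2, 400),
                     obs + rng.normal(0, 6, 400),
                     1.2 * obs + rng.normal(0, 4, 400)])

# stacking: non-negative weights minimizing blend error; the extra row
# pushes the weights toward a convex combination (sum of weights = 1)
lam = 1e3
A_aug = np.vstack([A, lam * np.ones((1, A.shape[1]))])
b_aug = np.append(obs, lam)
w, _ = nnls(A_aug, b_aug)
w = w / w.sum()                                   # exact normalization

blend = A @ w
rmse_blend = np.sqrt(np.mean((blend - obs) ** 2))
rmse_best = min(np.sqrt(np.mean((A[:, j] - obs) ** 2)) for j in range(A.shape[1]))
```

On the training sample the blend is at least as good as the best single member (up to the small normalization adjustment); genuine stacking evaluates the weights by cross-validation to target out-of-sample skill.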
In addition to the above benefits, the EM method also provided probabilistic information associated with each field, which can be greatly valuable in future projections and beneficial to decision making.

1.5 Organization

Chapter Two (submitted to Journal of Climate) examines CWRF's improvements in simulating U.S. extreme precipitation and investigates the responsive processes by comparing a large ensemble of long integrations using multiple physics configurations. Chapter Three (submitted to Journal of Climate) explores the physical mechanisms that explain how cumulus parameterization determines CWRF's ability to simulate U.S. extreme precipitation by comparing deep integrations with the five representative cumulus schemes (ECP, NKF, TDK, NSAS, and BMJ). In Chapter Four, to further boost the performance of extreme precipitation simulation, I explored different methods of combining ensemble members to maximize extreme precipitation simulation skill.

Chapter 2: Sensitivity to physics parameterizations

2.1 Introduction

Since 1980, 230 billion-dollar weather disasters have occurred in the United States, causing more than $1.5 trillion in total economic losses (?). Of the damage caused by these weather disasters, more than 70% was due to extreme precipitation (?). Greater risks are anticipated in the future, since the frequency and intensity of extreme precipitation events are increasing over the United States (??????). Despite the profound impact on society, prediction of extreme precipitation remains highly uncertain (???). In addition to observational data scarcity and discrepancy (??), data-to-model spatial scale mismatch (??), and unpredictable natural variability (?), the prediction uncertainty arises from model deficiencies in representing organized convection systems and other physical processes (??????). A major problem is that most climate models tend to underestimate extreme precipitation (????).
The causes of the underestimation are still debated. Some studies attributed the underestimation to a "drizzling problem", whereby models tend to overestimate light rain events (??). Others reported simultaneous overestimation of light rain and underestimation of extreme precipitation (???). However, more light rain could consume only slightly more energy, which would not noticeably weaken extreme precipitation (?). ? argued that precipitation was underestimated because the threshold in the trigger function was set too low, so that convective instability was released too early. On the other hand, ? showed that a modified trigger function suppressed not only light rain but also heavy rain. ? attributed the underestimation to a lack of ice phase processes in cloud microphysics schemes. While ice-phase processes are necessary, given the complexity of numerical models and especially convection-cloud-radiation interactions (?), they are not sufficient to resolve the underestimation. Many others attempted to address the problem by increasing model resolution, but large extreme precipitation biases still existed in global and regional climate simulations at grid spacings of 10–50 km (??????). ? found that even cloud-permitting models with grid spacings of 3–4 km could still underestimate extreme precipitation as much as their low-resolution counterparts. In contrast, ? showed that cloud-permitting simulations could produce more heavy rain than lower resolutions, but over-forecasted (by an order of magnitude) the occurrence of extreme rainfall events. None of these studies have systematically investigated the sensitivity of extreme precipitation simulation to varying physics representations and the underlying mechanisms. For climate prediction, the most significant uncertainty lies in representing physical processes (?).
This is especially the case for extreme precipitation, which by definition is rare and is typically not well tested during model development and evaluation. The choice of physical parameterization schemes could have a greater impact on model performance during more intensive rainfall events (?). The regional Climate-Weather Research and Forecasting model (CWRF) has built in many alternative schemes with consistent coupling for each major physical process, including cumulus, microphysics, cloud, aerosol, radiation, planetary boundary layer, and surface processes (????????????). CWRF is superior to the driving reanalysis and a popular regional climate model in capturing extreme precipitation characteristics over China (?). More importantly, its built-in ensemble of physics configurations offers a unique opportunity to systematically investigate the responsive processes to which extreme precipitation simulation is sensitive. As the first part of a pair, this paper examines CWRF's improvements in simulating U.S. extreme precipitation, and investigates the responsive processes by comparing a large ensemble of long integrations using multiple physics configurations. Section 2 describes CWRF and its selected physics configurations, the observational data used for evaluation, and the experiment design for the sensitivity analysis. Section 3 analyzes CWRF's performance at predicting seasonal mean and extreme precipitation distributions over the entire contiguous United States, relative to that of the driving reanalysis. Section 4 focuses on two key regions in which the reanalysis underestimated extreme precipitation and CWRF offered a significant improvement, and thereby examines the sensitivity of simulation skill to the physics parameterization schemes. Section 5 summarizes the conclusions. The companion paper (?)
will demonstrate how cumulus parameterization dominates extreme precipitation simulation and explore the potential physical contributors to extreme event modeling ability.

Table 2.1: Physical processes, parameterizations, and their references.
Cumulus (CU): ECP (?; ?; ???), NKF (?; ?), TDK (?; ?; ?; ?; ?), NSAS (?), BMJ (?; ??)
Microphysics (MP): TAO (??), THO (??; ?), MOR (???), WD6 (?; ?; ?; ?)
Aerosol (AE): A3D, uses aerosol mass loadings and optical properties, modeled or observed (MISR, MODIS); A2D, uses aerosol optical depth distributions (?)
Cloud (CL): XRL, diagnostic cloud cover based on ? with modifications by ?; CPL, prognostic cloud cover based on ?
Radiation (RA): GSFC (?; ?), CCCMA (?; ?; ?), FLG (??), RRTMG (?)
Boundary layer (BL): CAM, ? with updates to include the gravity wave drag effect and orographic turbulence stress (?); MYNN (??); ACM, ? with updates on the MOL calculation following WRF 3.7.1; UW (?)
Surface (SF), land: CSSP (?????; ?; ?; ?; ?; ?), NOAH (?; ?; ?)
Surface (SF), ocean: SST (?; ?), XOML (??)

2.2 Model description, experiment design, observations, and extreme indices

CWRF has been systematically advanced as a climate extension of the Weather Research and Forecasting model (WRF, Skamarock et al. 2008) since 2002 (?), with several important updates including terrestrial hydrology (?), cloud-aerosol-radiation interaction (??), land surface characteristics (?), and upper oceans (?). Of particular relevance to this study, CWRF incorporates an ensemble cumulus parameterization (ECP) based on ?, which has outstanding performance in precipitation simulation, including extreme events and flooding (?), over oceans (?) and land (?), and in different climate regimes (?). CWRF is a good fit for this study because it incorporates alternative parameterization schemes for each of the surface (land, ocean), planetary boundary layer, cumulus (deep, shallow), microphysics, cloud, aerosol, and radiation processes.
Moreover, these schemes were coupled systematically to maximize consistency between interactive components critical to regional climate simulation (?). This study selected multiple schemes from each of the key CWRF parameterization processes (Table 2.1, ?) to form 25 CWRF physics configurations, as listed in Table 2.2. All CWRF simulations were conducted on a well-tested North American domain including the contiguous United States (??????). Horizontally, the domain was centered at (37.5°N, 95.5°W), containing 138×195 points at a grid spacing of 30 km on a Lambert conformal map projection. There were 36 vertical terrain-following sigma levels, denser near the surface, and the model top was at 50 hPa.

Table 2.2: Ensemble experiment physics configurations. Each experiment differed from the control (A, the ECP configuration) in the scheme(s) indicated by its name: B NKF, C TDK, D NSAS, E BMJ (cumulus); F THO, G MOR, H WD6 (microphysics); I A3D (aerosol); J CLP (cloud); K CCCMA, L FLG, M RRTMG (radiation); N MYNN, O ACM, P UW (boundary layer); Q NOAH (land surface); R XOML (ocean surface); and S-Y (FM, FMNKF, FMTDK, FMNSAS, FMBMJ, FMTHO, FMMOR), which varied two or more processes at once.

All simulations were driven by the European Centre for Medium-Range Weather Forecasts Interim Reanalysis (ERI), with 6-hourly data available at a horizontal grid spacing of approximately 80 km and 60 vertical levels up to 0.1 hPa (?). The simulation began on October 1, 1979 and ran continuously until the end of 2015. Considering the first two months as spin-up, our analysis below is based on 1980-2015, a total of 36 years.
Since ERI assimilated pseudo-observations of rainfall and surface analyses of temperature and humidity measurements, its resulting precipitation and surface air temperature can be considered a realistic proxy of observations. For comparison with CWRF, ERI's cumulus scheme was originally described by ?, and has since been updated, including modifications to the entrainment formulation (?) and the parameterization closure (?). Given the non-Gaussian temporal and inhomogeneous spatial distributions of extreme precipitation (??), an extended observational period from a dense monitoring network is required to properly capture the statistical characteristics of extreme events (?). The daily precipitation observations were based on quality-controlled records from 8516 stations in the National Weather Service Cooperative Observer network (COOP), which are updated continuously (??). These stations each contained at least 40% available daily data for 1951-2012 and were kept the same for all subsequent years (personal communication with Kenneth Kunkel 2019). Following ?, they were adjusted for topographic dependence using monthly mean data from the Parameter-elevation Regressions on Independent Slopes Model (?). This adjustment was necessary because elevation and precipitation correlate strongly, and observations over mountain areas are usually taken at lower elevations and thus may underestimate precipitation. The station data were mapped onto the CWRF 30-km grid following the mass-conservative Cressman objective analysis method of ?. For consistency, ERI daily precipitation values were mapped onto the same CWRF grid by a conservative algorithm from the Earth System Modeling Framework regridding package. These remapping procedures were applied to alleviate the impact of data scale mismatch on extreme event comparison (?).
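As a point of reference, the classic Cressman weighting behind such station-to-grid analyses can be sketched as below. This is a single-pass toy version with synthetic stations; the method actually used here is a mass-conservative variant, typically applied iteratively with decreasing search radii.

```python
import numpy as np

def cressman(grid_xy, stn_xy, stn_val, radius):
    """One Cressman pass: distance-weighted analysis of station values
    onto grid points, with weight w = (R^2 - d^2) / (R^2 + d^2) for d < R."""
    out = np.full(len(grid_xy), np.nan)
    for i, g in enumerate(grid_xy):
        d2 = np.sum((stn_xy - g) ** 2, axis=1)
        near = d2 < radius ** 2
        if near.any():
            w = (radius ** 2 - d2[near]) / (radius ** 2 + d2[near])
            out[i] = np.sum(w * stn_val[near]) / np.sum(w)
    return out

# toy example: recover a smooth field from scattered "stations"
rng = np.random.default_rng(3)
stn_xy = rng.uniform(0, 100, size=(400, 2))
stn_val = np.sin(stn_xy[:, 0] / 20) + 0.1 * rng.standard_normal(400)
grid_xy = np.array([[x, 50.0] for x in np.linspace(10, 90, 9)])
analysis = cressman(grid_xy, stn_xy, stn_val, radius=15.0)
```

The weight falls smoothly from 1 at the grid point to 0 at the search radius, so nearby stations dominate while the analysis stays robust to individual noisy observations.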
Numerous extreme indices have been recommended by the Expert Team on Climate Change Detection, Monitoring and Indices (??), and some are often used for observational analysis (e.g., ?) and model evaluation (e.g., ?). Here we used the daily 95th percentile precipitation (P95) to analyze climatic extreme simulation skill in different seasons and distinct regions. The geographic distribution of P95 in each season is preferred to other indices, since it was designed to represent the climatological characteristics and regime dependence of extreme precipitation (??). P95 is also a more robust statistic than a maximum or similar value, as it can show key features of a sample's distribution without being distorted by abnormal outliers (???). However, improved P95 performance could be due to spurious drizzle (??) or to an artificial shift in the precipitation intensity distribution (?), rather than improvement in the underlying physics processes. To improve the reliability of the P95 analysis, we also evaluated the total number of rainy days (NRD) and the average daily rainfall intensity (DRI = total accumulated precipitation amount / NRD). Together they provide additional information on whether P95 biases are associated with deficiencies in clear-day frequencies or in rainfall magnitudes.

Figure 2.1: Geographic distributions of 1980-2015 mean seasonal precipitation amount [mm day⁻¹] observed (OBS), assimilated (ERI), and simulated by the CWRF control ECP for winter (DJF), spring (MAM), summer (JJA), and autumn (SON).

2.3 General performance of seasonal mean and extreme precipitation

We first analyze the performance of the control CWRF (CTL) in simulating precipitation-related fields over the contiguous United States, relative to ERI. Figure 2.1 compares observed and simulated 36-year (1980-2015) mean seasonal precipitation distributions.
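The three indices used throughout this evaluation (P95, NRD, DRI) can be computed from a daily precipitation series as in the sketch below. The wet-day threshold and the choice to take the percentile over wet days only are implementation details the text does not specify; both are assumptions here, and the data are synthetic.

```python
import numpy as np

def extreme_indices(daily_pr, wet_day=0.1):
    """P95, NRD, and DRI from a daily precipitation series [mm/day].
    wet_day is an assumed rainy-day threshold in mm."""
    daily_pr = np.asarray(daily_pr, dtype=float)
    wet = daily_pr[daily_pr >= wet_day]
    p95 = np.percentile(wet, 95)   # daily 95th percentile (over wet days, assumed)
    nrd = wet.size                 # number of rainy days
    dri = wet.sum() / nrd          # total accumulated amount / NRD
    return p95, nrd, dri

# synthetic ~10-year daily series: 30% wet days with gamma-distributed amounts
rng = np.random.default_rng(4)
series = np.where(rng.random(3600) < 0.3, rng.gamma(0.8, 9.0, 3600), 0.0)
p95, nrd, dri = extreme_indices(series)
```

As the text notes, the three indices are complementary: a model can match P95 while getting NRD and DRI wrong, which is exactly the drizzle artifact the combined evaluation is designed to expose.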
In all seasons, CWRF captured precipitation distribution details over mountain areas with a finer structure and more realistic intensity than ERI. This was especially obvious in the western United States. In winter, observed precipitation was more than 4.5 [mm day⁻¹] over the Gulf States. ERI did not produce enough intensity over that region, whereas CWRF captured the intensity, though its maximum center shifted inland. In spring, observations showed that the maximum center moved inland. While ERI missed this feature, CWRF produced sufficient intensity and the correct location. CWRF had a larger area of precipitation greater than 4.5 [mm day⁻¹], which was more realistic than ERI. In summer, observations showed strong centers over the Midwest. ERI significantly underestimated precipitation over those centers, while CWRF produced both better intensity and a more reasonable distribution. ERI overestimated precipitation over the Gulf States, averaging over 4.5 [mm day⁻¹] compared to the less than 4 [mm day⁻¹] observed. Between the two rain belts over the Midwest and the Southeast, observations showed a narrow region with relatively weak precipitation. CWRF realistically simulated the intensity in this region, but ERI overestimated it. In autumn, observations showed peaks in Arkansas and along the Texas-Louisiana coast. CWRF captured these peaks with some overestimation, while ERI produced insufficient intensity. The above comparison shows that CWRF simulated climatological mean precipitation distributions better than ERI, even though ERI assimilated pseudo-rainfall observations and surface measurements (?), which should have incorporated the most realistic features. Due to its more comprehensive physics representation and finer resolution, CWRF showed useful added value in precipitation simulation, providing greater detail and higher accuracy than ERI across the majority of the United States. Figure 2.2 compares 36-year mean seasonal NRD distributions.
CWRF captured finer structural details than ERI over the western U.S. mountain regions in all seasons, especially the rain shadow areas. In these regions, the gradient of rainy days tends to be large, and predicting the detailed distribution is vital for management decision-making and planning processes (?).

Figure 2.2: Same as Fig. 2.1 except for the number of rainy days (NRD).

In winter, both ERI and CWRF simulated the NRD peaks well from the Midwest to the Northeast, but underestimated them over the Gulf States. Given the dominance of stratiform precipitation, addressing this regional underestimation may require improving the microphysics representation. In spring, both ERI and CWRF realistically captured the pattern and magnitude of the NRD distribution over the entire Central to Eastern States. In summer, ERI overestimated NRD in the Central to Midwest States, exhibiting its drizzling problem, whereas CWRF reduced this overestimation. However, CWRF underestimated NRD near the Great Lakes, suggesting that its interactive lake model (?) needs refinement. In autumn, ERI overestimated NRD in the Southwest, again due to its significant drizzling problem, while CWRF continued to underestimate near the Great Lakes. Overall, CWRF captured the essential spatial features of NRD, demonstrating the potential to resolve ERI's drizzling problem. Figure 2.3 compares 36-year mean seasonal P95 distributions.

Figure 2.3: Same as Fig. 2.1 except for the daily 95th percentile precipitation (P95) [mm day⁻¹].

In all seasons, CWRF outperformed ERI over the western U.S. mountainous regions, including the Coastal Ranges, the Cascade Range-Sierra Nevada region, and the Rocky Mountains.
The windward slopes of these mountains are prone to cyclogenesis-induced heavy rainfall (?), while the eastern sides are typically dry transitional zones partly controlled by the rain-shadow effect (?). ERI underestimated the P95 peaks, with no clear dry zones, in every season, especially in winter. On the other hand, CWRF captured both the intensity and the wet-dry pattern distribution well. This is an important improvement, as precipitation prediction in these mountainous regions is notoriously difficult (??). Since general circulation models, including ERI, do not resolve topographical details, they are at a disadvantage in such regions. In contrast, CWRF incorporates finer details and so realistically captured precipitation spatiotemporal variations (?). CWRF also substantially outperformed ERI for P95 over the Central to Eastern States, demonstrating its ability to improve even in regions not dominated by topographic forcing. In winter, observations showed a broad region of high values over the southern Central States, with extremes greater than 35 [mm day⁻¹]. ERI systematically underestimated the magnitude and reduced the coverage of these regional extremes. CWRF accurately captured both the magnitude and the coverage, though it overestimated rainfall along the eastern coastal States. In spring, the observed P95 peak region expanded north and west. Again, ERI systematically underestimated it, with the area of P95 greater than 28 [mm day⁻¹] reduced substantially. In contrast, CWRF further expanded and strengthened the region, resulting in overestimation in the Midwest and along the eastern coastal States. In summer, observations showed high P95 values in the northern Central States, exceeding 26 [mm day⁻¹] across Iowa, Missouri, Kansas, and Oklahoma. ERI failed to produce any extreme precipitation greater than 20 [mm day⁻¹], consistent with the drizzling problem identified from its NRD overestimation (Fig. 2.2).
On the other hand, CWRF realistically captured the magnitude and pattern of the observations. CWRF underestimated P95 in the Gulf States by up to 5 [mm day⁻¹], while ERI underestimated it by up to 10 [mm day⁻¹]. In autumn, the observed pattern resembled that of spring, except that high values also occurred along the eastern coastal States. As such, CWRF performed even better in autumn than in spring, whereas ERI further reduced its skill.

2.4 Physics sensitivity of regional extreme precipitation simulation

The above comparison identified two distinct regions in which the driving ERI substantially underestimated precipitation extremes that were realistically captured by the downscaling CWRF: the Gulf States (GS) and the Central to Midwest States (CM). Since the driving large-scale synoptic conditions were the same and the regional topographic forcing effects are expected to be small, CWRF's superior performance relative to ERI in these regions likely resulted from improved physics parameterizations at a refined resolution. This downscaling ability presents a unique opportunity to explore the sensitivity of the P95 simulation to CWRF's configuration of physics parameterizations, and therefore the key model improvements needed to alleviate the extreme precipitation underestimation and the related drizzling problem. The following analyses focus on these two regions, comparing the performance of CWRF's ensemble of 25 physics configurations in simulating extreme precipitation features. The comparison identifies which physical processes CWRF extreme precipitation simulation is most sensitive to, and which schemes or combinations best capture those processes. CWRF's improved skill over ERI is particularly evident in GS spring and CM summer. Therefore, we used the ERI spring and summer P95 bias distributions to define the boundaries of the two regions, as illustrated in Fig. 2.4. The GS region encompasses all grids where ERI significantly underestimated spring P95.
Similarly, the CM region encompasses all grids where ERI significantly underestimated summer P95, excluding those overlapping with GS. Scattered areas with radii smaller than three grids were discarded.

Figure 2.4: Boundary specification of the two key regions where ERI severely underestimated extreme precipitation: the Gulf States (GS) in spring and the Central to Midwest States (CM) in summer.

The two regions differ in prevailing precipitation systems and dominant physical mechanisms. Figure 2.5 compares the 36-year mean seasonal P95 biases (from observations) between ERI and the 25 CWRF physics configurations averaged over the two regions. The control CWRF corresponds to the ECP run, referred to in short as the control ECP. For brevity, we refer to the five CWRF runs varying only the cumulus scheme directly by the name of the respective cumulus scheme they used. Similarly, any run that differed from the control by a single process's parameterization scheme is referred to by that scheme's name. For example, MOR denotes the CWRF run replacing the microphysics scheme of Tao et al. with that of Morrison et al., while the rest is identical to the control configuration. Other runs, in which the parameterization schemes of two or more processes differed from the control CWRF, are referred to by the experiment names listed in Table 2.2. We often include both the process and scheme names to avoid confusion. A more general term like "ECP members" denotes all runs using CWRF configurations that include the ECP cumulus scheme. As is apparent in Fig. 2.5, CWRF members using different radiation or microphysics schemes but the same cumulus scheme had similar P95 biases. Likewise, the P95 bias spread among different boundary layer and land surface schemes was not large. On the other hand, the P95 bias differences between members using different cumulus schemes were substantial. This suggests that cumulus parameterization plays a crucial role in extreme precipitation simulation.
According to the simulated P95 biases, the CWRF physics configurations fell into two broad types. Type I did not significantly underestimate P95 in either the GS or the CM region, and included members using the ECP and NKF cumulus schemes.

Figure 2.5: Comparison among ERI and all CWRF physics configurations in simulating 1980-2015 mean seasonal P95 biases [mm day⁻¹] (from observations) averaged over the GS (left) and CM (right) for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). They are separated by color into type I (blue) and type II (red) members depending on their cumulus schemes.

Type II produced significant underestimates in either GS spring or CM summer, and included members using the TDK, NSAS, and BMJ cumulus schemes. In the GS region, where ERI substantially underestimated P95 in all seasons, type I members produced reasonable extreme precipitation and relatively small biases, especially in winter and spring. In particular, the control ECP had the least bias and a relatively stable performance, with no outliers in any season. It outperformed the others most notably in spring, when the ERI underestimation was the greatest. Other ECP members slightly overestimated in autumn and underestimated in summer, while NKF members generally overestimated, except in winter. Type II members severely underestimated in all seasons, with the exception of TDK members, which produced small biases in winter, spring, and autumn. As discussed later, the TDK exception in spring and autumn was associated with incorrect spatial patterns and excessive variability.
In the CM region, ERI substantially underestimated summer P95, whereas CWRF type I members produced more realistic simulations. ERI also underestimated autumn P95, which CWRF type I members overestimated slightly. On the other hand, type II members significantly underestimated in both summer and autumn (when convective precipitation is dominant), except for TDK members, which had small biases in autumn. Once again, the TDK members displayed incorrect spatial patterns (discussed below). Among all seasons, ERI was most realistic in spring and overestimated slightly in winter. Type I members significantly overestimated in both winter and spring, when convective activity is relatively infrequent. In contrast, the biases of type II members were mixed in these two seasons: BMJ members still underestimated (especially in spring) and TDK members overestimated, whereas NSAS members had small spring biases and moderate winter overestimates. Figure 2.6 compares geographic distributions of seasonal P95 biases from observations over the contiguous United States for ERI and the five CWRF members that differed only in cumulus scheme (ECP, NKF, TDK, NSAS, BMJ). In the GS region, the observed P95 maxima were higher than 25 [mm day⁻¹] in summer and even greater than 35 [mm day⁻¹] in other seasons (Fig. 2.3). These extreme precipitation events usually happened near the coastline. ERI failed to capture this intensity in all seasons, never exceeding 28 [mm day⁻¹], and its maximum center shifted far inland (see also ?). In contrast, the control CWRF produced sufficiently strong intensity as well as the correct location of the center. Both the center dislocation and the intensity underestimation caused ERI's substantial dry P95 biases (Fig. 2.6). On the other hand, in summer, NKF shifted the center eastward, causing substantial overestimation in the eastern coastal States but large underestimation in Texas, Oklahoma, and Louisiana.
These large opposite biases canceled each other to produce a smaller overestimation when averaged over the GS region. Similarly, in spring and autumn, TDK produced large overestimations in Georgia and Alabama but underestimations in Texas, which canceled each other to yield smaller GS average biases than ECP. In all seasons, NSAS systematically underestimated P95 over the GS region, while BMJ had more substantial underestimations over more extensive areas except for overestimations along the southern and eastern coastlines in summer and autumn. One potential factor contributing to CWRF's improvement in representing P95 is that its ECP used different sets of cumulus parameterization closure assumptions to distinguish land versus oceans (???), which more realistically represented the region-specific processes governing extreme precipitation in both GS and CM.

Figure 2.6: Geographic distributions of 1980-2015 mean daily 95th percentile precipitation (P95) biases from observations [mm day-1] for winter (DJF), spring (MAM), summer (JJA), and autumn (SON) as assimilated (ERI) and simulated by five CWRF members varying only the cumulus scheme (ECP, NKF, TDK, NSAS, BMJ).

The ECP better captured coastal baroclinicity-generating fronts in GS and CM, both of which were linked to most heavy precipitation events in the respective region (?). Therefore, the ECP scheme, with a more comprehensive treatment of the land-ocean contrast, helped produce sufficient convective activity and better extreme precipitation simulation. Figure 2.7 compares geographic distributions of NRD biases among ERI and the five CWRF members. ECP biases were generally between ±10 days, and the lowest among all simulations. NKF also did reasonably well, except for large underestimations in both the CM and GS regions. This exception was coincident with small P95 underestimations.
On the other hand, both TDK and NSAS substantially underestimated NRD in both the CM and GS regions throughout the year. The underestimations were especially large and systematic in summer, by more than 25 [days] over most regions of the central to eastern United States. Interestingly, BMJ did very well and was comparable to ECP in winter, spring, and autumn. In summer, BMJ resembled other type II members in great underestimations, except for a realistic simulation in the Great Plains.

Figure 2.7: Same as Fig. 2.6 except for the number of rainy days (NRD) biases.

Figure 2.8 compares geographic distributions of DRI biases among ERI and the five CWRF members. These DRI biases were highly correlated with P95 biases in all seasons. Their spatial pattern correlations over the entire CONUS were 0.88-0.94 for ERI, 0.89-0.91 for ECP, 0.88-0.90 for NKF, 0.74-0.90 for TDK, 0.92-0.94 for NSAS, and 0.96-0.98 for BMJ.

Figure 2.8: Same as Fig. 2.6 except for the daily rainfall intensity (DRI) biases [mm day-1].

Strong correlations indicated that underestimations (overestimations) of extreme precipitation occurred mostly because rainfall intensities were systematically reduced (increased). This was especially the case for BMJ. Among all simulations, TDK had the lowest correlations, especially in summer (0.74), when substantial P95 underestimations corresponded to large DRI overestimations over most regions except the southern coastlines. Such opposite summer P95 and DRI biases were coincident with substantial NRD underestimations (Fig. 2.7), indicating that TDK simulated not only much lighter rains but also drastically more clear days. A similar situation occurred in autumn, though it was limited to Texas, Oklahoma and Florida. Figure 2.9, using a Taylor
diagram, compares seasonal P95 spatial pattern correlations and standard deviations of ERI and the 25 CWRF physics configurations relative to observations. All statistics are based on 36-year mean distributions separately over the GS and CM regions. The simulation's distance from the observation represents its root mean square error (rmse). In the GS region, ERI had almost no pattern correlation and the most severely underestimated standard deviation (0.3) in summer. ERI produced higher correlations in autumn (0.6), spring (0.7), and winter (0.9), but still significantly underestimated deviations (0.7-0.8). CWRF type I members showed improved skill over ERI, with generally higher correlations in summer (0.3-0.4), spring (0.6-0.7), autumn (0.5-0.7), and winter (0.85-0.9), as well as larger deviations (0.8-1.2). In particular, the control ECP correlated most strongly with observations in all seasons. Other ECP members also performed consistently with each other, implying that combining ECP with other physical process schemes had little impact on simulation ability.

Figure 2.9: Taylor diagram of pattern statistics comparing the overall performance among ERI and all CWRF physics configurations in simulating 1980-2015 mean seasonal P95 geographic distributions over the GS (left) and CM (right) regions for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). Shown are the pattern correlation (azimuthal) and normalized standard deviation (radius) compared with observations. The black dot (OBS) marks the perfect score with a unit correlation and deviation. Off the chart are outliers performing poorly, with correlations and deviations indicated in the parentheses.

One exception was with the boundary layer schemes ACM, UW, and MYNN, which had less skill than CAM in the control CWRF. Meanwhile, NKF members had lower scores (less correlation or larger variability) than ECP, especially in summer. On the other hand, type II members generally produced lower correlations and substantially underestimated (BMJ) or overestimated (TDK) deviations; these errors were especially excessive in summer, falling off the chart as outliers. TDK members simulated large positive and negative local errors (Fig. 2.6), which canceled each other to yield small regional mean P95 biases (Fig. 2.5) with significantly overestimated spatial deviations (Fig. 2.9). ERI performed better in the CM than GS region, with increased pattern correlations in summer (0.6), autumn (0.8), spring (0.9), and winter (0.9), but still underestimated deviations (0.5-0.9). The control ECP continually outperformed ERI, with comparable correlations but correct deviations (1.0-1.1) in all seasons. Other ECP members performed similarly well, especially in winter and spring, though deviations varied widely (0.95-1.4) in summer and autumn.
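The pattern statistics plotted on a Taylor diagram obey a law-of-cosines relation linking correlation, normalized standard deviation, and the centered rmse that the text reads off as "distance from the observation." A minimal sketch of these statistics (the function and variable names are my own illustration, not this study's code):

```python
import numpy as np

def taylor_stats(sim, obs):
    """Pattern correlation, normalized standard deviation, and the implied
    normalized centered RMSE for a Taylor diagram."""
    s = np.ravel(sim) - np.mean(sim)   # centered simulation anomalies
    o = np.ravel(obs) - np.mean(obs)   # centered observation anomalies
    corr = np.sum(s * o) / np.sqrt(np.sum(s**2) * np.sum(o**2))
    sigma = np.std(sim) / np.std(obs)  # radius coordinate on the diagram
    # Law of cosines: E'^2 = 1 + sigma^2 - 2*sigma*corr (normalized units),
    # so a simulation's distance from the OBS point is its centered rmse.
    rmse = np.sqrt(1.0 + sigma**2 - 2.0 * sigma * corr)
    return corr, sigma, rmse
```

The point (corr = 1, sigma = 1) is the OBS reference marked by the black dot; rmse is a simulation's distance from it.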
One outlier was the member combining ECP with boundary layer scheme ACM (replacing CAM in the control CWRF) in summer, whose spatial variability (standard deviation) was about 1.6 times that of the observation. NKF members performed poorly in summer, with lower correlations (0.5) and excessive deviations (1.5); they were comparable to ECP members in other seasons. On the other hand, in all seasons type II members generally had lower correlations and were more scattered, with abnormally high or low deviations. In particular, TDK members substantially overestimated spatial variability in all seasons (Fig. 2.9), with large positive and negative local errors (Fig. 2.6) that canceled each other to yield small regional mean biases in spring and autumn (Fig. 2.5).

Figure 2.10: The equitable threat score (ETS) that measures the overall skill dependence on rainfall intensity for ERI and all CWRF physics configurations in simulating 1980-2015 mean seasonal P95 geographic distributions over the GS (left) and CM (right) regions for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). The x-axis depicts the P95 thresholds at a 1.0 [mm day-1] bin interval, while the y-axis scores the ETS values.

To examine the model skill dependence on P95 magnitude, we adopt the widely used categorical equitable threat score (ETS), defined as the ratio of hits minus hits expected by chance divided by hits plus false alarms plus misses minus hits expected by chance (?).
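The ETS definition just stated can be sketched numerically as follows; this is a minimal illustration (the function name, the threshold-exceedance framing, and the use of NumPy are my assumptions, not the dissertation's actual evaluation code):

```python
import numpy as np

def equitable_threat_score(sim, obs, threshold):
    """ETS for exceedance of a given P95 threshold over a set of grid points.

    ETS = (hits - hits_random) / (hits + false_alarms + misses - hits_random),
    where hits_random = (hits + false_alarms) * (hits + misses) / n is the
    number of hits expected by chance.
    """
    s = np.asarray(sim) >= threshold   # simulated exceedances
    o = np.asarray(obs) >= threshold   # observed exceedances
    hits = np.sum(s & o)
    false_alarms = np.sum(s & ~o)
    misses = np.sum(~s & o)
    hits_random = (hits + false_alarms) * (hits + misses) / s.size
    denom = hits + false_alarms + misses - hits_random
    return float((hits - hits_random) / denom) if denom != 0 else 0.0
```

A perfect forecast gives ETS = 1 and a purely random one 0; in Fig. 2.10 the score is evaluated at successive 1.0 [mm day-1] threshold bins.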
Figure 2.10 compares ETS at a 1.0 [mm day-1] bin interval of the 36-year mean seasonal P95 spatial distributions over the GS and CM regions among ERI and the 25 CWRF physics configurations. Overall, ETS skills were highest in winter and lowest in summer. This highlights the difficulty of simulating extreme events in summer, when convective precipitation dominates. In the GS region, the control ECP outperformed ERI in all seasons for the entire range of observed P95 magnitudes, except in winter for light precipitation (below 10 [mm day-1]). This ETS enhancement was significant, especially in spring and autumn for the entire P95 range, when ERI showed low skill. The improvement was moderate in summer, when ERI scored zero, and also substantial in winter for rainfall above 25 [mm day-1]. While most ECP members combined with other processes' parameterization schemes were fairly similar to the control, some had notable skill improvements. In particular, the members with land surface scheme NOAH (replacing CSSP) and with microphysics scheme MOR (replacing TAO) both increased ETS in winter for P95 of 12-28 [mm day-1] and in summer for heavy rain above 25 [mm day-1]. The radiation scheme CCCMA or RRTMG replacing GSFC, and the combined radiation-boundary layer-microphysics schemes FLG-MYNN-MOR replacing the control GSFC-CAM-TAO, also showed improved skill in summer. Thus, there is still room for further skill enhancement in summer through physics refinement. CWRF members using cumulus schemes other than ECP generally had lower ETS. NKF members scored systematically lower in winter for the entire P95 range, and also in other seasons except for some improvement for rainfall above 35 [mm day-1], especially in autumn. The improvement was limited to a small area along the Texas-Louisiana coast, where ECP underestimated P95 (Fig. 2.6).
Replacing ECP with TDK reduced ETS systematically, except for improvements in the middle range: 12-20 [mm day-1] in winter, 20-22 [mm day-1] in spring, and 22-25 [mm day-1] in autumn. The improvements occurred in part of Texas, where ECP overestimated P95. Both NSAS and BMJ members substantially underestimated P95 (Fig. 2.6) and scored persistently lower than ECP in all seasons. In the CM region, ERI generally scored higher than all CWRF members in winter and spring, mainly because the latter systematically overestimated P95 (Fig. 2.5). Most CWRF members had higher ETS for rainfall events above 27 [mm day-1], which were totally missed by ERI. Notably, NSAS had very high ETS in spring, substantially larger than other CWRF members. It produced systematically higher ETS than ECP, with especially large score increases between 20-30 [mm day-1]. Figures 2.6-2.8 indicate that NSAS significantly underestimated NRD, while ECP was more realistic. Consequently, NSAS showed high skill for P95 and DRI, but at the cost of overestimating clear days. In spring, TDK also produced slightly higher ETS than ECP. In summer, for the entire P95 range, CWRF significantly outperformed ERI, which had almost zero ETS. The control ECP yielded the highest ETS, except for a slight improvement for rainfall above 23 [mm day-1] achieved by its combination with boundary layer scheme ACM. All TDK and NKF members scored lower across the entire P95 range, with NSAS members performing poorly and BMJ members failing completely. In autumn, ERI had higher ETS than most CWRF members for P95 between 15-22 [mm day-1]. ERI skill dropped abruptly above this range, and thus was increasingly outperformed by CWRF members as P95 rose. The control ECP generally scored highest, but was exceeded by several other members between 18-27 [mm day-1], including those using cumulus NSAS and microphysics MOR. This again indicated the potential for further physics improvement.
The BMJ members were persistent outliers, with little skill.

2.5 Summary and discussion

We analyzed CWRF's improvements over ERI in simulating 1980-2015 extreme precipitation over the contiguous United States, and selected two key regions (GS and CM) of substantial ERI underestimation with weak orographic forcing to focus on the sensitivity to physical process parameterizations. By comparing an ensemble of 25 simulations downscaled from ERI during 1980-2015 with CWRF physics configurations of varying parameterization schemes, we investigated the responsive processes to which regional precipitation extremes are sensitive. We found that of all the physics configurations, CWRF's P95 simulation was most sensitive to cumulus parameterization. Accordingly, we classified the CWRF configurations into two broad types based on their cumulus schemes. Type I (using ECP and NKF) did not significantly underestimate P95 in either region, while type II (using TDK, NSAS, and BMJ) produced substantial underestimations in either GS spring or CM summer. The two groups differed substantially in model biases and skill scores, depending on regions and seasons, as summarized below. In the GS region, ERI substantially underestimated P95 in all seasons, while CWRF type I members produced general improvement. In particular, the CWRF control ECP had the highest ETS for the entire observed P95 range, and outperformed others most significantly in spring, when ERI's underestimation was the largest. NKF members generally overestimated P95, except in winter, and scored systematically lower than ECP, albeit with some improvement for heavy rainfall (especially in autumn). Type II members generally had lower ETS, severely underestimated P95 in all seasons, and produced generally lower pattern correlations and substantially smaller (BMJ) or larger (TDK) spatial variations (especially in summer).
One exception was TDK's small biases in winter, spring, and autumn, which were the result of incorrect spatial patterns. TDK members (replacing ECP) reduced ETS systematically in summer and, with the exception of the middle P95 range, in other seasons as well. NSAS members scored persistently lower than ECP, and BMJ members had even lower skill. In the CM region, ERI substantially underestimated summer P95, while CWRF type I members produced the most realistic simulations. The control ECP had the highest ETS and significantly outperformed ERI for the entire P95 range. ERI underestimated autumn P95, while CWRF type I members slightly overestimated it and hence showed more skill for heavier rainfall. In winter and spring, when convective activities are relatively infrequent, ERI scored higher for light to moderate P95 than CWRF, but increasingly lower for heavier precipitation, mainly because of the systematic CWRF overestimation and ERI underestimation. In all seasons, type II members significantly underestimated P95, and produced generally lower pattern correlations and substantially smaller (BMJ) or larger (TDK) spatial variations. TDK scored lower than ECP and NKF, while NSAS had even less skill and BMJ failed totally. Two exceptions were that NSAS had outstanding ETS for spring P95, but at the cost of overestimating clear days, and that TDK had small regional mean biases in spring and autumn, but owing to the cancellation of large positive and negative local errors. Some cumulus schemes may have the potential to capture precipitation extremes under mixed synoptic and convective forcings. For example, CWRF using TDK simulated spring and autumn P95 reasonably well in both regions, though it did not correctly capture spatial patterns, while NSAS even outperformed ECP for most P95 ranges in spring over the CM, though it overestimated clear days. Other parameterization schemes may be able to work with ECP to further improve CWRF skills.
In particular, combining the ECP cumulus with the MOR microphysics scheme significantly enhanced CWRF's ability to capture summer P95 in the GS region. For heavy summer rainfall events, combining the ECP cumulus with the CCCMA radiation, MYNN boundary layer, and NOAH land surface schemes also produced scores higher than the control. Thus, there is still room for further skill enhancement through physics refinement of the whole model system. In summary, CWRF using the ECP cumulus scheme performed the best of all physics configurations and generally outperformed ERI, especially over the GS and CM regions in seasons dominated by convective precipitation. This success may reflect ECP's use of an optimized ensemble of parameterization closures based on the framework of ? to represent convection variations between land and oceans (???). Other cumulus schemes severely underestimated extreme events.

Figure 2.11: Summer mean vertical potential temperature (∂θ/∂t, stars) and water vapor (∂qv/∂t, curves) tendency profiles among the five cumulus schemes (color) for 2004, averaged over all the grids having rainfall greater than 50 [mm day-1] within the CM (left) and GS (right) regions. Also labeled at the altitude of the profile peak is the number that depicts the tendency's vertical integral for the respective scheme, coded with the same color.

In particular, CWRF members using TDK, NSAS, and BMJ underestimated P95 substantially in both GS and CM regions in summer, and also largely in autumn and spring. ERI, which used a variant of the TDK cumulus scheme (?), similarly underestimated P95, even though it assimilated pseudo-precipitation data. ?
compared the convective heating profiles of these three cumulus schemes and showed that TDK favors boundary layer clouds, BMJ favors shallow and midlevel clouds, while NSAS exclusively favors deep clouds. They further found that the scheme generating more deep convection tended to produce less intense mean precipitation in the tropics due to reduced net cloud radiative cooling. Figure 2.11 compares summer mean vertical temperature and humidity tendency profiles among the five cumulus schemes for 2004, in which the P95 geographic distribution was most strongly correlated with the climatological mean. These profiles were calculated using 3-hourly samples and averaged over all the grids having rainfall greater than 50 [mm day-1] within the CM and GS regions separately. Also shown is the vertical integral of each profile, representing the overall strength of the convection. For all schemes, the tendency magnitudes were generally greater in the GS than CM region, indicating stronger convection in the former. While BMJ is based on the convective equilibrium that adjusts an unstable model column to a moist adiabat, all other schemes are based on the mass flux concept but differ in their formulations of subgrid plume entrainment and detrainment as well as trigger function and closure assumption (?). Thus, for both regions, BMJ produced a greater warming peak at a higher altitude than other schemes in order to partly cancel a strikingly large cooling layer below about 600 hPa. A much weaker and shallower cooling layer occurred in ECP, which simulated a unique warming profile with moderate attributes throughout, including overall magnitude, layer thickness, and peak altitude, compared to other schemes. In contrast, NKF produced a much greater and deeper warming layer than ECP, and a tiny cooling at cloud base. Hence, NKF expanded the ECP warming profile toward the surface.
NKF also had a significant cooling peak at the cumulus top, likely due to heat loss from large detrainment and the associated evaporation. On the other hand, NSAS generated a very weak warming in the entire cumulus tower, while TDK yielded an even weaker warming. All cumulus schemes resulted in drying throughout the cloud column as water vapor was depleted by precipitation. However, the vertical distributions of their drying differed substantially. In the CM region, BMJ had distinct double peaks at 925 (primary) and 550 (secondary) hPa, corresponding to the shallow and midlevel convection, respectively. ECP also simulated double peaks at 875 and 600 hPa, but the midlevel drying was predominant. In contrast, NKF produced a much stronger and deeper drying layer than ECP, with the predominant peak at 700 hPa. These features were similarly present in the GS region, except that the drying peaks were closer to the surface by 50 hPa, indicating deeper convection, and that the overall strengths were increased by 50% (ECP) and 62% (NKF) but decreased by 10% (BMJ). On the other hand, in the CM region, NSAS generated a deep drying layer that was overall stronger than ECP's but weaker than NKF's, with the predominant peak at 825 hPa, whereas in the GS region, it produced a much weaker drying than both ECP and NKF, with the peak further down at 950 hPa. In both regions, TDK yielded a tiny drying and very light rainfall. The above comparison exemplified the complexity of convective effects as parameterized by different cumulus schemes. Our result agrees with ? regarding the convective heating profile contrasts among TDK, NSAS, and BMJ, but disagrees on the tendency for deeper convection to produce less precipitation. On the contrary, among all schemes, NKF produced the strongest warming and drying rates and correspondingly the largest P95, whereas TDK and NSAS generated drastically weaker rates and so substantially underestimated P95.
On the other hand, ECP and BMJ both simulated moderate rates, but the former realistically captured P95 whereas the latter substantially underestimated it. Therefore, it is difficult to identify a general correspondence between convection strength and P95. Given our sensitivity analysis above, it is imperative to understand how these cumulus parameterizations and their interactions with other physics representations result in improved extreme precipitation simulation. More comprehensive analyses and sensitivity experiments are needed to identify the physical mechanisms and feedback processes that are responsible for model failure or success. This will be the goal of our companion paper (?) and other subsequent papers in this series.

Chapter 3: Dependence on cumulus parameterization and underlying mechanism

3.1 Introduction

Extreme events are by definition rare, but they are high-impact, hard-to-predict phenomena beyond our normal expectation (??). The frequency and intensity of extreme precipitation events have been observed to increase over the United States in the past decades and are projected to continue rising with global warming (???). However, realistic simulation of these events remains challenging, since their development depends on initial conditions, large-scale drivers, regional feedbacks, and stochastic processes (?). In particular, most climate models significantly underestimate extreme precipitation (????). This has often been identified as the drizzling problem, where models simulate light rain too frequently but produce inadequate heavy events. The problem exists even in the modern reanalyses that have already assimilated daily pseudo-precipitation observations (????). Increasing model resolution (such as to grid spacing of 10-50 km) may help, but cannot solve the problem (????????). Even cloud-permitting models with grid spacing down to 3-4 km may still underestimate extreme precipitation (?) or substantially overestimate it (?).
The problem could also be responsible for existing models' general underestimation of observed extreme precipitation trends, and has further ramifications for their reliability in projecting future changes of these events (?), since biases can persist or propagate into projections (??). Therefore, it is imperative to improve model ability in simulating extreme precipitation. As the first of a pair, our earlier paper (?) demonstrated that the regional Climate-Weather Research and Forecasting model (CWRF, ?) significantly improved on its driving European Center for Medium-Range Weather Forecasts' Interim Reanalysis (ERI, ?) in simulating extreme precipitation over the United States. This offered an opportunity to investigate how the improvement was made. To facilitate the study, we defined two distinct regions: the Gulf States (GS) and the Central to Midwest States (CM), where ERI substantially underestimated precipitation extremes while CWRF realistically captured them. We then compared the performance among an ensemble of 25 CWRF physics configurations in the two regions and found that the extreme precipitation simulation was most sensitive to cumulus parameterization. Of five tested schemes, the ensemble cumulus parameterization (ECP, ???) used in the CWRF control was the most skillful at reproducing seasonal mean spatial patterns of daily 95th percentile precipitation (P95), which is also a good indicator of rainfall intensity. In contrast, the new Kain-Fritsch scheme (NKF, ?) produced a P95 skill comparable to ECP but at the expense of underestimating the number of rainy days (NRD). On the other hand, the modified ? scheme (TDK, ?) severely underestimated summer P95 because its stronger NRD underestimate could not balance its larger reduction in average daily rainfall intensity (DRI = total accumulated precipitation amount / NRD).
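The three indices used throughout this work (P95, NRD, DRI) can be computed from a daily precipitation series as sketched below; the 1 [mm day-1] rainy-day cutoff, the wet-day percentile convention, and the function name are my assumptions for illustration, not necessarily the exact definitions used in this study:

```python
import numpy as np

def extreme_precip_indices(daily_precip, wet_threshold=1.0):
    """P95, NRD, and DRI from a daily precipitation series [mm/day].

    NRD: number of rainy days (>= wet_threshold, an assumed cutoff);
    DRI: total accumulated precipitation / NRD;
    P95: 95th percentile of daily precipitation on rainy days.
    """
    p = np.asarray(daily_precip, dtype=float)
    wet = p[p >= wet_threshold]
    nrd = wet.size
    if nrd == 0:
        return 0.0, 0, 0.0
    p95 = float(np.percentile(wet, 95.0))
    dri = float(wet.sum() / nrd)
    return p95, nrd, dri
```

Under these definitions the drizzling problem appears as a high NRD paired with low DRI and P95, exactly the BMJ signature described below.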
Such TDK biases occurred similarly with the new simplified Arakawa-Schubert (NSAS) scheme used in the National Centers for Environmental Prediction's Global Forecast System (?), except that the underestimates covered broader areas in summer and extended into all seasons over the GS region. The Betts-Miller-Janjic scheme (BMJ, ?) underestimated P95 most severely in both regions for all seasons because its strong DRI underestimate was accompanied by an NRD overestimate, the typical drizzling problem. ERI, using another variant of the ? scheme (?), simulated DRI realistically through its observational assimilation, but still largely underestimated P95 in the CM in summer and in the GS in all seasons, mainly because it overestimated NRD, again the drizzling problem. While capturing the observed characteristics of extreme precipitation is challenging, understanding the underlying physical mechanisms for model failure or success is even more difficult. After reviewing various issues on detection, simulation, and attribution of extreme events, ? and ? highlighted the pressing need to better understand and ultimately overcome model deficiencies. While numerous studies have focused on why future increases in regional precipitation extremes can be significantly greater than expected from the Clausius-Clapeyron relationship (???), very few have investigated how and why current climate models fail to reproduce extreme events observed in the past. ? suggested that realistically simulating convective to total precipitation ratios is important. ? demonstrated that improving cumulus parameterization formulations, such as convective closures and triggers, is a key. ? speculated that replacing cumulus parameterization with an explicit convection solution is more desirable. ? showed that no single physics configuration performs best in all cases and that the selection of parameterization schemes has a larger impact on model performance during more intensive rainfall events. ?
found a large sensitivity of extreme precipitation simulation to physics representations of land surface, cumulus, and radiation processes, and argued that how much convective available potential energy (CAPE) is generated and consumed is essential. ? identified that extreme precipitation is sensitive to uncertain parameters in the deep convection scheme, especially its time scale for the CAPE consumption rate. ? attributed the underestimation of extreme precipitation to the lack of ice-phase processes in cloud microphysics schemes. Given the complexity of climate models and especially convection-cloud-radiation interactions (??), these studies did not fully address the underlying mechanisms and their relative contributions to systematic model biases at regional scales. As the second of the pair, this paper builds on our first study (?) to explore the physical mechanisms that can explain how cumulus parameterization determines CWRF's ability to simulate U.S. extreme precipitation, by comparing long integrations with the five representative cumulus schemes (ECP, NKF, TDK, NSAS, BMJ). Section 2 describes these cumulus schemes, the causal ingredients and observational data for evaluation, and the experiment design for sensitivity understanding. Section 3 examines extreme precipitation's dependence on cumulus schemes, distinguishing seasonal-interannual variations over the GS and CM regions. Section 4 presents the complicated correlations of regional extreme precipitation biases with the causal ingredient fields. Section 5 uses the structural equation modeling approach to explore the potential causes and physical processes responsible for the regional extreme precipitation biases and their differences among ERI and CWRF configurations. Section 6 concludes with a summary of the key mechanisms underlying these model biases and regional contrasts.
3.2 Model description, experiment design, causal ingredients, and observations

The CWRF model formulation, computational domain, and integration procedure for this study were identical to ?. Here we used the five ERI-driven CWRF 1980-2015 continuous integrations, whose model configurations differed only in cumulus parameterization, swapping the control ECP with the NKF, TDK, NSAS, or BMJ schemes. We also compared these CWRF downscaling simulations with the driving ERI, which used a variant of the TDK scheme. Table 3.1 summarizes the major differences among these cumulus schemes in closure, trigger, and entrainment formulations, as well as the respective references. We further adopted ? as the common parameterization for shallow convection, so as to eliminate complications from the shallow schemes built into some cumulus parameterizations. ? presented a comprehensive sensitivity analysis of how these and 9 other cumulus schemes affect CWRF's prediction of the 1993 and 2008 summer floods over the Central United States. ?? then focused on ECP to study the effects of its closures on summer precipitation simulations over the continental United States and coastal oceans.

To gain physical insight, we conducted a correlation analysis among model biases in P95, DRI, NRD and other dynamic and thermodynamic quantities. For this, we adopted the ingredients-based approach proposed by ? and used in ?.

Table 3.1: Cumulus scheme closures, triggers, entrainments, and their references.

ECP — Closure: ensemble of moisture convergence and vertical velocity closures, with different weights and ensemble algorithms over ocean and land. Trigger: maximum CAPE strength. Entrainment: linear combinations of multiple entrainment rates, each member with its own updraft radius. References: ?; ?; ???

NKF — Closure: total instability adjustment. Trigger: CAPE > 0; parcel temperature perturbation. Entrainment: proportional to mass flux at the cloud base divided by updraft radius. References: ?; ?

TDK — Closure: total instability adjustment. Trigger: moisture convergence. Entrainment: sum of a turbulent part and an organized part; the turbulent part is calculated from vertical velocity, while the organized part is a linear function of height in pressure. References: ?; ?; ?; ?; ?

NSAS — Closure: quasi-equilibrium assumption. Trigger: lifting depth trigger. Entrainment: inversely proportional to updraft radius. References: ?

BMJ — Closure: quasi-equilibrium assumption. Trigger: positive cloud work function threshold. Entrainment: no entrainment scheme. References: ?; ??

As lifting moist air to condensation produces precipitation, extreme events must result from sustained high rainfall rates, which require rapid ascent of air masses containing ample water vapor and efficient rainout. So the basic ingredients for extreme precipitation events include maximizing moisture supply, ascent strength, and precipitation efficiency. These conditions can be met in weather systems dominated by deep moist convection, which prevails in warm seasons with abundant moisture availability and sufficient buoyant instability promoting strong updrafts. Convective rainfall rates tend to be higher than those from stratiform (i.e., stably stratified) precipitation systems, in which updrafts are relatively weak but widespread, maintained by topographic or synoptic forcings rather than buoyancy. Thus, an alternative measure of precipitation efficiency is the convective to total precipitation ratio (RCT). Moisture supply can be measured by column total precipitable water (TPW), which is associated with atmospheric moisture convergence (MC) and surface evapotranspiration (ET). Ascent strength may be quantified by CAPE and convective inhibition (CIN) for buoyancy-driven updrafts, and by 700 hPa vertical velocity (W700) for large-scale forced upward motions. Note that CAPE depicts the theoretical maximum updraft speed, wmax = √(2 CAPE) (?), and is separated from CIN at the level of free convection (LFC). Both are linked to the lifting condensation level (LCL), where an air parcel lifted dry adiabatically becomes saturated.
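The parcel-theory relation above can be sketched in a few lines of Python; this is an illustrative calculation only (the function name and sample value are mine, not CWRF's):

```python
import math

def max_updraft_speed(cape):
    """Theoretical maximum updraft speed w_max = sqrt(2 * CAPE) from
    parcel theory, neglecting entrainment, water loading, and CIN."""
    return math.sqrt(2.0 * cape)

# A moderately unstable sounding with CAPE = 2000 J/kg supports
# idealized updrafts of roughly 63 m/s.
print(round(max_updraft_speed(2000.0), 1))  # 63.2
```

In practice, observed updrafts fall well short of this idealized limit, which is one reason CIN and the LFC/LCL structure enter the ingredient list alongside CAPE.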
For a deeper process understanding, we examined other relevant quantities. In particular, surface shortwave downwelling radiation (SWD) provides the source energy for ET and sensible heat flux (SH), which drive the development of CAPE and planetary boundary layer height (PBLH). The latter is expected to link with LFC and LCL as well as with the fraction of cloud in low layers (FCL), which in turn affects SWD and hence 2-m air temperature (T2m) and specific humidity (Q2m). Deep cumulus towers have higher and colder cloud tops, giving a larger fraction of cloud in high layers (FCH) and emitting less outgoing longwave radiation (OLR), as well as thicker optical depths, reflecting more solar radiation (RET). As the water vapor provider, TPW limits the total cloud (liquid plus ice) water path (CWP) available to the warm and cold cloud microphysics processes. Furthermore, we calculated net surface energy (NSE) as net downward radiation minus (latent plus sensible) heat fluxes at the surface, which respectively include SWD, ET, and SH effects, while cloud radiative effect (CRE) was defined as the (OLR plus RET) difference between clear and total sky at the top of the atmosphere.

Figure 3.1: The teleconnection patterns of the CM and GS regional mean precipitation interannual variations correlated with 850 hPa meridional wind distributions for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). Outlined separately for CM and GS is the core correlation area common to all seasons, where the V850 index is defined.

Dynamic quantities (except MC and W700) are more difficult to define because their relationships with precipitation are generally nonlocal (e.g., ????). Thus, we first examined their observed teleconnection patterns. This requires observed atmospheric circulation data, which is lacking.
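The two derived energy diagnostics defined above, NSE and CRE, reduce to simple arithmetic on the component fields; a minimal sketch with illustrative flux values (all in W m-2; function names and numbers are mine):

```python
def net_surface_energy(net_rad_down, latent, sensible):
    """NSE: net downward radiation minus (latent plus sensible)
    heat fluxes at the surface."""
    return net_rad_down - (latent + sensible)

def cloud_radiative_effect(olr_clear, ret_clear, olr_all, ret_all):
    """CRE: the (OLR plus RET) difference between clear and total sky
    at the top of the atmosphere."""
    return (olr_clear + ret_clear) - (olr_all + ret_all)

print(net_surface_energy(160.0, 80.0, 40.0))             # 40.0
print(cloud_radiative_effect(280.0, 60.0, 230.0, 90.0))  # 20.0
```

The same arithmetic applies gridpoint by gridpoint when the inputs are model or satellite fields rather than scalars.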
The best proxy is the NASA Modern-Era Retrospective Analysis for Research and Applications version 2 (NR2), with data available at a grid spacing of 50 km from 1980 onward (?). This offers a reanalysis independent of the driving ERI (~80 km), so that our diagnosis based on the CWRF downscaling simulations is not biased toward circulation errors inherited from ERI. Figure 3.1 shows the teleconnection patterns of the CM and GS regional mean precipitation interannual variations correlated with 850 hPa meridional wind distributions. The correlations were done for each season using monthly mean anomalies from daily maximum wind values to remove the influence of the annual and diurnal cycles. Strong positive correlations exist with 850 hPa southerly flows over the Southeast, with specific patterns that vary both across seasons and between CM and GS. For simplicity, we defined a low-level flow index (V850) as the daily 850 hPa meridional wind averaged over the core correlation area that is common to all seasons but separate for CM and GS.

This study used daily precipitation and T2m data based on the National Weather Service Cooperative Observer network (COOP), with quality-controlled records from 8516 stations (??). They were analyzed with topographic adjustments onto CWRF grids following ?. Table 3.2 summarizes the other daily observational data sources and available periods used in this study. These include: 1) CWP retrieved from the Advanced Very High-Resolution Radiometer (AVHRR)-based Thematic Climate Data Records (?); and 2) SWD, RET, OLR and CRE from the Surface Radiation Budget release 3.0 Global Energy and Water Cycle Exchanges project (SRB/GEWEX) Daily Universal Time Data (?). Furthermore, our diagnostic comparison used all respective variables from ERI and NR2.

Table 3.2: Observations, available years, and their references.

Precipitation (PR); Temperature at 2 meters (T2m) — 1980-2015 — COOP data (?)
Total precipitable water (TPW) — 2001-2013 — MODIS Daily Global data product (?)
Cloud (liquid plus ice) water path (CWP) — 1984-2008 — Advanced Very High-Resolution Radiometer (AVHRR) (??)
Shortwave downwelling at surface (SWD); Reflected solar radiation at top of atmosphere (RET); Outgoing longwave radiation, clear/all sky (OLR); Cloud radiative effect (CRE) — 1984-2010 — SRB3.0 (?)

These observational and reanalysis data were all interpolated onto the CWRF 30-km grid using a higher-order patch recovery algorithm from the Earth System Modeling Framework regridding package (?) to reduce data mapping error. One important exception was precipitation, whose daily values from COOP, ERI, and NR2 were all mapped onto the same CWRF grid by mass-conservative algorithms (?). This remapping procedure was intended to alleviate the impact of data scale mismatch on extreme precipitation analysis (?).

3.3 Extreme precipitation dependence on cumulus parameterization

? demonstrated that the simulation of climatological average seasonal P95 distributions is most sensitive to the choice of cumulus parameterization among all 25 CWRF physics configurations. This section elaborates on performance differences in interannual variability averaged over the GS and CM regions among ERI and the five CWRF cumulus schemes (ECP, NKF, TDK, NSAS, BMJ). Here we focused on these CWRF configurations, in which only the cumulus parameterization was changed and all other physics schemes were kept identical. Figure 3.2 compares CM and GS regional average P95 interannual variations during 1980-2015 for each season between NR2, ERI and the five CWRF cumulus schemes against observations. Also shown are the corresponding temporal correlations and rmse with respect to observations during the whole period. These two statistics differ from those shown in Fig.
2.9, depicting performance in the regional-mean temporal characteristics for the former rather than the time-mean spatial distributions for the latter.

Figure 3.2: Seasonal P95 interannual variations during 1980-2015 averaged over the CM (left) and GS (right) regions for spring (MAM), summer (JJA), autumn (SON), and winter (DJF) [from top down] as observed (OBS), assimilated by NR2 and ERI, and simulated by CWRF using five cumulus schemes (ECP, NKF, TDK, NSAS, BMJ). Also shown (bottom) are the corresponding temporal correlations (COR, scaled upward on the left) and root mean square errors (RMSE, scaled downward on the right) with respect to observations during the whole period.

In the CM region, ERI was the most realistic in winter, as compared to ECP's slight overestimation of P95 magnitude and variability. ERI and ECP performed comparably well in autumn. Interestingly, in spring ERI had a notably lower correlation but smaller rmse than ECP. This resulted from ERI's unrealistic decreasing trend and ECP's slight overestimation. On the other hand, in summer ERI had a zero correlation and large rmse, whereas ECP was much more realistic. This occurred because ERI not only substantially underestimated P95 but also produced a large unrealistic decreasing trend. In the GS region, ERI's correlation was notably higher than that of ECP in autumn and spring, comparable in winter, but lower in summer. However, in all seasons ERI produced notably larger rmse than ECP due to its substantial systematic underestimation. In the CM region, NR2 performed systematically better than ERI in simulating P95 variations, with higher correlations and smaller rmse in all seasons.
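The two skill scores used throughout this comparison, temporal correlation (COR) and root mean square error (RMSE) of a regional mean interannual series against observations, can be sketched as follows, with short synthetic series standing in for the actual P95 data:

```python
import math

def cor(x, y):
    """Pearson temporal correlation between two interannual series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(x, y):
    """Root mean square error of a simulation x against observations y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

obs = [20.0, 25.0, 22.0, 30.0, 27.0]   # synthetic observed P95 (mm/day)
sim = [18.0, 24.0, 20.0, 29.0, 25.0]   # synthetic simulated P95

print(round(cor(obs, sim), 3), round(rmse(obs, sim), 3))
```

Note that a simulation can track the observed year-to-year phasing well (high COR) while still carrying a systematic offset (large RMSE), which is exactly the distinction drawn repeatedly in this section.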
The skill difference was especially large in summer, when ERI produced significant underestimation and a spurious decreasing trend. NR2 was also more realistic than ECP in autumn and winter, while they were comparable in spring and summer. In the GS region, NR2 and ECP were comparable in winter and both slightly better than ERI, which underestimated P95. In autumn, NR2 was more skillful than ERI (whose rmse was larger due to systematic underestimation) and ECP (whose correlation was smaller due to less interannual correspondence). In summer, NR2 and ERI both underestimated P95, having similar rmse larger than ECP's, while ERI also incurred a notably smaller interannual correlation than NR2 and ECP. In spring, NR2 had a correlation comparable to ERI and higher than ECP, and an rmse similar to ECP's because of systematic underestimation, which was more severe for ERI. Overall, NR2 more faithfully reproduced the P95 annual cycle and interannual variations in both regions than ERI, and hence could serve as a reasonable proxy for the reference when observational data are lacking. Given that ERI assimilated pseudo-precipitation data, its interannual correspondence to observations is expected to be high, with a relatively high correlation and small rmse. [The same expectation applies to NR2.] The above comparison indicated otherwise. In cases where deep convection dominates (GS throughout the year and CM in summer and spring), ERI failed to capture the key processes responsible for precipitation extremes (?). Since CWRF did not directly assimilate surface observations but modeled precipitation as driven by the ERI general circulation through lateral boundary conditions (??), ECP's skill matching or even exceeding that of ERI is an outstanding achievement. Of all the cumulus schemes examined, ECP performed best overall, with the highest correlations and lowest rmse in both the CM and GS regions for most seasons. One exception was the GS region in autumn, where NKF produced a slightly higher correlation but similar rmse.
Another was in spring, when TDK's skill was comparable for GS and better for CM (smaller rmse, or less overestimation). Overall, NKF scored second, TDK third, and NSAS fourth, whereas BMJ was clearly the worst, with substantial systematic underestimation and little coherence with observations. NSAS generally underestimated P95, except in the CM region in spring and winter. TDK's performance was more mixed for both regions: similar to ECP in winter, slightly better than ECP in spring, close to NSAS in summer, and slightly worse than NKF in autumn. These results confirmed that cumulus parameterization played a dominant role in extreme precipitation simulations, in terms of not only climatological mean spatial distributions but also regional mean interannual characteristics.

Figure 3.3 compares daily RCT averaged in the CM and GS regions for each season. Daily ratios at individual grids were first grouped according to total precipitation intensity bins at an interval of 5 mm day⁻¹ and then averaged within those bins over all grids in the CM or GS region. Marked also are the corresponding regional average 1980-2015 mean seasonal P95 values. In striking contrast, for all seasons NKF produced the substantially highest ratios while TDK yielded the systematically lowest ratios across all precipitation intensities. The RCT ratios from light to heavy rain changed gradually in NKF, but much faster in TDK. Correspondingly, for both regions NKF notably overestimated spring and autumn P95 while TDK severely underestimated summer P95; otherwise they simulated P95 reasonably well. On the other hand, ECP produced intermediate ratios, which remained relatively stable and high for heavy precipitation events; its simulated P95 was the most realistic among all schemes, except for overestimation in CM spring and winter. To some extent, these results seem to agree with ?
in that a balanced convective contribution to total precipitation may be associated with extreme events. Models simulating larger (smaller) convective ratios tend to overestimate (underestimate) P95, especially when convection prevails. However, the situation is more complex. ERI, NSAS, and BMJ also produced intermediate ratios, but all substantially underestimated P95, with exceptions only in CM spring and winter for ERI and NSAS. While NR2 closely matched TDK for both RCT and P95 in autumn and winter, they differed significantly in other seasons. In spring, although RCT was very similar, NR2 underestimated (overestimated) P95 in the GS (CM) region while TDK was realistic. In summer, for the CM region with almost the same RCT, P95 was correctly simulated by NR2 but largely underestimated by TDK; for the GS region, NR2 produced much smaller RCT but less P95 underestimation than TDK. Therefore, the relative contribution of parameterized versus resolved rainfall to extreme events may not be as important as originally thought; some other processes must be at play.

Figure 3.3: Seasonal mean RCT distributions averaged in the CM (left) and GS (right) regions according to total precipitation intensity bins at an interval of 5 mm day⁻¹ for spring (MAM), summer (JJA), autumn (SON), and winter (DJF) [from top down] as assimilated by NR2 and ERI, and simulated by five CWRF cumulus schemes (ECP, NKF, TDK, NSAS, BMJ). Marked with circles are the corresponding regional average 1980-2015 mean seasonal P95 values, while the respective observed values are depicted by the vertical lines.

3.4 Correlation analysis of regional extreme precipitation biases

Tracing back specific causes of model biases is challenging, but necessary to guide future model improvement.
This is particularly so for extreme events, as they are rare but of high impact (?). Here we explored the possible physical processes that lead to CWRF performance differences in extreme precipitation simulation. To do so, we first analyzed possible relationships between biases in P95 and other fields on extreme event days. For a given simulation of ERI or CWRF, we first identified the date of the P95 event in each season of a year at a specific grid. Then, at that grid and with respect to the dates of all P95 events as a function of season and year, we calculated the biases of simulated precipitation and the relevant variables from the corresponding observations. Although the specific date on which the P95 event occurred in a season of a year was expected to differ from grid to grid, the composite biases retained the coherence among the variables of concern. These biases were averaged over the CM and GS regions to yield interannual variations in each season and for every simulation of ERI and CWRF's five cumulus members. The subsequent statistics (correlation coefficient and significance p-value) were based on these regional mean interannual variations of all six simulations between precipitation and each of the other variables for the entire period with observational data records as listed in Table 3.2. Thus, the total number of samples used in the statistics was 6 simulations times (25, 27, 36) years, or (150, 162, 216) samples, for the (CWP, SWD/RET/OLR/CRE, P95/T2m) variables. We also included bias correlations with DRI and NRD, which were calculated from the same COOP precipitation data as P95, and which were by definition seasonal values rather than values on specific P95 event dates. All other fields, while essential for process understanding, lacked daily observational data, and hence their departures from the respective NR2 results were used, giving 6 simulations times 36 years, or 216 samples.
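The compositing procedure just described can be sketched as follows. The event-date selection used here (the day whose observed rainfall is nearest the seasonal P95 threshold) and the array layout are illustrative assumptions, not the dissertation's actual analysis code:

```python
import numpy as np

def p95_event_composite_bias(sim, obs, q=0.95):
    """For daily fields sim and obs with shape (days, ny, nx), locate the
    day of the seasonal P95 event in obs at each grid cell, take the
    sim-minus-obs bias on that day, and average over the region."""
    # Day whose observed precipitation is closest to the P95 threshold
    thresh = np.quantile(obs, q, axis=0)                   # (ny, nx)
    event_day = np.abs(obs - thresh[None]).argmin(axis=0)  # (ny, nx)
    ny, nx = event_day.shape
    jj, ii = np.meshgrid(range(ny), range(nx), indexing="ij")
    bias = sim[event_day, jj, ii] - obs[event_day, jj, ii]
    return bias.mean()

rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 4.0, size=(90, 4, 5))    # synthetic daily precipitation
sim = obs * 0.8                               # a model that underestimates
print(p95_event_composite_bias(sim, obs) < 0) # negative composite bias
```

Repeating this for every season, year, and simulation yields the regional mean interannual bias series on which the correlation statistics are computed.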
For any pair of variables, the correlation could be based on their biases if both had observations, or otherwise on their departures. This mixture, however, would cause two issues. First, the P95 event dates simulated by NR2 likely differ from those observed, and so the biases and departures would be based on different events for the fields with and without observational data. Second, the cross-field correlations and subsequent regression models would have to be built on the shortest common period (25 years, or 150 samples), and so the results would suffer from large uncertainties. To alleviate these issues, we first established the consistency between the bias and departure correlations of P95 with those fields that had observational data, then examined the functional relationships among the departures of all fields, and finally built the regression models on these departures in Section 5 to best capture P95 structural dependences on the key contributing fields. Correlations are considered statistically significant if they pass the Student's t-test at the one-tailed 95% confidence level.

Figure 3.4 compares the above P95 composite bias and departure correlations with those fields that had observational data in all seasons, while Figs. 3.5-3.8 compare the composite departure correlations across all ingredient fields in spring, summer, autumn, and winter, respectively. In both (CM, GS) regions, P95 and DRI biases were strongly correlated in phase, with coefficients increasing from summer (0.62, 0.71), autumn (0.80, 0.79), and spring (0.89, 0.87) to winter (0.89, 0.89). A model simulating stronger (weaker) mean rainfall intensity likely produced higher (lower) extreme precipitation. The correlation with NRD was positive only for GS summer (0.35), reversed in winter (-0.41) and spring (-0.21), and insignificant in autumn; it was systematically much weaker for CM, significant only in summer (0.26) and winter (-0.23).
Thus a simulation biased toward more rainy days tended to overestimate summer but underestimate winter (and weakly spring) extreme precipitation; this tendency was more evident in the GS than the CM region, and not obvious in autumn. Most of these bias correlations were faithfully reproduced by the departures from NR2 in both sign and magnitude.

Figure 3.4: Composite P95 bias (blue) and departure (red) correlations with those fields that had observational data (DRI, NRD, SWD, RET, OLR, CRE, CWP, T2m), as well as P95 departure correlations with rainfall components (PL, PC), in the CM (left) and GS (right) regions for spring (MAM), summer (JJA), autumn (SON), and winter (DJF) [from top down]. A star indicates that the correlation is statistically significant; significant correlations are labeled with a number equal to the correlation coefficient times 100.

Figure 3.5: Spring composite departure correlations across P95 and all its ingredient fields in the CM (upper triangle) and GS (lower triangle). Each is coded with a color and, if statistically significant, also labeled with a number equal to the correlation coefficient times 100. The diagonal represents the P95 correlation with each named field.

Figure 3.6: Same as Fig. 3.5 except for summer.

Figure 3.7: Same as Fig. 3.5 except for autumn.

Figure 3.8: Same as Fig. 3.5 except for winter.

The correlation between P95 and parameterized or convective precipitation (PC) departures was strong for both (CM, GS) regions in summer (0.54, 0.68), significant in spring (0.47, 0.48) and autumn (0.36, 0.43), and reversed or insignificant in winter (-0.21, -0.13). On the other hand, the correlation with resolved or explicit precipitation (PL) departures was significant for both (CM, GS) regions only in spring (0.38, 0.24) and autumn (0.44, 0.17), and in CM summer (0.19). Therefore, extreme precipitation simulation is more sensitive to cumulus parameterization than to cloud microphysics, especially in summer. However, P95 and RCT departure correlations were negative and significant for both (CM, GS) regions in autumn (-0.46, -0.20) and winter (-0.51, -0.36), and in GS spring (-0.25).
As compared to the P95-PC correlations, they were drastically weakened in both summer and spring, totally reversed in autumn, and largely strengthened in winter. As discussed earlier, the role of RCT in P95 was very mixed. Positive correlations between P95 and SWD biases were significant only for CM summer, spring and autumn (0.24, 0.26, 0.20) and GS summer and spring (0.27, 0.17). These bias correlations were generally captured by the corresponding departures from NR2, although systematically strengthened for CM from spring to winter (0.42, 0.31, 0.38, 0.33) and for GS spring, summer and winter (0.23, 0.24, 0.20). Positive correlations between P95 and ET departures were significant for both (CM, GS) regions only in spring (0.34, 0.16) and autumn (0.25, 0.20). In these transition seasons, the regional water recycling process plays a certain role in extreme precipitation formation, and a simulation with insufficient (excessive) water supply through ET could underestimate (overestimate) P95. Positive correlations between P95 and SH departures were significant for the CM region from spring to winter (0.44, 0.35, 0.43, 0.33), while generally reduced for the GS region (0.26, 0.27, 0.18, 0.32). Hence, surface energy supply through SH could affect P95 significantly. These results were coherent, since SWD is the dominant source for ET and SH, although their effects on extreme precipitation strongly depended on region and season due to other feedback processes. P95 and NSE departures were correlated significantly only in CM from spring to autumn (0.32, 0.16, -0.36), again indicating feedback effects. Positive correlations between P95 and OLR biases were significant in the CM region from spring to winter (0.43, 0.27, 0.30, 0.34). They were largely reduced in the GS region, significant only in summer (0.22). Meanwhile, negative correlations between P95 and RET biases were significant only in CM spring and winter (-0.26, -0.23).
Consequently, positive correlations between P95 and CRE biases were significant only in CM spring and winter (0.40, 0.27) and GS spring (0.22). Deeper convection produced smaller OLR that overcame larger RET to reduce CRE, a net warming effect on the earth. Underestimation of deep convection could reduce extreme precipitation, more so in the CM than the GS region. As compared to these bias correlations, P95 correlations with OLR departures were very close in the CM region from spring to winter (0.43, 0.23, 0.30, 0.38), while in the GS region they were strengthened in spring (0.25), slightly weakened in summer and winter (0.17, 0.18), and still insignificant in autumn. For RET departures, the correlations were generally strengthened in the CM region from spring to winter (-0.40, -0.27, -0.33, -0.29), and also in the GS region in spring, summer and winter (-0.19, -0.18, -0.18), but still insignificant in autumn. For CRE departures, the correlations in both (CM, GS) regions were close in spring (0.40, 0.18), strengthened in summer (0.37, 0.21) and autumn (0.16, 0.24), and in winter significant only in GS (0.15). Considering the large uncertainties in the satellite estimates, these departure and bias correlations agreed reasonably well. Positive correlations between P95 and CWP biases were strong in the CM region from spring to winter (0.53, 0.59, 0.52, 0.54), and also in the GS region except for a weaker summer (0.62, 0.20, 0.51, 0.63). This agrees with ? in that the cloud microphysics process is important to extreme precipitation simulation. Positive correlations were also shown by the corresponding departures from NR2, although systematically weakened for CM from spring to winter (0.20, 0.22, 0.23, 0.29) and for GS from summer to winter (0.18, 0.26, 0.37). Such departure and bias correlation differences could be partly due to large uncertainties in the satellite estimates.
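The significance criterion used in this section (a one-tailed Student's t-test at the 95% level) implies a minimum detectable correlation magnitude for each sample size. A quick sketch, using the large-sample critical value t ≈ 1.65 as an approximation rather than an exact table lookup:

```python
import math

def min_significant_r(n, t_crit=1.65):
    """Smallest |r| that is significant under a one-tailed t-test,
    solved from t = r * sqrt(n - 2) / sqrt(1 - r**2)."""
    df = n - 2
    return t_crit / math.sqrt(df + t_crit ** 2)

# Sample sizes used in Section 3.4: 6 simulations x (25, 27, 36) years
for n in (150, 162, 216):
    print(n, round(min_significant_r(n), 3))
```

With 216 samples this threshold is about 0.11, consistent with correlation labels as small as 15-16 (times 100) appearing as significant in Figs. 3.5-3.8.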
Positive correlations between P95 and TPW departures were strong in spring and summer for the CM region (0.50, 0.47) but systematically reduced for the GS region (0.16, insignificant); they were weaker in autumn and winter for the CM region (0.25, 0.19) and stronger for the GS region (0.18, 0.29). This agrees with Kunkel et al. (2013a) in that TPW is generally a good indicator of the upper limit of extreme precipitation. These results were consistent, since atmospheric water vapor abundance (TPW) is necessary for cloud liquid and ice water (CWP) formation, and their impacts on P95 were systematically enhanced from the GS to the CM region, mainly because water supply is typically more limited (hence a stronger dependence) over inland than coastal areas. Correlations between P95 and T2m biases were positive for the CM region, strong in spring (0.69) and summer (0.65) but insignificant in autumn and winter. They were reduced for the GS region in spring (0.40) and summer (0.27) but increased in autumn (0.31) and winter (0.20). These bias correlations were faithfully reproduced by the departures from NR2 in both sign and magnitude. In both regions and for all seasons, strong T2m departure correlations were found with NSE, Q2m, TPW, PBLH, and CAPE (positive) as well as LFC (negative), in the magnitude range of 0.40-0.81 with very few exceptions. Consistently, greater NSE produced warmer T2m, higher Q2m, more TPW, higher PBLH, lower LFC, and larger CAPE. In spring and summer, strong T2m departure correlations were also found with SWD, SH, ET, and OLR (positive) as well as FCL, FCH, and RET (negative), in the magnitude range of 0.48-0.82. Greater SWD led to warmer T2m, larger SH and ET fluxes, less FCL and FCH cloud, and thus reduced RET and increased OLR. On the other hand, these correlations in autumn and winter were totally reversed in sign, most often weakened in magnitude, and some became insignificant.
Therefore, surface-atmospheric and cloud-radiative interactions change substantially from spring and summer to autumn and winter, when the regional precipitation processes change from the dominance of deep convection to stratiform systems. Positive correlations between P95 and PBLH departures were significant in CM spring, summer and winter (0.33, 0.30, 0.23), and GS spring and autumn (0.17, 0.21). They were identified with strong PBLH links to T2m (discussed above) and other fields. In spring and summer, strong PBLH departure correlations were found in both regions with SH, ET, SWD, OLR and CRE (positive) as well as FCL, FCH and RET (negative), in the magnitude range of 0.50–0.88. Greater SWD, SH and ET caused higher PBLH and an elevated cumulus base (smaller FCL) while reducing cloud depth (smaller FCH), and thus resulted in less RET and more OLR and CRE. In autumn and winter, with a few exceptions, they were substantially reduced in magnitude, and some became insignificant (especially in CM) or changed in sign. This again could be identified with the seasonal change from convective to stratiform precipitation dominance. The above analysis showed that P95 correlations with key fields' biases from observations were well captured by those with the corresponding departures from NR2. The bias and departure correlations closely resembled each other in both interannual variations and seasonal contrasts. Thus, we can analyze the relationships across the simulated departures from NR2 to determine the physical processes that may cause P95 biases from observations. Such a causal analysis would be impossible if based on observational data that were available for only a few fields. In addition, Figs. 3.4-3.8 contain much more complicated cross-field relationships than we discussed, and require advanced machine learning techniques to disentangle them in order to identify the most plausible mechanisms underlying the P95 departures or biases.
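As a minimal, self-contained illustration of this bias-versus-departure comparison (the numbers below are made-up stand-ins for the interannual composite series, and the helper names are ours, not from the CWRF analysis code), the two correlation series can be computed as:

```python
import math

def pearson(x, y):
    """Plain Pearson correlation coefficient for two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Illustrative interannual series (one value per year) for one season/region.
p95_model = [12.0, 14.5, 11.2, 15.8, 13.1, 16.0]  # simulated P95 composite
p95_obs = [11.0, 13.0, 10.5, 14.0, 12.0, 14.8]    # observed P95
p95_nr2 = [11.5, 13.2, 10.8, 14.5, 12.2, 15.0]    # reanalysis (NR2) P95
fld_model = [3.1, 3.9, 2.8, 4.2, 3.3, 4.4]        # simulated key field (e.g., TPW)
fld_obs = [3.0, 3.5, 2.9, 3.8, 3.2, 4.0]          # observed field
fld_nr2 = [3.2, 3.6, 3.0, 3.9, 3.1, 4.1]          # NR2 field

# Bias = model minus observation; departure = model minus the NR2 proxy.
r_bias = pearson([m - o for m, o in zip(p95_model, p95_obs)],
                 [m - o for m, o in zip(fld_model, fld_obs)])
r_departure = pearson([m - r for m, r in zip(p95_model, p95_nr2)],
                      [m - r for m, r in zip(fld_model, fld_nr2)])
print(f"bias corr = {r_bias:.2f}, departure corr = {r_departure:.2f}")
```

When the two correlation series track each other, analyzing departures from the reanalysis is a defensible surrogate for analyzing biases from sparse observations, which is the argument made above.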
3.5 Process understanding of P95 biases by structural equation modeling
The previous two sections illustrated that the regional precipitation extremes resulted from variations in a complex coupled climate system, whose component processes interacted with each other through direct and indirect effects. Strong correlations existed among the select fields (Figs. 3.5-3.8) representing these processes. These fields could act and/or counteract on extreme precipitation formation. Our objective was to build robust regression models of these fields to quantify their relative contributions and interpret the underlying processes affecting P95 simulation. Direct inclusion of all these highly correlated fields as predictors would cause multicollinearity, leading to unstable regression models that may suffer from double-counting particular physical factors and overfitting certain numerical parameters. Such models can violate the parsimony principle and miss the big picture that the reality may present. Here we used the structural equation model (SEM) framework to solve the problem of multicollinearity. The framework is an extension of confirmatory factor analysis and has the ability to test hypotheses on causal relationships in the presence of a physically based experimental design (?). An SEM consists of manifest (measurable) and latent (hypothetical) variables. All the fields listed in Figs. 3.5-3.8 were considered as candidates for manifest variables. Based on the physical understanding from the previous two sections, we constructed four latent variables: energy supply, water supply, surface forcing, and cloud forcing. As discussed earlier, sustained energy and water supplies directly power the climate system processes for extreme precipitation.
While surface forcing acts to couple surface energy and water sources with atmospheric precipitation processes, cloud forcing works to regulate such surface-atmospheric coupling in order to balance the energy and mass budgets of the earth system. Given P95 as the predictand and 22 manifest variables as the predictor candidates to choose from, huge flexibility exists in constructing these latent variables, including their member predictors and directional effects. Figure 3.9 illustrates our conceptual design of the experimental SEM for extreme precipitation, where each latent variable was designated with 4 or 6 exclusive predictor candidates that are strongly correlated. Energy and water supplies may each affect both surface and cloud forcings, and surface forcing may also affect cloud forcing, while all the four latent variables may finally affect P95. Therefore, the SEM reduces the dimensionality by using designated grouping
[Figure 3.9 schematic]
Figure 3.9: The conceptual design of the experimental SEM for extreme precipitation (EP). The center oval represents the predictand (EP), while each outer oval defines one latent variable (LV) with a list in a brace of the designated manifest variables (MV) as the predictor candidates. There are four LVs, hypothetically representing energy supply (ES), water supply (WS), surface forcing (SF), and cloud forcing (CF). Each effect from one LV to another or to EP is expressed by an arrow for its direction and a coefficient along the line for its strength. The SEM consists of regression equations from these MVs through LVs to EP, as depicted at the lower right corner.
to substitute the 22 manifest variables into the 4 latent variables. The dimensionality can be further reduced through system optimization that maximizes the SEM performance while minimizing the model complexity. The SEM, as a network of knowledge, can be made more robust by designating into each latent variable more tightly connected manifest variables to reduce the uncertainty of the group and so of the whole network. We did this designation based on the physical understanding and correlation analysis discussed earlier, with three additional considerations: the manifest variables selected for each latent variable were highly correlated so as to enhance the unidimensionality of the group; they were close to each other on the hierarchical clustering dendrogram and so in the multidimensional space; and they were as consistent as possible among different seasons and regions. The resulting group of the (exclusive) manifest variables was listed for each latent variable in Fig. 3.9. They were linked to P95 through the respective latent variables, and the latter were coupled to form the SEM. These links and couplings were made by a set of regression equations, each containing an error term. It is important to note that these error terms are not orthogonal to other regression fields, and may contain a significant portion of their covariance unexplained. The SEM so constructed represents the integrated impact on P95 through the active latent variables, each of which consists of the variations manifested coherently in the select measured fields as they are significantly related to P95. The manifest variables listed were still just candidates, not finalists. Assume each latent variable consists of at least two manifest variables. The total number of possible combinations for a single latent variable is then the sum of choosing 2 to 4 (6) out of 4 (6) manifest variables, that is, 11 (57).
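These per-variable counts, and the counts of direct structural links quoted in the next paragraph, can be checked with a few lines of stdlib Python (a verification sketch only; the function and variable names are ours, not from the study):

```python
from math import comb

# Each latent variable keeps between 2 and all of its designated
# manifest-variable candidates.
def n_groupings(n_candidates, k_min=2):
    return sum(comb(n_candidates, k) for k in range(k_min, n_candidates + 1))

print(n_groupings(4))  # energy supply: 4 candidates -> 11
print(n_groupings(6))  # water supply, surface forcing, cloud forcing: 6 each -> 57

# Direct structural links: every nonempty subset of the possible upstream
# latent variables may act on surface forcing (2 sources), cloud forcing
# (3 sources), and P95 itself (4 sources).
for n_sources in (2, 3, 4):
    print(2 ** n_sources - 1)  # -> 3, 7, 15
```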
At the latent variable level, one or both of energy and water supplies act on surface forcing, any combination of these three in turn affects cloud forcing, and any of these four finally impacts P95. Thus, there are 3, 7 and 15 combinations respectively for surface forcing, cloud forcing and P95. These counts, however, consider only direct effects between two paired variables, while their mixtures can exert an exponentially increased number of indirect effects. Together, there are 230,938,920 direct plus indirect effects on P95, or ~231 million alternative SEMs in each region (CM or GS) and each season. This study used the open-source "lavaan" software version 0.6-3 (?) on the "R" platform to construct the SEMs through unconstrained optimization, adopting the default algorithm configuration for all parameters and settings (including the maximum likelihood estimator). The goal of the optimization is to minimize the difference between the measured and implied cross-covariance matrices and, as stated by ?, to "discover a model with three attributes: it makes theoretical sense, it is reasonably parsimonious, and it has acceptably close correspondence to the data". Given the huge number of alternatives, the whole process of the optimization (brute-force search) requires machine learning via supercomputing to construct potential SEMs. Many of these alternatives failed to reach a solution with stable regression coefficients, and so were filtered out. The number of successfully constructed SEMs was still tremendous. We therefore searched for the final SEM under the following conditions. First, the sign of the total (direct plus indirect) effect of each included manifest variable on P95 must be preserved the same as its original correlation if significant (Figs. 3.5-3.8). Second, the total explained variance (R2) of P95 departures must be greater than 0.8. [All departures were normalized to zero mean and unit deviation.]
Third, the comparative fit index (CFI, ?), one of the most popular indicators of goodness of fit (?), must be larger than 0.9. Fourth, a bootstrap resampling procedure must succeed in estimating the uncertainty range of every regression coefficient across all acting latent variables and P95; otherwise the SEM would be unstable in its structure, containing substantial uncertainties. Finally, the Akaike information criterion (AIC), an integrated measure of both model fitness and complexity, should be the lowest for the most preferred SEM that fits best but keeps parsimony (to avoid overfitting). ? indicated the need to consider other competitors only if their AIC differences from the minimum are smaller than 10. The above selection rules led to a unique choice of a single best SEM for each region (CM or GS) and each season. Figures 3.10-3.13 illustrate these finalist SEMs as paired by the CM and GS regions for individual seasons. They include the active manifest variables of each latent variable, the strength (coefficient) and direction (arrow) of each effect, and the four performance scores (R2, CFI, AIC and its increment to the next competitor, ΔAIC). The net effect from a latent variable onto P95 is its direct effect (coefficient on the immediate arrow line) plus indirect effects (product of all coefficients along each directional path) (?). Below, the result is interpreted in terms of the relative importance to P95, expressed as the percentage of any effect's coefficient over the sum of the absolute net-effect coefficients from all active latent variables. It is stated as [RI = %] in the text and also shown in Figs. 3.10-3.13. In CM spring (Fig. 3.10), the P95 departure was dominated by a positive direct effect of the surface forcing departure [RI = 85%], where greater CAPE and higher
Figure 3.10: Spring finalist SEMs for CM (upper) and GS (lower).
Each SEM panel includes its structure (left) with the active manifest and latent variables and their directional effect coefficients, its performance scores (upper right corner), and the relative importance of each latent variable's direct, indirect and total effects (bottom).
Figure 3.11: Same as Fig. 3.10 except for summer.
Figure 3.12: Same as Fig. 3.10 except for autumn.
Figure 3.13: Same as Fig. 3.10 except for winter.
PBLH led to larger extreme precipitation. Such surface forcing was supported by larger energy supply [RI = 13%], which consisted of more solar radiation incoming to the surface (SWD), larger sensible heat release to the atmospheric boundary layer (SH), and [1/2] net surface energy surplus (NSE). Meanwhile, precipitation directly consumed energy supply [RI = -24%]. Larger precipitation also depleted more low-level cloud (FCL), reflected less solar radiation (RET), and so reduced cloud forcing [RI = -2%], which in turn increased energy supply, causing a small positive feedback [RI = 2%]. Combining its direct and indirect (through mainly surface and trivially cloud forcings) effects, energy supply impacted P95 with a total importance of 13%, which was smaller than surface forcing even without its support [RI = 85-22 or 63%]. Therefore, surface forcing and energy supply were respectively the first and second most important factors determining P95 in CM spring, while cloud forcing played a very minor role and water supply had negligible influence. In GS spring (Fig. 3.10), the P95 departure was dominated by a positive direct effect of the surface forcing departure [RI = 84%], where greater CAPE and lower [3/4] LCL drove larger extreme precipitation. The latter depleted more low-level cloud (FCL), reflected less solar radiation (RET), and so reduced cloud forcing [RI = -3%], which in turn reduced surface forcing, causing a tiny negative feedback.
Meanwhile, larger precipitation removed more water from the atmospheric column (TPW) and the moisture transport (V850), and thus directly consumed most of the water supply [RI = -97%]. The latter, however, was replenished by surface forcing in a greater amount. Since its direct and indirect (through mainly surface and trivially cloud forcings) effects almost canceled each other, water supply impacted P95 with a net importance of only 7%. In contrast, energy supply (consisting of SWD and SH) had only indirect effects through surface and cloud forcings (both positive) on P95, with a total importance of just 6%. Therefore, surface forcing was the predominant factor determining P95 in GS spring [84%], cloud forcing had a very minor effect [-3%], while both energy and water supplies shared evenly the remaining portion. In CM summer (Fig. 3.11), the P95 departure was dominated by a positive direct effect of the water supply departure [RI = 60%], where stronger upward motion (W700) and higher surface humidity (Q2m) led to larger extreme precipitation. The latter depleted more low-level cloud (FCL), reflected less solar radiation (RET), and so directly reduced cloud forcing [RI = -20%], which in turn indirectly increased energy supply, causing a strong positive feedback [RI = 21%]. The small surplus of energy supply over cloud forcing implied that the combined effect of SWD, SH, OLR, and [1/2] NSE outweighed that of RET and FCL by a little. Therefore, water supply and energy supply or cloud forcing were respectively the first and second most important factors determining P95 in CM summer, while surface forcing had negligible influence. In GS summer (Fig. 3.11), the P95 departure was dominated by a negative direct effect of the cloud forcing departure [RI = -84%], where larger precipitation depleted more FCL, reflected less RET, and yielded smaller CRE.
Meanwhile, cloud forcing had a strong positive indirect feedback from surface forcing [RI = 97%], where mainly lower PBLH and secondarily [1/3] higher LFC led to larger extreme precipitation. This indirect effect of surface forcing canceled most of its direct effect to produce a tiny negative net impact on P95 [RI = -2%]. In addition, energy supply, consisting of mainly SH and [2/3] NSE, exerted a weak positive direct effect [RI = 6%]. On the other hand, water supply, mainly from the regional recycling ET, had only negative indirect effects through primarily surface and trivially cloud forcings [RI = -8%]. Therefore, cloud forcing was the predominant factor determining P95 in GS summer, while energy and water supplies exerted much weaker effects, and surface forcing had a trivial impact. The physical mechanism for the P95 departure in both CM and GS autumn (Fig. 3.12) was essentially identical to that in CM summer (Fig. 3.11). In all cases, water supply and energy supply or cloud forcing were identified as respectively the first and second most important factors determining P95, while surface forcing had negligible influence. Their corresponding RI values were [60%, 20%, -20%] for CM summer, [58%, 20%, -22%] for CM autumn, and [56%, 21%, -23%] for GS autumn. The manifest variables were identical (FCL, RET) for cloud forcing, common (SWD, OLR) for energy supply except with additional (SH, NSE) in CM summer, and the same (W700, Q2m) for water supply except with a replacement (TPW, 1/100 V850) in CM autumn. The last subtle change was the additional tiny indirect effect from water supply through cloud forcing in CM autumn [RI = -1%]. In CM winter (Fig. 3.13), the P95 departure was determined by two opposite direct effects: negative water supply (ET, 3/50 V850) [RI = -51%] and positive energy supply (SWD, 3/5 SH) [RI = 31%].
The former also had a negative indirect effect through cloud forcing [RI = -11%], and hence its total effect [RI = -62%] was even stronger than energy supply. Larger extreme precipitation was maintained by more surface energy supply, which consisted of primarily more incoming solar radiation (SWD) and secondarily larger sensible heat release (SH). Meanwhile, precipitation directly consumed water supply that predominantly recycled from surface evapotranspiration (ET). Strikingly, the direct effect of cloud forcing in winter was positive [RI = 7%], totally opposite from other seasons. Hence, larger low cloud amount (FCL), which reduced CRE, actually resulted in greater extreme precipitation. This is reasonable since winter precipitation is dominated by stratiform systems, where sustained water supply maintains low clouds while steadily precipitating. In contrast, convective precipitation prevails in other seasons, depleting clouds much faster. Therefore, water and energy supplies were the two critical counteractive factors determining P95 in CM winter, while cloud forcing played a secondary but positive role and surface forcing had negligible influence. In GS winter (Fig. 3.13), the P95 departure was dominated by a positive direct effect of the surface forcing departure [RI = 87%], where greater CAPE and higher [2/5] PBLH led to larger extreme precipitation. Such surface forcing was supported by larger water supply [RI = 70%], which consisted of higher atmospheric water content (TPW) and stronger upward motion (W700). Meanwhile, larger precipitation directly consumed more water supply [RI = -62%]. Due to the near cancelation between its direct (negative) and indirect (positive) effects, water supply had only a small net impact on P95 [RI = 8%]. Energy supply had an even smaller effect [RI = 5%], which consisted of more incoming solar radiation (SWD) and outgoing longwave radiation (OLR).
Therefore, surface forcing was the predominant factor determining P95 in GS winter, while water and energy supplies exerted much weaker effects, and cloud forcing had a trivial impact.
3.6 Summary and conclusions
In this study, we took on the challenge of uncovering the physical mechanisms that can explain how cumulus parameterization determines CWRF's ability in simulating U.S. extreme precipitation as identified in our companion paper (?). The challenge arose from the rareness of extreme events, the lack of observational data, and the complexity of physical processes. To disentangle the problem, we made three key analyses. First, we analyzed interannual variations of spatial averages over the two distinct regions, CM and GS, where ERI substantially underestimated the climatological mean P95 while CWRF realistically captured it. We found that, of all the cumulus schemes tested in CWRF, ECP best simulated P95 interannual variability (with highest correlations and lowest rmse) in both the CM and GS regions for most seasons, and also performed generally better than ERI. Hence, cumulus parameterization significantly affected extreme precipitation simulation in terms of not only climatological mean spatial distributions but also regional mean interannual characteristics. However, we found that the relative contribution of the parameterized versus resolved rainfall (RCT) to extreme events was not as important as originally thought in the literature. In addition, we showed that NR2 more faithfully reproduced the P95 annual cycle and interannual variations in both regions than ERI, and hence we adopted it as the proxy reference when lacking observational data. Second, we analyzed interannual correlations of P95 biases (from observations) and/or departures (from NR2) with those of seasonal statistics (DRI, NRD) and of the P95 event-based composites (22 key fields).
The composite was made for each field in the same simulation by first identifying the date when the P95 event occurred in a season of a year at a specific grid and then averaging the field's bias or departure on that date over all grids within the CM or GS region. We showed that the P95 bias correlations with all the fields that had good observational data (DRI, NRD, T2m, SWD, OLR, RET, CRE, CWP) were well captured by the corresponding departure correlations in both interannual variations and seasonal contrasts. Thus, it is reasonable to assume that the relationships underlying the simulated departures from NR2 could represent the mechanisms responsible for P95 biases from observations. We found that the departures of all six simulations, that is, ERI and CWRF's five cumulus members (ECP, NKF, TDK, NSAS, BMJ), contained significant correlations across P95 and many of the 22 fields. These correlations, however, varied greatly in sign and magnitude, from -0.99 to +0.97, depending on season and region, and also were interdependent across multiple fields. They could act or counteract on extreme precipitation formation. They were so complex that a coherent picture of the plausible mechanisms for P95 departures could not be readily discerned. Third, we sought machine learning based on the SEM framework to build robust regression models of these correlated fields to quantify their relative contributions and interpret the underlying processes affecting P95 simulation. The SEM is an extension of confirmatory factor analysis and has the ability to test hypotheses on causal relationships in the presence of multicollinearity. Based on our physical understanding and clustering analysis, we constructed four latent variables: energy supply (SWD, SH, NSE, OLR), water supply (ET, MC, TPW, Q2m, W700, V850), surface forcing (T2m, CAPE, CIN, PBLH, LFC, LCL), and cloud forcing (RET, CRE, CWP, FCL, FCH, RCT).
We then defined objective selection rules using four performance scores (R2, CFI, AIC, ΔAIC) and searched through ~231 million alternative SEMs in each region (CM or GS) and each season. We finally discovered a unique finalist SEM for each region and season that is physically reasonable, structurally parsimonious, and optimally fits the data of the simulated P95 departure correlations with the responsive fields. They could be grouped to represent five distinct physical mechanisms as discussed below. The finalist SEMs for CM summer as well as CM and GS autumn closely resembled one another, suggesting an essentially identical mechanism in which water supply [RI = 56-60%], energy supply [RI = 20-21%], and cloud forcing [RI = -20% to -23%] jointly determined P95. Here stronger W700 and higher Q2m in CM summer and GS autumn, or greater TPW in CM autumn, caused larger P95, which in turn depleted more FCL and reflected less RET, and consequently increased both SWD and OLR. The SEMs for GS spring and winter were basically the same, suggesting a similar mechanism in which surface forcing was the predominant factor [RI = 84-87%] while energy and water supplies shared evenly the remaining portion [RI = 5-8%]. Here greater CAPE combined with lower [3/4] LCL in spring or higher [2/5] PBLH in winter caused larger P95, which directly consumed most of TPW plus moisture transported by southerly wind (V850) in spring or by ascending motion (W700) in winter, while the water supply residual and the energy supply from more SWD plus larger SH in spring or OLR in winter both supported surface forcing. The SEM for CM spring also revealed a mechanism in which surface forcing was predominant [RI = 85%], but it was supported by energy supply alone [RI = 13%]. In contrast, the SEM for GS summer portrayed a mechanism in which cloud forcing was predominant [RI = -84%], because larger precipitation directly depleted more FCL, reflected less RET, and yielded smaller CRE.
Meanwhile, water supply [-8%] was opposite to energy supply [6%]. On the other hand, the SEM for CM winter showed a different mechanism, in which water and energy supplies were the two critical counteractive factors [RI = -62%, 31%], while cloud forcing played a secondary but positive role [RI = 7%]. The effects of water supply and cloud forcing were both opposite to those in summer and autumn, since the prevailing precipitation system changed from convective to stratiform processes. Among the 22 fields listed as possible manifest variables, 15 were selected at least once as constituents of the latent variables with notable importance [|RI| > 3%] in the finalist SEMs. Across the eight SEMs, the select fields (occurrences) were SWD (7), FCL (5), SH (5), OLR (4), RET (4), CAPE (3), TPW (3), W700 (3), NSE (3), CRE (2), ET (2), PBLH (2), Q2m (2), V850 (1), and LCL (1). This sequence indicated the decreasing degree of their relevance to P95, while the relative importance of their actual effect on P95 was determined by the product sum of the strength coefficients along all their directional paths to P95. Both the relevance and importance depended strongly on season and region. Notably, MC, RCT, T2m, CWP, FCH, CIN, and LFC were not on the list. Missing MC is no surprise, since its correlation with P95 was always insignificant. Missing RCT confirms our initial finding that the relative contribution of the parameterized versus resolved rainfall was not important, although it had significant negative correlations with P95 in autumn and winter. However, missing T2m is not expected, especially in CM spring and summer when it had substantially high correlations with P95 and other fields. The more representative fields such as SWD, FCL and SH must have incorporated the T2m role. Missing the other fields can be similarly understood.
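The total-effect bookkeeping used above (direct coefficient plus the product of coefficients along every directional path) can be sketched in a few lines of Python. The graph and coefficient values below are made-up placeholders, not the finalist SEM estimates:

```python
# Directed edges of a toy SEM: (source, target) -> path coefficient.
# ES/WS/SF/CF are the latent variables; EP is extreme precipitation.
coeffs = {
    ("ES", "SF"): 0.5, ("WS", "SF"): 0.4,
    ("SF", "CF"): -0.3, ("ES", "CF"): 0.2,
    ("ES", "EP"): 0.1, ("SF", "EP"): 0.6, ("CF", "EP"): -0.2,
}

def total_effect(src, dst):
    """Sum over all directed paths of the product of edge coefficients."""
    if src == dst:
        return 1.0
    return sum(c * total_effect(mid, dst)
               for (s, mid), c in coeffs.items() if s == src)

net = {v: total_effect(v, "EP") for v in ("ES", "WS", "SF", "CF")}
denom = sum(abs(x) for x in net.values())
ri = {v: 100 * x / denom for v, x in net.items()}  # relative importance, %
print(net, ri)
```

Because the structural graph is acyclic, the recursion terminates; the relative importance of each latent variable is then its signed net effect as a percentage of the sum of absolute net effects, matching the [RI = %] convention above.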
Unfortunately, on the final list of the select manifest variables, only SWD, RET, OLR and CRE had long records of good-quality observational data, making it difficult to objectively rank the actual model performance. It is even more difficult to directly compare CWRF against ERI since only the latter had constrained these radiation fields through satellite data assimilation. Nonetheless, our SEM analysis discovered the five distinct physical mechanisms that clearly explained how P95 was simulated differently by ERI and CWRF. In particular, CWRF using type I cumulus parameterization schemes (ECP, NKF) simulated both P95 and the four radiation fields more realistically than using type II schemes (TDK, NSAS, BMJ). The choice of cumulus parameterization affected how water and energy supplies acted through surface and cloud forcings, and thus determined CWRF's ability to simulate U.S. extreme precipitation. In our subsequent papers, we will conduct perturbation experiments, in which the model representation of surface-atmospheric and cloud-radiative interactions is altered, to further test and confirm these mechanisms responsible for extreme precipitation formation.
Chapter 4: Improvement by Markov Chain Monte Carlo based Bayesian Model Averaging
4.1 Introduction
Nowadays, many numerical models still perform poorly in extreme precipitation simulation (????). Many studies have investigated how to improve the performance of extreme precipitation simulation by increasing the model resolution (????????), including complex crucial cloud-related processes (?), implementing a more comprehensive cumulus parameterization (?), developing more advanced dynamic models (?), or using machine learning algorithms to identify underlying physics mechanisms (?). However, given the regime dependence of parameters, no single scheme can represent the real climate, let alone extreme precipitation, universally (???).
Hence, more advanced ensemble simulation of extreme precipitation is urgently needed (?). Ensemble methods have proven to be an effective way to improve extreme precipitation simulation (?????). Furthermore, they can reduce the uncertainty in projection (????) and provide valuable information on the reliability of future projections of extreme events (?????). There are two main types of ensemble-combination methods. The most basic method involves taking the arithmetic mean of ensemble members (i.e., the "composite ensemble"). This method has the benefits of computational efficiency and stability, and is theoretically straightforward to explain. However, ? demonstrated that equal weights in the ensemble calculation cannot provide optimal ensemble mean outcomes for schemes with substantial differences in performance. In Chapters Two and Three, both the sensitivity experiments and the causal analysis demonstrated that individual schemes perform differently in terms of extreme precipitation simulations. Chapter Four thus applied non-equal weighting methods. Among non-equal weighting methods, Bayesian Model Averaging (BMA) is the gold standard for making out-of-sample predictions (?). ? proposed a well-adopted expectation-maximization (EM) algorithm-based BMA method, which has successfully improved the performance of precipitation simulation (?). The EM-based BMA method is both relatively easy to implement and computationally efficient. Furthermore, ? provided the open-source code, which greatly extended the accessibility of this method. Therefore, I implemented the EM-based BMA method in this extreme precipitation simulation study. The EM-based method also served as a baseline to compare with the other methods evaluated in this study.
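As a rough illustration of this baseline (a didactic sketch, not the exact formulation or code of the cited method), the BMA predictive density is a weighted mixture of member-conditional densities, and EM alternates between computing member responsibilities and updating the weights. A minimal version with Gaussian kernels, a shared variance, and made-up data:

```python
import math

def em_bma_weights(forecasts, obs, n_iter=200):
    """EM for BMA weights w_k and a shared variance s2, assuming
    p(y) = sum_k w_k * Normal(y; f_k, s2). Didactic sketch only."""
    K, n = len(forecasts), len(obs)
    w = [1.0 / K] * K
    s2 = 1.0
    for _ in range(n_iter):
        # E-step: responsibility of member k for observation i.
        z = []
        for i, y in enumerate(obs):
            dens = [w[k] * math.exp(-(y - forecasts[k][i]) ** 2 / (2 * s2))
                    / math.sqrt(2 * math.pi * s2) for k in range(K)]
            tot = sum(dens)
            z.append([d / tot for d in dens])
        # M-step: update weights and shared variance from responsibilities.
        w = [sum(z[i][k] for i in range(n)) / n for k in range(K)]
        s2 = sum(z[i][k] * (obs[i] - forecasts[k][i]) ** 2
                 for i in range(n) for k in range(K)) / n
    return w, s2

# Toy example: member 0 tracks the "observations" closely, member 1 does not.
obs = [10.0, 12.0, 9.0, 15.0, 11.0]
forecasts = [[9.8, 12.1, 9.2, 14.7, 11.3],
             [14.0, 8.0, 13.5, 10.0, 16.0]]
w, s2 = em_bma_weights(forecasts, obs)
print(w, s2)
```

The skillful member absorbs nearly all of the weight, which is the behavior that motivates non-equal weighting when ensemble members differ substantially in performance.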
These included the Akaike information criterion-based BMA method (AIC, Akaike 1998), which is computationally more efficient; the bootstrapping version of Akaike weights, which considers the uncertainty associated with the training data; and the stacking method. The first three of these are BMA methods, which are based on the hypothesis that the true data-generating model is among the potential candidate models. However, given the highly nonlinear complexity of extreme precipitation, the true model, the earth system, is outside the ensemble member list (?). Therefore, this study also tested the stacking algorithm proposed by ?, which resolves this theoretical flaw. ? implemented the EM-based BMA method using linear bias correction. Linear bias correction ignores model uncertainty, which can lead to an "over-confident" prediction, in contrast to the probabilistic type of bias correction, which provides more information on prediction uncertainty. That uncertainty information is highly valuable in climate studies and projections. Hence, this study also implemented Markov Chain Monte Carlo (MCMC) based probabilistic bias correction. The MCMC methods also enabled a more flexible model design (?), which allowed me to implement extreme value distributions as the prior distribution. This chapter first examined the performance of individual ensemble members from the regional Climate-Weather Research and Forecasting model (CWRF, ?), as well as results from the North American CORDEX project (NA-CORDEX, ?) (together called CPN in this chapter), using a newly proposed Optimal Rank distance, similarity, consistency and variability (DSCV) framework and the Method for Object-Based Diagnostic Evaluation (MODE) tool. Next, it proposes and tests different ensemble combination methods to improve the performance of extreme precipitation simulations. Section 2 describes the models in CPN, focusing on the physics configurations as well as the observational data used for evaluation.
Section 3 explains the extreme metrics, performance scores, the OptiRankDSCV framework, and MODE. Section 4 analyzes individual member performance in extreme precipitation simulation using the OptiRankDSCV framework and the MODE tool. Section 5 describes the three BMA methods. Section 6 demonstrates and compares the extreme precipitation outcomes from the BMA methods. Finally, section 7 summarizes the results.

4.2 Model description, observations

Model (nudging; resolution)              | Dynamics        | SF           | PBL        | MP         | CU                                 | Ref
CWRF (no nudging; 30 km)                 | Non-hydrostatic | CSSP (?)     | CAM (?)    | TAO (?)    | ECP (?), with modifications from ? | ?
WRF (nudging; 50 km, 25 km)              | Non-hydrostatic | NOAH (?)     | MYJ (?)    | WSM3 (?)   | ?                                  | ?
RegCM4 (no nudging; 50 km, 25 km)        | Hydrostatic     | BATS (?)     | ?          | SUBEX (?)  | ?                                  | ?
RCA (no nudging; 0.44°)                  | Semi-Lagrangian | RCALSS (?)   | ?          | Prognostic equation for total cloud water (?) | ? | ?
HIRHAM5 (no nudging; 0.44°)              | Semi-Lagrangian | ?            | ECHAM3 (?) | ?          | ?                                  | ?
CRCM5 (no nudging; 0.44°, 0.22°, 0.11°)  | Semi-Lagrangian | CLASS3.5 (?) | ?          | ?          | ?                                  | ?
CanRCM4 (nudging; 0.44°, 0.22°)          | Semi-Lagrangian | CLASS2.7 (?) | ?          | ?          | ?                                  | ?

Table 4.1: CPN experiment model configurations, parameterizations, and their references.

The performance of individual models is crucial to an ensemble simulation study. The well-developed regional Climate-Weather Research and Forecasting model (CWRF; ?) is the perfect platform for this study, because ?? demonstrated the extraordinary skill of CWRF in precipitation-related simulations over the continental United States as well as the coastal oceans. ? demonstrated that the superior performance of CWRF carries over to another climate domain. Furthermore, ? showed that CWRF simulates extreme precipitation exceedingly well. Given the highly reliable performance of CWRF, this study included the control run simulation results from ?.
CWRF's superior performance is due to its systematically developed multiple physics processes that cover the land surface, planetary boundary layer, cumulus, and microphysics as well as the aerosol-cloud-radiation system (?). Therefore, the physically advanced CWRF is a perfect match for the statistical ensemble methods adopted below. This study focused on 1989-2009, the 21-year period common to both the CWRF and NA-CORDEX experiments. Table 4.1 summarizes the major physics parameterization schemes, model configurations, and references for all CPN members. To achieve consistent results and to minimize the error introduced by interpolation (?), I followed the procedures in ? and adopted the conservative algorithm from the Earth System Modeling Framework regridding package to interpolate the NA-CORDEX outcomes onto the CWRF grid. This study employed quality-controlled observational data from the National Weather Service Cooperative Observer Network (COOP) (??). Given the topographic dependence, I preprocessed the precipitation data following the method described in ?, which used the slope adjustment algorithm of the Parameter-elevation Regressions on Independent Slopes Model (?). Finally, the station data were gridded onto the CWRF grid using the Cressman objective analysis method from ?.

4.3 Definitions of extreme precipitation, OptiRankDSCV, MODE tool

There is no single indicator that can comprehensively capture all aspects of extreme precipitation (?). Therefore, this study followed ? and applied six extreme precipitation related indicators. Table 4.2 summarizes the definitions, abbreviations, and units of these indicators. Similarly, there is no universally perfect skill score that can represent all differences in performance (?). Hence, this study proposed a systematic performance analysis framework, Optimal Rank DSCV, built on distance (rmse), similarity (correlation), consistency (linear error in probability space)
and variability.

Abbreviation | Definition | Units
RD  | Number of rainy days, with precipitation greater than 1 mm day⁻¹ | days
DRI | Daily precipitation intensity: total amount of precipitation divided by RD in the period | mm day⁻¹
R5D | Sum of the five maximum daily precipitation amounts | mm day⁻¹
R10 | Number of rainy days with precipitation greater than 10 mm day⁻¹ | days
P95 | 95th percentile precipitation for precipitation greater than 1 mm day⁻¹ | mm day⁻¹
CDD | Maximum number of consecutive dry days (precipitation smaller than 1 mm day⁻¹) | days

Table 4.2: Extreme indicators, their definitions and units.

The rmse and correlation are common indicators used in performance measurement, and provide information enabling validation by comparison with previous studies. However, both rmse and correlation suffer from a damping effect in which simulations with less variability yield better scores. The linear error in probability space (LEPS) indicator resolves this problem, so this study adopted the LEPS indicator following the definition of ?:

\mathrm{LEPS} = 3\left(1 - |\mathrm{CDF}_o(F_i) - \mathrm{CDF}_o(O_i)| + \mathrm{CDF}_o^2(F_i) - \mathrm{CDF}_o(F_i) + \mathrm{CDF}_o^2(O_i) - \mathrm{CDF}_o(O_i)\right) - 1 \qquad (4.1)

where CDF_o stands for the observed cumulative distribution function, O_i stands for the observed value, and F_i stands for the forecast value. The LEPS value ranges from 0 to 1, with 0 representing a perfect score. This score measures the distance between simulated and observed values in cumulative probability space. The updated LEPS algorithm prevents the score from "bending back", or overestimating scores near the extremes (?), which is preferable in this extreme study.

Figure 4.1: Six subregions in the continental United States.

The MVI is defined as in ?, and measures how the model simulates the variability of the observational field. This score provides valuable information in terms of how dispersed or spread out extreme precipitation cases are.
Instead of summarizing all variables, I calculated the MVI for each extreme precipitation indicator:

\mathrm{MVI} = \left(\sigma - \frac{1}{\sigma}\right)^2 \qquad (4.2)

where \sigma is the ratio of simulated to observed variance for a specific index in each region. The MVI ranges from 0 to 1, with 0 indicating a perfect score.

It is impossible to provide a single measurement containing all performance information. Therefore, this study adopted the methodology of multiple criteria decision making (MCDM): each model is a candidate, each score is one vote, and for a given measurement the voters (measurements) prefer the candidates (models) with higher scores. Then, since the same model might perform differently in different climate regimes (e.g., it might perform well near coastal areas but poorly in mountain areas), I further divided the continental United States into six subregions according to the Third National Climate Assessment (NCA; ?), as shown in Figure 4.1. Using these subregions makes the results fairer (it prevents the gerrymandered-districts effect seen in social science) and quickly comparable to other studies in the future. Each model receives 144 indicators ("votes": four measurements multiplied by six regions multiplied by six extreme precipitation indicators). Using more indicators produced better coverage of overall ensemble performance. Furthermore, the use of multiple scores rather than a single indicator can help to prevent overfitting in model development. The cross-entropy optimal rank aggregation algorithm was used to calculate the "optimal rank" from these indicators (?). The weight of each subregion was its ratio of area relative to the contiguous United States. This optimal rank took into account all significant features of extreme precipitation, the four principal skill measurement scores, and the performance differences among many climate regions. Hence OptiRankDSCV represented the most comprehensive performance measure for each member.
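The two less common scores above, LEPS (Eq. 4.1) and MVI (Eq. 4.2), can be sketched numerically in a few lines. This is a minimal illustration, not the chapter's implementation: the empirical observed CDF and the synthetic data are my own assumptions.

```python
import numpy as np

def leps(obs, fcst):
    """LEPS (Eq. 4.1), averaged over all paired values.
    CDF_o is the empirical cumulative distribution of the observations."""
    obs, fcst = np.asarray(obs, float), np.asarray(fcst, float)
    sorted_obs = np.sort(obs)
    cdf_o = lambda x: np.searchsorted(sorted_obs, x, side="right") / obs.size
    cf, co = cdf_o(fcst), cdf_o(obs)
    score = 3.0 * (1.0 - np.abs(cf - co) + cf**2 - cf + co**2 - co) - 1.0
    return score.mean()

def mvi(obs, fcst):
    """MVI (Eq. 4.2); sigma is the ratio of simulated to observed variance."""
    sigma = np.var(fcst) / np.var(obs)
    return (sigma - 1.0 / sigma) ** 2

obs = np.array([3.0, 8.0, 15.0, 30.0])    # hypothetical regional P95 values [mm/day]
fcst = np.array([4.0, 7.0, 18.0, 25.0])
print(leps(obs, fcst), mvi(obs, fcst))
```

Note that identical simulated and observed fields give an MVI of exactly zero, which is why the text treats MVI as a direct check on variability damping.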
OptiRankDSCV provided highly condensed information on the performance of each model. However, this is not enough to understand the spatial pattern distribution of the simulations. Hence, in addition to the optimal rank analysis, this study further investigated each model's spatial pattern performance for P95 using the Method for Object-Based Diagnostic Evaluation (MODE) tool, developed by the Research Applications Laboratory at the National Center for Atmospheric Research (NCAR), USA (?). MODE provides a meaningful spatial score that considers several feature factors (e.g., orientation, location, shape, and intensity percentile). This score mimics a human's understanding of "regions of interest" in an objective, nonjudgmental way. MODE first resolves the objects in the raw data using a convolution process, which replaces a field's value with that of the surrounding grid points within a preset distance. After the convolution, the MODE tool applies a filter to remove precipitation values below 1 mm hour⁻¹ (the AMS glossary definition of light rain). This filter helps to focus on only the impactful grid points in extreme precipitation simulation analysis. The MODE tool then calculates the related attributes for each object. When those attributes are ready, MODE conducts the "matching and merging" step. In the merging process, MODE tries to mimic a human meteorologist by clustering related grid points into a single physically meaningful object. In the matching process, MODE relates the most reasonable matching objects in the forecast fields to similar observations using a sophisticated fuzzy algorithm (?). Finally, "total interest" measures the relative performance of the models (?). The "total interest" considers all important spatial pattern distribution features (e.g., centroid distance, orientation, and area).
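The convolve-threshold-label sequence at the core of object resolution can be illustrated with a small sketch. This is only an analogy to MODE's first step: the radius, threshold, synthetic field, and attribute set below are hypothetical, and the real MODE tool computes many more attributes plus a fuzzy-logic total interest.

```python
import numpy as np
from scipy import ndimage

def resolve_objects(field, radius, threshold):
    """MODE-style object resolution: replace each value with the mean of
    its neighborhood (convolution), mask values below the threshold, and
    label the remaining contiguous regions as objects."""
    smoothed = ndimage.uniform_filter(field, size=2 * radius + 1)
    labels, n_objects = ndimage.label(smoothed >= threshold)
    return labels, n_objects

def object_attributes(field, labels, n_objects):
    """A few per-object attributes: area and intensity-weighted centroid."""
    return [{"area": int((labels == k).sum()),
             "centroid": ndimage.center_of_mass(field, labels, k)}
            for k in range(1, n_objects + 1)]

rng = np.random.default_rng(1)
field = rng.gamma(1.0, 0.3, size=(80, 80))   # light background "rain"
field[20:35, 20:35] += 5.0                   # one synthetic heavy-rain object
labels, n = resolve_objects(field, radius=3, threshold=1.0)
attrs = object_attributes(field, labels, n)
```

With these settings the noisy background is smoothed below the threshold, so only the synthetic heavy-rain block survives as a single labeled object whose centroid sits near the block center.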
4.4 Performance analysis of individual ensemble members

Figure 4.2 uses the OptiRankDSCV framework to compare the spatial pattern skill of the CPN winter extreme precipitation simulations. All statistics are based on the 21-year period 1989-2009. The scores were scaled to the 0-1 range by the value range of each score for each indicator in every region. In winter, WRF-22 ranked first and CWRF ranked second. Among the other models, higher-resolution simulations almost always performed better than their lower-resolution counterparts, except for CRCM5-UQAM. No single model outperformed all others in terms of all indicators in every region, but the optimal rank revealed a reasonable ordering of the skills of the different members. The best-performing member, WRF-22, had the best rmse for all regions except the Midwest. However, its improvement in rmse was accompanied by degradation in MVI, where it had the weakest performance in rainy days (RD) and consecutive dry days (CDD) over the Southeast and in R5D over the Southwest. A better rmse accompanied by a weaker MVI indicates that the improvement in WRF-22's rmse may have been inflated by the damping effect, which causes underestimation of the variability of the corresponding fields. The WRF-22 simulation of P95 over the Southeast scored the lowest in almost all measurements. WRF-22's poor performance in MVI is accompanied by its low score in RD. As observed by ?, RD strongly correlates with P95 performance in winter. Hence, WRF-22's weak performance in winter over the Southeast was partly due to its deficiency in RD simulation.
CWRF had a stable performance and relatively high scores in most regions, whereas it performed relatively weakly in the Northeast in terms of RD, DRI, and R10 variability. RegCM-44 ranked lowest in overall performance, but its RD simulations over the Midwest and Southwest were reasonably good.

Figure 4.2: Winter overall performance of 1989-2009 mean extreme indicators measured by multi-score in all six subregions. Color represents scores scaled by their range in each row. Horizontally, the locations of the models represent the optimal rank aggregated from all 144 indicators, with the left-most being the best performer. (Rank order: WRF-22, CWRF, CRCM5-UQAM-22, WRF-44, CRCM5-UQAM-11, CRCM5-UQAM-44, CanRCM4-22, RCA4-44, CanRCM4-44, HIRHAM5-44, RegCM4-22, RegCM4-44.)

Figure 4.3 compares the spatial pattern skill of the spring extreme precipitation simulations using the OptiRankDSCV framework (1980-2009). Generally, the ranking of the models in spring resembled that in winter. CWRF and WRF ranked highest, followed by the models from CRCM5. RCA4 and HIRHAM ranked in the middle, and RegCM4 had the weakest performance. Meanwhile, almost all models'
high-resolution simulations performed better than their low-resolution simulations, except for CRCM5, whose 44-km simulation was the best. CWRF performed best in terms of overall extreme simulations. In terms of MVI, CWRF performed the best over almost every subregion, which highlights its ability to simulate pattern variability in extreme precipitation. Its performance in LEPS was mixed over the Great Plains, where it earned the best score for R10 but relatively lower scores for RD and R5D. Similarly, in the Southwest, CWRF performed well in RD, R5D, and R10, but relatively weakly in DRI and P95. Interestingly, the second-best member, CRCM5-UQAM-44, showed the same performance pattern in Southwest LEPS; it also performed better in RD, R5D, and R10, but relatively weakly in DRI and P95. CWRF generally performed well in terms of pattern correlation, with the only outlier in the Southwest, where CWRF had some difficulty in RD simulation. Interestingly, CWRF scored poorly on this same indicator for both COR and RMSE, but received excellent MVI and acceptable LEPS scores.
Hence, CWRF captured the RD in the Southwest in terms of its distribution in probability space as well as its variance, but did not correctly represent the shape of the spatial distribution.

Figure 4.3: Same as Fig. 4.2 except for spring. (Rank order: CWRF, CRCM5-UQAM-44, WRF-22, WRF-44, CRCM5-UQAM-22, RCA4-44, CRCM5-UQAM-11, CanRCM4-22, CanRCM4-44, RegCM4-22, HIRHAM5-44, RegCM4-44.)

Figure 4.4 compares the spatial pattern skill of the summer extreme precipitation simulations using the OptiRankDSCV framework (1980-2009). CWRF was again the best model in this season. Overall, CWRF showed the greatest skill across all measurements and subregions. Its outstanding performance was consistent with a previous study (?), which explained the physical mechanisms behind its success. Its only relatively weak MVI scores were RD and CDD over the Midwest and DRI over the Southeast. Interestingly, RegCM-44 had relatively better scores for these three indicators, which was rare since it performed poorly overall. The simulations from CRCM5 showed consistent skill, as in the other seasons.
CRCM5-44 ranked second, outperforming its higher-resolution counterparts, as it also did in autumn. This indicates that for CRCM5, a higher-resolution simulation was not only computationally costly but also detrimental to simulation skill within its current model framework. One potential explanation is that the assumptions in the CRCM5 cumulus and stratiform precipitation schemes were not compatible with a higher-resolution simulation. WRF's summer performance was relatively weaker than its performance in other seasons. Interestingly, like CRCM5, WRF's low-resolution simulation outperformed its higher-resolution simulation. This occurred only in summer for WRF, and may be due to the fact that the dominant summer precipitation type is convective, and a higher-resolution simulation does not by itself yield a better representation of convective activity. Without a better physical understanding and representation of convective activity, use of a higher resolution

Figure 4.4: Same as Fig. 4.2 except for summer.
(Rank order: CWRF, CRCM5-UQAM-44, CRCM5-UQAM-22, RCA4-44, CRCM5-UQAM-11, WRF-44, RegCM4-22, CanRCM4-22, HIRHAM5-44, WRF-22, CanRCM4-44, RegCM4-44.)

might be not only computationally more costly but also inaccurate. Summer was the season in which RCA4-44 achieved its highest rank of all four seasons. Given the importance of convection-related precipitation in summer, the improvement in RCA4-44's summer performance might be due to its use of a modified Kain-Fritsch scheme relative to the other models; ? pointed out that the principal difference in the cumulus scheme used in RCA4-44 was that shallow convection was not precipitable. Figure 4.5 compares the spatial pattern skill of the autumn extreme precipitation simulations using the OptiRankDSCV framework (1980-2009). CWRF ranked highest of all members in overall model performance, and its MVI scores were generally the best. However, there were two outliers: R5D in the Northeast and CDD in the Midwest. CWRF's relatively weak MVI for CDD in the Midwest in both autumn and summer indicates that the underlying biases are connected, as ? showed that the physical causes in both summer and autumn over the Midwest are the same. The WRF-22 simulation ranked second behind CWRF. WRF-22 generally had better outcomes in terms of RMSE, whereas its MVI was not as good as CWRF's. WRF-22's relatively weaker performance in both LEPS and COR indicates that this improvement in rmse may have occurred due to the "damping effect" (underestimation of variability). The CRCM5-44 simulation ranked behind WRF-22. As mentioned before, CRCM5-44 outperformed its higher-resolution counterparts in both summer and autumn. CRCM5-44, WRF-22, and WRF-44 all had low scores in the pattern correlation of R5D over the Great Plains. Over the Great Plains in summer, R5D may have a more significant socioeconomic impact than any other indicator in the same region.
Thus, CWRF's stable performance exhibits its potential importance for future projections.

Figure 4.5: Same as Fig. 4.2 except for autumn. (Rank order: CWRF, WRF-22, CRCM5-UQAM-44, WRF-44, CRCM5-UQAM-11, CRCM5-UQAM-22, RCA4-44, CanRCM4-22, HIRHAM5-44, RegCM4-22, CanRCM4-44, RegCM4-44.)

For all CPN members in autumn, higher resolutions were linked to improved performance, except for CRCM5, whose low-resolution simulations always performed better. The above comparisons used the newly proposed framework to aggregate information into a concise ranking order, which can then be used to guide model development. The ranking can help to answer questions such as whether newly implemented schemes boost performance, or whether higher- or lower-resolution runs should be used. However, the ranking system cannot reveal the distribution of the fields of interest, which is central to gaining a better physical understanding. Thus, this study employed the MODE tool to investigate the spatial pattern distributions.
It would be impractical to analyze the spatial patterns of all fields (which is the reason for the development of a concise ranking system in the first place), so I used P95 as the example field in the following analysis.

Figure 4.6 compares the 21-year winter P95 distributions, including the total interest score (values range from 0 to 1, with 1 indicating the best performance). ERI has a small area for its identified object (for objects differing greatly in size, the contribution of the weight of the centroid separation to the denominator is small). Given that larger area coverage indicates a more substantial economic impact, this comparison used a convolution radius of 5. Objects for which the simulations and observations matched are colored grey, while objects that did not match are colored dark blue. MODE identified the three largest objects, located in the Northwest, Southwest, and Southeast coastal areas (the largest). The interest values for the largest object were quite high (0.88-0.979), indicating that all models were capable of capturing the essential extreme precipitation features in winter. However, the models captured somewhat different details.

Figure 4.6: Geographic distributions of 1980-2009 mean winter P95 amount [mm day⁻¹] (color) and results of MODE (grey) for observed (OBS), assimilated (ERI), and all CPN member simulations. A grey area represents the "interest" area identified by the MODE tool, with the total interest score shown in the lower-left corner. The black lines represent the convex hull identified by MODE, which was used to calculate shape features. (Total interest: ERI 0.880, ECP 0.970, CanRCM4-22 0.937, CanRCM4-44 0.934, CRCM5-11 0.944, CRCM5-22 0.938, CRCM5-44 0.965, HIRHAM5-44 0.900, RCA4-44 0.979, RegCM4-22 0.968, RegCM4-44 0.912, WRF-22 0.931, WRF-44 0.913.)
ERI had the lowest total interest value of all, due to its inability to produce sufficiently intense precipitation. Meanwhile, the center of its maximum shifted inland. ERI also was unable to produce sufficiently intense precipitation near the coastlines. CWRF ranked second (0.97), with improved intensity but relatively oversized coverage. CanRCM4-22 scored slightly higher (0.937) than its lower-resolution counterpart CanRCM4-44, because it captured a more realistic P95 distribution near the coastline, whereas CanRCM4-44 underestimated P95 near north Florida as well as in Louisiana and Texas. CRCM5 was the only model whose lower-resolution simulation had a higher total interest than its higher-resolution counterpart. CRCM5-11 significantly overestimated P95 over Louisiana, although the location of its maximum center was reasonable compared to observations. HIRHAM5-44 had the smallest total interest value of all the regional models; it not only overestimated the intensity but also misidentified the shape of P95. RCA4-44 ranked first by total interest score (0.979), but understated the precipitation intensity. RegCM4-22 performed better than RegCM4-44, but both overestimated P95 near the eastern coastline. WRF-22 and WRF-44 had problems similar to those of the RegCM simulations.

Figure 4.7: Same as Fig. 4.6 except for spring. (Total interest: ERI unidentified, ECP 0.891, CanRCM4-22 0.877, CanRCM4-44 0.872, CRCM5-11 0.892, CRCM5-22 0.898, CRCM5-44 0.913, HIRHAM5-44 0.868, RCA4-44 0.829, RegCM4-22 0.929, RegCM4-44 0.923, WRF-22 0.909, WRF-44 0.925.)

Figure 4.7 compares the 21-year spring P95 distributions. The interest values in this season range from 0.868 to 0.929. ERI has a small area for its identified object; due to ERI's significant underestimation of P95 in this season, MODE was unable to identify a region of interest. CWRF had a reasonable score of 0.891.
That its score was not the highest was due to an oversized heavy-precipitation region. The majority of models in this season shared the common problem of overestimating the extent of heavy precipitation in the far north of the U.S. CanRCM4-22 performed slightly better than CanRCM4-44, but both shifted the maximum center northward and overestimated its intensity [35 mm day⁻¹]. As in winter, CRCM5's low-resolution simulation performed better than its high-resolution simulation, which overestimated the intensity as well as the coverage. HIRHAM5-44 still performed relatively poorly compared to the other models, with a total interest of 0.868, and it again overestimated both the intensity (mostly more than 35 mm day⁻¹) and the coverage. RCA4-44 had the smallest total interest in spring, with a systematic underestimation of P95. RegCM4-22 scored a higher total interest (0.929) than its low-resolution counterpart RegCM4-44, but the difference was not significant. WRF-44 performed better than WRF-22 in terms of total interest; the improvement was primarily due to the weaker intensity of the coarse-resolution simulation.

Figure 4.8: Same as Fig. 4.6 except for summer. (Total interest: ERI unidentified, ECP 0.921, CanRCM4-22 0.771, CanRCM4-44 0.553, CRCM5-11 0.831, CRCM5-22 0.828, CRCM5-44 0.805, HIRHAM5-44 0.798, RCA4-44 0.000, RegCM4-22 0.846, RegCM4-44 0.843, WRF-22 0.818, WRF-44 0.828.)

Figure 4.8 compares the 21-year summer P95 distributions. The interest values in this season range from 0.553 to 0.921. The relatively lower total interest shows the difficulty of simulating summer extreme precipitation. Due to ERI's significant underestimation of P95, MODE could not identify an "interest region" for ERI in summer. CWRF had the highest total interest score [0.921], which shows its outstanding performance in summer extreme precipitation simulation. It potentially overestimated P95 near the Great Lakes, which might be due to its lake model.
CanRCM4-22 scored much higher than CanRCM4-44, which barely produced any P95 greater than 26 mm day⁻¹. Both had their maximum centers shifted to the northeast compared to observations. In summer, the CRCM5 high-resolution simulation had a higher total interest than its low-resolution counterpart. However, all CRCM5 members produced too much rain near the coastlines in summer, when observations showed no intense coastal precipitation. As in other seasons, HIRHAM5-44 produced too much rain to the north, and its coverage of rain greater than 26 mm day⁻¹ was much larger than that of CWRF. RCA4-44 again underestimated P95, producing values less than 26 mm day⁻¹ over the majority of the land. As a result, MODE could not find an object in RCA4-44 matching the "interest region" over the Central Plains in the observations. RegCM again overestimated P95 and produced too much precipitation near the east and south coastlines. WRF-44 had a higher total interest [0.828] than the higher-resolution WRF simulation [0.818], but the improvement was mostly due to a shrinking overestimation bias. Neither WRF simulation could realistically capture the shape and center of P95, and both overestimated P95 near coastal regions.

Figure 4.9 compares the 21-year autumn P95 distributions. The total interest scores range from 0.846 to 0.980. Due to underestimation, ERI again could not be assigned an interest region. Again, CWRF earned the highest score (0.980). Compared to the other models, CWRF was able to simulate both size and location in a reasonable match to observations.

Figure 4.9: Same as Fig. 4.6 except for autumn. (Total interest: ERI unidentified, ECP 0.980, CanRCM4-22 0.872, CanRCM4-44 unidentified, CRCM5-11 0.888, CRCM5-22 0.880, CRCM5-44 0.846, HIRHAM5-44 0.862, RCA4-44 0.869, RegCM4-22 0.911, RegCM4-44 0.911, WRF-22 0.899, WRF-44 0.866.)

For the low-resolution CanRCM4, MODE could not identify an interest object, while its higher-resolution counterpart produced insufficient precipitation (0.872) and shifted the location inland. The highest-resolution simulation from CRCM5 scored highest (0.888) and produced stronger precipitation, whereas the low-resolution simulation produced less rain and scored lower (0.846); the moderate resolution produced a score in between (0.880). HIRHAM5 rained too heavily over a vast area in the middle of the U.S.; its distribution was incorrectly shaped and its center shifted to the north, which led to a relatively low score (0.862). The maximum center of the RCA4-44 simulation shifted to the east coastline, and it did not produce intense enough precipitation near the south coast. Both simulations from RegCM4 had the same total interest (0.911): they shared the common problem of producing too much rain near the coastal areas but not enough rain inland. In contrast, the simulations from WRF underestimated precipitation near the coastal areas, with stronger inland precipitation. The high-resolution simulation from WRF produced more precipitation in a reasonable location compared to the results from WRF-44 (0.866), which was too weak in general.

4.5 Bayesian model average methods

BMA is an ensemble method that weights each member by the marginal posterior probability that M_k is the data-generating model. Following ?, the forecast PDF p(y) can be written as:

p(y) = \sum_{k=1}^{K} p(y \mid M_k)\, p(M_k \mid y^T) \qquad (4.3)

Here M_k represents the k-th ensemble member, y represents the extreme seasonal precipitation indicator at each grid point (e.g., P95), and y^T represents the training data. This study used the period from 1989 to 2003 as the training data y^T, and 2004 to 2009 as cross-validation data. Under the BMA assumption, p(M_k \mid y^T) represents the posterior probability that M_k is the data-generating model. ? calculated p(M_k \mid y^T) using an expectation-maximization (EM) algorithm.
p(y | M_k) represents the probability of the forecast outcome from M_k. ? approximated p(y | M_k) by a normal distribution:

y | f_k \sim N(a_k + b_k f_k, \sigma^2)    (4.4)

The probability forecast model for y | f_k is:

\alpha \sim \mathrm{Normal}(0, 1)
\beta \sim \mathrm{Normal}(1, 1)
\mu^\ast = \alpha + \beta f_k
\mu = \begin{cases} \mu^\ast, & \mu^\ast > 0 \\ 0, & \mu^\ast \le 0 \end{cases}
y | f_k \sim N(\mu, \sigma_{\mathrm{clim}})    (4.5)

The above equations form a Tobit model (censored regression model) describing the relationship between simulated and observed extreme precipitation. This probability model was fitted using the MCMC method (?) with the No-U-Turn Sampler (NUTS) algorithm (?).

p(M_k | y^T) is the weighting in the Bayesian model average; previous studies calculated it using maximum likelihood (??). Here I examined three new variations: Akaike weighting, Akaike weighting with bootstrapping, and stacking. Akaike weighting employs the information from the previous probability model to calculate the Akaike Information Criterion (AIC). The AIC was calculated with the LOO (Pareto-smoothed importance sampling Leave-One-Out cross-validation) algorithm (?). With the calculated AIC, the weight was calculated as (?)

w_i = \frac{e^{-\mathrm{AIC}_i/2}}{\sum_{k=1}^{K} e^{-\mathrm{AIC}_k/2}}    (4.6)

The above calculation relies on the accuracy of the AIC, and since the AIC itself can have uncertainty, I also compared the bootstrapped AIC outcome (?). The above calculations all share the strong assumption that the data generating model (DGM) is among the ensemble members. However, this assumption is theoretically unrealistic: given the complexity of the earth system, no numerical model can be the true reality, only an approximation. Hence, the third option, the "stacking" algorithm, is more theoretically favorable for extreme precipitation ensemble analysis. The stacking algorithm uses the predicted result f_k from the previous Bayesian model and calculates the weights by maximizing the logarithmic score over the training period (?):

\max_{w} \frac{1}{n} \sum_{i=1}^{n} \log \sum_{k=1}^{K} w_k \, p(y_i | y_{-i}, M_k), \quad \text{s.t. } w_k \ge 0, \ \sum_{k=1}^{K} w_k = 1    (4.7)

4.6 Ensemble performance analysis

Figure 4.10 compares observed and simulated mean seasonal precipitation distributions for the six cross-validation years (2004-2009). Given its previously demonstrated outstanding performance, CWRF represents the best performance a single model can achieve in extreme precipitation simulation. In all seasons, CWRF produced a much more realistic outcome than the driving ERI data, whose P95 was mostly weaker than observations. The mean of the CPN members produced a smooth P95 distribution that lost the detailed pattern features and was also generally weaker than both the CWRF outcome and observations. On the other hand, the BMA methods further improved on the CPN members, and the difference among the BMA methods was not significant.

Figure 4.10: Geographic distributions of the 2004-2009 (cross-validation) mean seasonal P95 amount [mm day−1]: observed (OBS), assimilated (ERI), simulated by CWRF, the composite ensemble mean (EnsMean), and post-processed by the bootstrapping Akaike (B-Akaike), Akaike, and stacking methods, for winter (DJF), spring (MAM), summer (JJA), and autumn (SON).

In winter, observed P95 was greater than 30 [mm day−1] over the Gulf States. ERI could not produce sufficiently intense precipitation, and its precipitation area was much too small. CWRF, representing the best outcome among the CPN members, produced sufficiently strong P95 and better coverage. The ensemble mean generated broader P95 coverage with an intensity of more than 28 [mm day−1], but was not strong enough to reach the observed value of 30 [mm day−1].
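As a concrete illustration of the two weighting schemes of Section 4.5, the sketch below computes Akaike weights (Eq. 4.6) and stacking weights (Eq. 4.7) with generic numpy/scipy code; the function names and toy inputs are illustrative, not the actual analysis code:

```python
import numpy as np
from scipy.optimize import minimize

def akaike_weights(aic):
    """Akaike weights (Eq. 4.6): w_i proportional to exp(-AIC_i / 2)."""
    aic = np.asarray(aic, dtype=float)
    rel = np.exp(-0.5 * (aic - aic.min()))     # shift for numerical stability
    return rel / rel.sum()

def stacking_weights(lpd):
    """Stacking weights (Eq. 4.7).

    lpd: (n, K) array of leave-one-out predictive densities
    p(y_i | y_{-i}, M_k). Maximizes the mean log score over the
    simplex {w_k >= 0, sum_k w_k = 1}.
    """
    n, K = lpd.shape
    def neg_log_score(w):
        return -np.mean(np.log(lpd @ w + 1e-300))
    cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)
    bounds = [(0.0, 1.0)] * K
    w0 = np.full(K, 1.0 / K)
    res = minimize(neg_log_score, w0, bounds=bounds, constraints=cons,
                   method='SLSQP')
    return res.x
```

Both routines return a weight vector on the simplex; the stacking version optimizes out-of-sample fit directly, which is why it remains well defined when no member is the true data generating model.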
All three BMA methods produced similar P95 distribution patterns, with Akaike and stacking producing relatively stronger P95 (greater than 30 [mm day−1]); B-Akaike was relatively weaker than the other two. The BMA methods also produced a better distribution pattern than the other methods: they preserved the details of the distribution and reduced CWRF's overestimation over the Northeast coast without compromising the intensity over the Gulf States. In spring, ERI missed most P95 greater than 30 [mm day−1] over the Gulf States, whereas CWRF produced strong enough P95 but with the maximum center shifted inland. Again, the ensemble mean lost valuable details of the pattern distribution, and it did not improve the overestimation problem near the east coast. The difference between BMA methods was insignificant. All BMA methods reduced the overestimation problem over the Northeast and produced a more precise spatial distribution pattern with details. However, no BMA method could produce sufficiently strong P95 over Texas (all produced less than 30 [mm day−1]), and they also underestimated the P95 over the coast of Alabama. In summer, P95 from ERI was everywhere less than 20 [mm day−1]; ERI also failed to capture the shape and location of the maximum P95. CWRF improved substantially, producing sufficiently strong P95 (greater than 35 [mm day−1]) and better capturing the location of the maximum center. The ensemble mean captured the maximum location, but its P95 was not intense enough, and the problematic overestimation near the Northeast coast remained. The three BMA results were quite similar, with the Akaike method slightly stronger than the other two. The BMA methods produced a much more realistic pattern distribution, not only reducing the problematic overestimation on the Northeast coast but also reducing CWRF's overestimation over the Northwest central region (North and South Dakota) from greater than 35 [mm day−1] to the observed 20 [mm day−1].
In autumn, ERI simulated P95 better than in other seasons. ERI captured the location of maximum P95 over the Gulf States, but still underestimated the intensity and did not produce sufficiently strong P95 near the south coast. CWRF produced stronger P95, but overestimated its area, with more P95 greater than 15 [mm day−1]. The CWRF-simulated central maximum did not pass 30 [mm day−1], while observed P95 was higher than 45 [mm day−1] over Louisiana. As in other seasons, the ensemble mean was less intense than the observed results, and much detailed information was lost. The outcome from the BMA methods was consistent with that of other seasons: BMA results were more precise with regard to pattern distribution, and were also much stronger near the Gulf States coast (greater than 35 [mm day−1]). The BMA methods also improved P95 over Georgia, where both the ensemble mean and CWRF failed to produce sufficiently intense precipitation. In all seasons, the ensemble mean lost detailed distribution information over the Rocky Mountains, while the BMA methods appeared able to capture those details. Figure 4.11 compares the winter observed upper (75%) and lower (25%) confidence intervals (CI), obtained by bootstrapping, with the BMA-estimated CI for P95 during the six cross-validation years (2004-2009). For observations, the CI was calculated by bootstrapping the seasonal mean P95 1000 times during the six cross-validation years (6000 samples for each grid). The bootstrapping CI was calculated by bootstrapping all 12 members together 1000 times during the six cross-validation years (72000 samples for each grid). For the BMA methods, the CI was calculated from the highest probability density (HPD) interval of all posterior samples during the training years (144000 samples for each grid). The bootstrapping CI from all CPN members was relatively small: for most areas, the difference between members was smaller than 1 [mm day−1].
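The percentile-bootstrap CI construction just described (resample, recompute the mean, take percentiles) can be sketched as follows; the function name and sample values are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(samples, n_boot=1000, lower=25.0, upper=75.0):
    """Percentile bootstrap CI for the mean of seasonal P95 samples.

    samples: 1-D array of P95 values at one grid point (e.g. the six
    cross-validation years, or years x members pooled together).
    """
    samples = np.asarray(samples, dtype=float)
    boot_means = np.empty(n_boot)
    for b in range(n_boot):
        # Resample with replacement and recompute the statistic
        resample = rng.choice(samples, size=samples.size, replace=True)
        boot_means[b] = resample.mean()
    return np.percentile(boot_means, [lower, upper])

# e.g. six cross-validation years at one grid point (mm/day)
lo, hi = bootstrap_ci(np.array([24.0, 27.5, 25.1, 30.2, 26.8, 28.4]))
```

Because the resamples are drawn only from the cross-validation data itself, this interval reflects sampling variability but carries no information about structural model error, which is the under-dispersion discussed next.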
This under-dispersion arose because bootstrapping did not learn the uncertainty information from the training data as BMA did, but only estimated uncertainty from the cross-validation data itself. Hence, it was overconfident and underestimated uncertainty. Meanwhile, the CI estimated by bootstrapping shows that CPN members produced too much rain over the Northeast coast and insufficiently strong precipitation near the coast of the Gulf States. The CI from the three BMA methods were very similar (the Akaike method again produced slightly stronger precipitation), and the upper and lower CI were more dispersive than the simple composite ensemble results obtained by bootstrapping. The relatively large CI provided more conservative judgments by learning from the training data. Furthermore, observed CI were mostly contained within the CI from the BMA methods; specifically, for the Northeast coast, there was no unrealistic overestimation in the low-end or high-end CI from the BMA methods.

Figure 4.11: Geographic distributions of the 2004-2009 (cross-validation) winter P95 confidence interval (CI) amount [mm day−1], estimated by bootstrapping observational data (OBS), by bootstrapping the ensembles (Boot), and by applying the bootstrapping Akaike (B-Akaike), Akaike, and stacking methods to the ensembles. Left: lower (25%) CI; right: upper (75%) CI.

Figure 4.12 compares the spring observed upper (75%) and lower (25%) CI with the bootstrapping and BMA-estimated CI for P95 during the six cross-validation years (2004-2009). The results from bootstrapping were again too optimistic, since they did not account for model uncertainty.
Meanwhile, the overestimation over the Northeast coast was very significant in the boot method, which makes sense because the bootstrapping method cannot reduce bias in simulations. Furthermore, like the ensemble mean, bootstrapping the ensemble members lost the details of the pattern distribution: the results over mountainous areas were too smooth, and, more importantly, the scattered massive precipitation centers over the Gulf States became a continuous precipitation field. The BMA method results largely reduced the overestimation near the Northeast coast. Meanwhile, they intensified P95 near the Gulf States coast, where bootstrapping underestimated. They also produced more detailed information on pattern distribution (e.g., over the Rocky Mountains and in the Great Plains). Figure 4.13 compares the summer observed upper (75%) and lower (25%) CI with the bootstrapping and BMA-estimated CI for P95 during the six cross-validation years (2004-2009). The CI from the boot method was less than 2 [mm day−1] for most areas, which was even smaller than the observed natural variability alone. The same problems of overestimated P95 near the Northeast coast and lost details of distribution patterns persisted in summer. The BMA methods largely improved performance over the Northeast coast. The difference between the BMA methods remained relatively small. The BMA methods estimated a relatively high upper level of P95 compared to observations over Texas. This estimation was quite reasonable, given that Texas had the highest number of tornadoes (155) per state during 1981-2010 (NOAA, 2019).

Figure 4.12: Same as Fig. 4.11 except for spring.

Figure 4.13: Same as Fig. 4.11 except for summer.
Hence, the BMA methods considered not only the uncertainty from model error but also the uncertainty from natural variability. In this study, the observational CI from the bootstrapping method underestimated natural variability due to the limited number of available test years. Figure 4.14 compares the autumn observed upper (75%) and lower (25%) CI with the bootstrapping and BMA-estimated CI for P95 during the six cross-validation years (2004-2009). The bootstrapping results exhibited the same problems as in other seasons, except over the Northeast, where the high observed P95 reduced the overestimation issue. The BMA methods correctly produced higher CI over the Northeast coast as well, which indicated that their reduction of P95 over the Northeast coast in other seasons was not merely a shift in magnitude. More interestingly, the BMA methods produced substantial, strong P95 near eastern Texas that did not appear in the relatively short observational record, which indicates, as in summer, that the BMA methods learned from the training data.

Figure 4.14: Same as Fig. 4.11 except for autumn.

Figure 4.15 compares the unbiased ignorance score (IGN) (Siegert, 2014) of the three BMA methods against that of the composite ensemble during the six cross-validation years (2004-2009). IGN has many advantages: in particular, it does not assume an underlying shape for the probability density function (PDF), which makes it a very desirable indicator given the non-normality of extreme precipitation. Furthermore, for continuous simulation, ? found that IGN was the only proper score that is both local and smooth. In this study, the IGN was calculated following the unbiased IGN definition from ?, which is suitable since the BMA methods had larger sample spaces than the raw data. IGN rewards both accuracy and sharpness; a lower IGN means better performance.

Figure 4.15: Geographic distributions of the 2004-2009 (cross-validation) IGN of the BMA methods minus the IGN of the raw composite ensembles: bootstrapping Akaike method (B-Akaike), Akaike method (Akaike), and stacking method (stacking).

In all seasons, over most areas, the BMA-based members had smaller unbiased IGN than the raw ensemble data, which indicates a consistent overall improvement from BMA processing. The magnitude of the improvement in IGN changed with season, with the maximum in summer and the minimum in winter. The biggest improvements occurred in the summer Gulf States, the area with the highest risk of extreme precipitation (e.g., Texas). The regions with the highest skill moved away from land in autumn, while over the Rocky Mountains, improvements became significant in autumn. In winter, the changes were more scattered, and performance in some mountain areas even deteriorated. This might be because P95 was too weak in the winter mountain areas, so the signal-to-noise ratio in the training data over these areas was too low. In spring, there was a slight improvement over the Midwest region.

4.7 Summary and conclusions

This study was a continuation of a larger effort to improve extreme precipitation simulation over the U.S., as demonstrated in my previous studies (??). One central scientific question I aim to address is whether the ensemble method can effectively improve extreme precipitation simulation, and if so, what algorithm should be applied to improve performance. To answer this question, this study defined three consecutive goals.

The first goal is to identify the best performing model in extreme precipitation.
To achieve this goal, this study proposed a new MCDM-based OptiRankDSCV framework with multiple extreme indicators and performance scores. The OptiRankDSCV framework provided a consistent, fair platform that demonstrates the overall performance of each model and can also act as a testbed for new model development. This study used the OptiRankDSCV framework to analyze the performance of the CPN members over the contiguous United States for 1989-2009. The results showed that CWRF outperformed all other CPN members on multiple skill scores across multiple extreme precipitation metrics over the whole United States. Overall, this systematic analysis demonstrated that many models could improve their performance in extreme precipitation simulation by using a higher resolution. However, there are still outliers like CRCM5, whose low-resolution simulation always outperformed its high-resolution simulation; in summer, this reversal is even more prominent. On the other hand, the results showed that resolution was not the most critical factor in determining model performance: many low-resolution simulations outperformed high-resolution simulations in all four seasons. Both of these phenomena highlight the importance of physics understanding to model development. A higher resolution may or may not be useful; in particular, more studies should focus on improving model performance through improved physics understanding. This study then used the MODE tool to demonstrate differences in the models' ability to capture the P95 pattern distribution. MODE generally produced results consistent with the OptiRankDSCV framework. Furthermore, the MODE diagnostic tool provided a human-mimicking analysis, which is more physically meaningful than a single scalar score. CWRF showed outstanding performance by MODE measurements; particularly in summer, it earned significantly higher total interest values (0.921) than any other member.
This result confirmed previous studies and further provided a solid, objective score to evaluate P95 distribution performance. The second goal was to provide an ensemble method that is either more theoretically reasonable or more computationally efficient for climate studies. After comparing different BMA methods and considering the particular requirements of climate modeling, I proposed MCMC-based BMA methods for the ensemble analysis of CPN members. The stacking method is the most theoretically sound choice for an M-open situation such as the earth system; the AIC-based method is more computationally efficient; and the B-AIC method includes posterior uncertainty. Since there is no significant difference in their performance, the computationally more efficient algorithm is the better choice. The third goal was to test the newly proposed methods against both the ensemble mean and the best-performing simulation from CWRF. Results showed that the BMA methods significantly improved the simulation's ability to capture the general spatial pattern. Compared with the mean of the ensemble members, the BMA methods reduced overestimation of P95 greater than 25 [mm day−1] in both spring and winter, and corrected the dislocation of P95 in summer and autumn. The BMA methods further significantly reduced P95 overestimation in the Northeast coastal area of the U.S. in both winter and spring. They not only reduced overestimation but also enhanced the simulated intensity along the Gulf States coast in autumn. One more benefit of the BMA methods was that they did not compromise the accuracy of the simulation: the high-resolution details were retained in the BMA output. Compared to the optimal model output from CWRF, the BMA methods reduced overestimation in both winter and spring.
In summer, even though the intensity from the BMA methods was weaker than observations, BMA reduced the overestimation over the Midwest area (where CWRF simulated P95 greater than 25 [mm day−1]) and produced a more reasonable shape of the distribution. In autumn, the BMA methods again provided a more precise distribution location, whose P95 maximum center (greater than 35 [mm day−1]) was more similar to the observational data. Overall, the newly proposed BMA methods significantly improved the performance of extreme precipitation simulation, not only compared to the ensemble mean but also compared to the best performing single model. Meanwhile, this study demonstrated that there was no significant difference between the three methods. This result has substantial practical value: since the AIC method is the most computationally efficient but performs similarly to the other two theoretically more sound BMA methods (stacking and B-AIC), it is safe to adopt the AIC method without worrying about potential theoretical deficiencies. Probabilistic BMA methods can not only improve the performance of the simulated distribution but also provide valuable information on model uncertainty. Hence, this study further compared the CI from the BMA methods with the bootstrapping outcomes from the raw ensemble. In all seasons, the high CI from the BMA methods showed more reasonable coverage than simple bootstrapping of the raw ensemble members. Compared to bootstrapping ensemble members, there was less area where bootstrapped observations exceeded the upper CI values estimated by the BMA methods. On the other hand, bootstrapping ensemble members severely underestimated the CI range, which means that without BMA the simple bootstrapping method can be under-dispersive and overlook the actual uncertainty in the model simulation. Hence, the uncertainty estimates from the BMA methods were more conservative and reliable.
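As context for the IGN-based evaluation that follows, a minimal sample-based estimate of the ignorance score can be sketched; this uses a plain Gaussian approximation rather than the unbiased estimator of Siegert (2014), and the function name is hypothetical:

```python
import numpy as np

def ignorance_score(obs, ensemble):
    """Naive ignorance score: -log2 of a Gaussian forecast density
    fitted to the ensemble (mean and variance). Siegert's unbiased
    variant adds finite-ensemble corrections not shown here.
    """
    ens = np.asarray(ensemble, dtype=float)
    m, v = ens.mean(), ens.var(ddof=1)
    logpdf = -0.5 * np.log(2 * np.pi * v) - (obs - m) ** 2 / (2 * v)
    return -logpdf / np.log(2)                 # convert nats to bits

# A sharper, well-centered forecast earns the lower (better) score
sharp = ignorance_score(25.0, [24.0, 25.0, 26.0])
broad = ignorance_score(25.0, [15.0, 25.0, 35.0])
```

The score rewards both accuracy (small miss) and sharpness (small variance), which is why it is used here to compare the probabilistic BMA forecasts against the raw composite ensemble.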
This study also adopted the IGN score to further measure the performance of probabilistic extreme precipitation forecasts. This study considered the effect of sample size in this measurement; hence, I adopted the unbiased IGN score (?). The results show that, overall, the newly proposed methods successfully improved probability prediction compared to the composite ensemble results. In particular, over the summer Gulf States, the BMA methods added significantly more value than the composite method. This improvement was particularly useful in Texas, which suffered the most tornadoes of all U.S. states. As with the ensemble mean output, there was no significant difference in model uncertainty estimation between the different BMA methods. This suggests that it is safe to adopt the computationally efficient AIC method, accompanied by the MCMC algorithm, to perform BMA. Finally, the above analyses show that the BMA method using the MCMC algorithm not only performed better than the arithmetic ensemble mean but also provided more reasonable uncertainty estimates. Furthermore, the AIC weights were practically equivalent to the more theoretically sound stacking method and to the bootstrapping version of AIC. In subsequent papers, I will implement this BMA system for future projections, specifically extreme precipitation projections. I hope that this newly proposed method will help reduce errors in extreme precipitation simulation and provide more reasonable and reliable uncertainty estimates for extreme precipitation projections.

Chapter 5: Future work: projections of future extreme precipitation changes and impacts

The ultimate purpose of this study is to provide a better estimation of how extreme precipitation will change in the future. I have already conducted a preliminary analysis of future extreme precipitation changes based on the NA-CORDEX data.
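The return-level comparison used in this preliminary analysis rests on extreme value theory; such a calculation can be sketched with a generalized extreme value (GEV) fit to annual maxima (synthetic data and a hypothetical function name, not the NA-CORDEX pipeline):

```python
import numpy as np
from scipy.stats import genextreme

def return_level(annual_maxima, return_period_years):
    """Fit a GEV to block (annual) maxima and return the level exceeded
    on average once every `return_period_years` years."""
    shape, loc, scale = genextreme.fit(annual_maxima)
    return genextreme.ppf(1.0 - 1.0 / return_period_years,
                          shape, loc=loc, scale=scale)

# Synthetic annual-maximum precipitation series (mm/day), 56 "years"
maxima = genextreme.rvs(-0.1, loc=40.0, scale=8.0, size=56,
                        random_state=1)
rl20 = return_level(maxima, 20)
rl100 = return_level(maxima, 100)
```

Comparing, say, `rl20` between the historical and RCP scenario periods at each grid point gives the kind of return-level change maps shown in Figure 5.1.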
This study calculated the change in extreme precipitation as the RCP8.5 (2006-2100) minus the historical (1950-2005) simulated extreme precipitation. This study adopted extreme value theory and compared the changes in extreme values for different return levels. As demonstrated in Figure 5.1, extreme values systematically increased all over the U.S., with summer Texas and the west coast getting much stronger precipitation. Meanwhile, the preliminary results on changes in rainy days (Figure 5.2) show that there will be fewer rainy days in the Gulf States and Arizona. Given that the actual number of rainy days in Arizona is less than ten, this clearly shows that rain will become less frequent over these areas. Meanwhile, as Figure 5.2 shows, the future projections might also have an under-dispersion problem with a simple bootstrapping method.

Figure 5.1: Geographic distributions of projected changes in extreme precipitation for different return levels.

Figure 5.2: Geographic distributions of projected changes in the number of rainy days, with confidence interval estimated by the bootstrapping method.

We hope to apply the newly proposed BMA methods to the future projections, combined with the outcome from CWRF, to provide a more reliable outcome and, ultimately, to help make it a better world.

Bibliography

Akaike, H. (1974), A new look at the statistical model identification, in Selected Papers of Hirotugu Akaike, pp. 215-222, Springer.

Akaike, H. (1998), Information theory and an extension of the maximum likelihood principle, in Selected Papers of Hirotugu Akaike, pp. 199-213, Springer.

Alexander, L., X. Zhang, T.
Peterson, J. Caesar, B. Gleason, A. Klein Tank, M. Haylock, D. Collins, B. Trewin, and F. Rahimzadeh (2006), Global observed changes in daily climate extremes of temperature and precipitation, Journal of Geophysical Research: Atmospheres, 111(D5).

Allan, R. P., and B. J. Soden (2008), Atmospheric warming and the amplification of precipitation extremes, Science, 321(5895), 1481-1484.

Allen, M., and P. Stott (2003), Estimating signal amplitudes in optimal fingerprinting, Part I: Theory, Climate Dynamics, 21(5-6), 477-491.

Allen, M. R., and W. J. Ingram (2002), Constraints on future changes in climate and the hydrologic cycle, Nature, 419(6903), 224-232.

Anderson, B. T., D. J. Gianotti, and G. D. Salvucci (2015), Detectability of historical trends in station-based precipitation characteristics over the continental United States, Journal of Geophysical Research: Atmospheres, 120(10), 4842-4859.

Asadieh, B., and N. Krakauer (2015), Global trends in extreme precipitation: climate models versus observations, Hydrology and Earth System Sciences, 19(2), 877-891.

Ashouri, H., S. Sorooshian, K.-L. Hsu, M. G. Bosilovich, J. Lee, M. F. Wehner, and A. Collow (2016), Evaluation of NASA's MERRA precipitation product in reproducing the observed trend and distribution of extreme precipitation events in the United States, Journal of Hydrometeorology, 17(2), 693-711.

Auld, H., and D. Maclver (2006), Changing weather patterns, uncertainty and infrastructure risks: emerging adaptation requirements, in 2006 IEEE EIC Climate Change Conference, pp. 1-10, IEEE.

Barnston, A. G., M. K. Tippett, M. L. L'Heureux, S. Li, and D. G. DeWitt (2012), Skill of real-time seasonal ENSO model predictions during 2002-11: is our capability increasing?, Bulletin of the American Meteorological Society, 93(5), 631-651.

Bechtold, P., J.-P. Chaboureau, A. Beljaars, A. Betts, M. Köhler, M. Miller, and J.-L.
Redelsperger (2004), The simulation of the diurnal cycle of convective precipitation over land in a global model, Quarterly Journal of the Royal Meteorological Society, 130(604), 3119-3137.

Bechtold, P., M. Köhler, T. Jung, F. Doblas-Reyes, M. Leutbecher, M. J. Rodwell, F. Vitart, and G. Balsamo (2008), Advances in simulating atmospheric variability with the ECMWF model: From synoptic to decadal time-scales, Quarterly Journal of the Royal Meteorological Society, 134(634), 1337-1351.

Bechtold, P., N. Semane, P. Lopez, J.-P. Chaboureau, A. Beljaars, and N. Bormann (2014), Representing equilibrium and nonequilibrium convection in large-scale models, Journal of the Atmospheric Sciences, 71(2), 734-753.

Belmecheri, S., F. Babst, A. R. Hudson, J. Betancourt, and V. Trouet (2017), Northern Hemisphere jet stream position indices as diagnostic tools for climate and ecosystem dynamics, Earth Interactions, 21(8), 1-23.

Bentler, P. M. (1990), Comparative fit indexes in structural models, Psychological Bulletin, 107(2), 238.

Bernardo, J. M., and A. F. Smith (1994), Bayesian Theory, John Wiley & Sons.

Betts, A., and M. Miller (1986), A new convective adjustment scheme. Part II: Single column tests using GATE wave, BOMEX, ATEX and arctic air-mass data sets, Quarterly Journal of the Royal Meteorological Society, 112(473), 693-709.

Bhattacharya, R., S. Bordoni, and J. Teixeira (2017), Tropical precipitation extremes: Response to SST-induced warming in aquaplanet simulations, Geophysical Research Letters, 44(7), 3374-3383.

Blackadar, A. K. (1962), The vertical distribution of wind and turbulent exchange in a neutral atmosphere, Journal of Geophysical Research, 67(8), 3095-3102.

Boyle, J., and S. A. Klein (2010), Impact of horizontal resolution on climate model forecasts of tropical precipitation and diabatic heating for the TWP-ICE period, Journal of Geophysical Research: Atmospheres, 115(D23).

Boyles, R., A. Marshall, and F.
Proschan (1985), Inconsistency of the maximum likelihood estimator of a distribution having increasing failure rate average, The Annals of Statistics, 13(1), 413-417.

Bretherton, C. S., and S. Park (2009), A new moist turbulence parameterization in the Community Atmosphere Model, Journal of Climate, 22(12), 3422-3448.

Brinkop, S., and E. Roeckner (1995), Sensitivity of a general circulation model to parameterizations of cloud-turbulence interactions in the atmospheric boundary layer, Tellus A, 47(2), 197-220.

Brown, J. R., C. Jakob, and J. M. Haynes (2010), An evaluation of rainfall frequency and intensity over the Australian region in a global climate model, Journal of Climate, 23(24), 6504-6525.

Burnham, K. P., and D. R. Anderson (2004), Multimodel inference: understanding AIC and BIC in model selection, Sociological Methods & Research, 33(2), 261-304.

Cairo, A. (2016), The Truthful Art: Data, Charts, and Maps for Communication, New Riders.

Caldwell, P., H.-N. S. Chin, D. C. Bader, and G. Bala (2009), Evaluation of a WRF dynamical downscaling simulation over California, Climatic Change, 95(3-4), 499-521.

Cattiaux, J., and A. Ribes (2018), Defining single extreme weather events in a climate perspective, Bulletin of the American Meteorological Society, (2018).

Catto, J. L., and S. Pfahl (2013), The importance of fronts for extreme precipitation, Journal of Geophysical Research: Atmospheres, 118(19).

Cha, D.-H., and D.-K. Lee (2009), Reduction of systematic errors in regional climate simulations of the summer monsoon over East Asia and the western North Pacific by applying the spectral nudging technique, Journal of Geophysical Research: Atmospheres, 114(D14).

Chakraborty, A., and T. Krishnamurti (2006), Improved seasonal climate forecasts of the South Asian summer monsoon using a suite of 13 coupled ocean-atmosphere models, Monthly Weather Review, 134(6), 1697-1721.

Changnon, S.
(1997), Trends in hail in the United States, in Proceedings of the Workshop on the Social and Economic Impacts of Weather, National Center for Atmospheric Research, Boulder, CO, pp. 19-34.

Chen, C.-T., and T. Knutson (2008), On the verification and comparison of extreme rainfall indices from climate models, Journal of Climate, 21(7), 1605-1621.

Choi, H. I., and X.-Z. Liang (2010), Improved terrestrial hydrologic representation in mesoscale land surface models, Journal of Hydrometeorology, 11(3), 797-809.

Choi, H. I., P. Kumar, and X.-Z. Liang (2007), Three-dimensional volume-averaged soil moisture transport model with a scalable parameterization of subgrid topographic variability, Water Resources Research, 43(4).

Choi, H. I., X.-Z. Liang, and P. Kumar (2013), A conjunctive surface-subsurface flow representation for mesoscale land surface models, Journal of Hydrometeorology, 14(5), 1421-1442.

Choi, I.-J., E. K. Jin, J.-Y. Han, S.-Y. Kim, and Y. Kwon (2015), Sensitivity of diurnal variation in simulated precipitation during East Asian summer monsoon to cumulus parameterization schemes, Journal of Geophysical Research: Atmospheres, 120(23), 11,971.

Chou, C., J. C. Chiang, C.-W. Lan, C.-H. Chung, Y.-C. Liao, and C.-J. Lee (2013), Increase in the range between wet and dry season precipitation, Nature Geoscience, 6(4), 263.

Chou, M.-D., and M. J. Suarez (1999), A Solar Radiation Parameterization for Atmospheric Studies, Volume 15.

Chou, M.-D., M. J. Suarez, X.-Z. Liang, M. M.-H. Yan, and C. Cote (2001), A thermal infrared radiation parameterization for atmospheric studies.

Christensen, J. H., F. Boberg, O. B. Christensen, and P. Lucas-Picher (2008), On the need for bias correction of regional climate change projections of temperature and precipitation, Geophysical Research Letters, 35(20).

Christensen, O. B., M. Drews, J. H. Christensen, K. Dethloff, K. Ketelsen, I. Hebestadt, and A. Rinke (2007), The HIRHAM regional climate model. Version 5 (beta).
Coppola, E., F. Giorgi, S. Rauscher, and C. Piani (2010), Model weighting based on mesoscale structures in precipitation and temperature in an ensemble of regional climate models, Climate Research, 44(2-3), 121-134.

Coumou, D., and S. Rahmstorf (2012), A decade of weather extremes, Nature Climate Change, 2(7), 491-496.

Cuijpers, J., and P. Duynkerke (1993), Large eddy simulation of trade wind cumulus clouds, Journal of the Atmospheric Sciences, 50(23), 3894-3908.

Curriero, F. C., J. A. Patz, J. B. Rose, and S. Lele (2001), The association between extreme precipitation and waterborne disease outbreaks in the United States, 1948-1994, American Journal of Public Health, 91(8), 1194-1199.

Cuxart, J., P. Bougeault, and J.-L. Redelsperger (2000), A turbulence scheme allowing for mesoscale and large-eddy simulations, Quarterly Journal of the Royal Meteorological Society, 126(562), 1-30.

Dai, A. (2006), Precipitation characteristics in eighteen coupled climate models, Journal of Climate, 19(18), 4605-4630.

Dai, A., G. A. Meehl, W. M. Washington, T. M. Wigley, and J. M. Arblaster (2001), Ensemble simulation of twenty-first century climate changes: Business-as-usual versus CO2 stabilization, Bulletin of the American Meteorological Society, 82(11), 2377-2388.

Dai, Y., X. Zeng, R. E. Dickinson, I. Baker, G. B. Bonan, M. G. Bosilovich, A. S. Denning, P. A. Dirmeyer, P. R. Houser, and G. Niu (2003), The common land model, Bulletin of the American Meteorological Society, 84(8), 1013-1024.

Daly, C., G. Taylor, and W. Gibson (1997), The PRISM approach to mapping precipitation and temperature, in Proc. 10th AMS Conf. on Applied Climatology, pp. 20-23, Citeseer.

Daniels, A. E., J. F. Morrison, L. A. Joyce, N. L. Crookston, S.-C. Chen, and S. G. McNulty (2012), Climate projections FAQ.

Davis, C. A., B. G. Brown, R. Bullock, and J.
Halley-Gotway (2009), The method for object-based diagnostic evaluation (MODE) applied to numerical forecasts from the 2005 NSSL/SPC Spring Program, Weather and Forecasting, 24(5), 1252–1267.
Dee, D., S. Uppala, A. Simmons, P. Berrisford, P. Poli, S. Kobayashi, U. Andrae, M. Balmaseda, G. Balsamo, and P. Bauer (2011), The ERA-Interim reanalysis: Configuration and performance of the data assimilation system, Quarterly Journal of the Royal Meteorological Society, 137(656), 553–597.
Delage, Y. (1997), Parameterising sub-grid scale vertical transport in atmospheric models under statically stable conditions, Boundary-Layer Meteorology, 82(1), 23–48.
Dickinson, E., A. Henderson-Sellers, and J. Kennedy (1993), Biosphere-Atmosphere Transfer Scheme (BATS) version 1e as coupled to the NCAR Community Climate Model.
Dittus, A. J., D. J. Karoly, S. C. Lewis, L. V. Alexander, and M. G. Donat (2016), A multiregion model evaluation and attribution study of historical changes in the area affected by temperature and precipitation extremes, Journal of Climate, 29(23), 8285–8299.
Doelling, D. R., R. Bhatt, B. R. Scarino, A. Gopalan, C. O. Haney, P. Minnis, and K. M. Bedka (2016), A consistent AVHRR visible calibration record based on multiple methods applicable for the NOAA degrading orbits. Part II: Validation, Journal of Atmospheric and Oceanic Technology, 33(11), 2517–2534.
Donat, M. G., L. V. Alexander, N. Herold, and A. J. Dittus (2016), Temperature and precipitation extremes in century-long gridded observations, reanalyses, and atmospheric model simulations, Journal of Geophysical Research: Atmospheres, 121(19), 11,174.
Doswell III, C. A., H. E. Brooks, and R. A. Maddox (1996), Flash flood forecasting: An ingredients-based methodology, Weather and Forecasting, 11(4), 560–581.
Durre, I., M. J. Menne, B. E. Gleason, T. G. Houston, and R. S.
Vose (2010), Comprehensive automated quality assurance of daily surface observations, Journal of Applied Meteorology and Climatology, 49(8), 1615–1633.
Easterling, D. R., G. A. Meehl, C. Parmesan, S. A. Changnon, T. R. Karl, and L. O. Mearns (2000a), Climate extremes: observations, modeling, and impacts, Science, 289(5487), 2068–2074.
Easterling, D. R., J. Evans, P. Y. Groisman, and T. Karl (2000b), Observed variability and trends in extreme climate events: a brief review, Bulletin of the American Meteorological Society, 81(3), 417.
Ek, M., K. Mitchell, Y. Lin, E. Rogers, P. Grunmann, V. Koren, G. Gayno, and J. Tarpley (2003), Implementation of Noah land surface model advances in the National Centers for Environmental Prediction operational mesoscale Eta model, Journal of Geophysical Research: Atmospheres, 108(D22).
Evans, J. P., M. Ekström, and F. Ji (2012), Evaluating the performance of a WRF physics ensemble over South-East Australia, Climate Dynamics, 39(6), 1241–1258.
Fan, J., T.-C. Hu, and Y. K. Truong (1994), Robust non-parametric function estimation, Scandinavian Journal of Statistics, pp. 433–446.
Ferguson, T. S. (1982), An inconsistent maximum likelihood estimate, Journal of the American Statistical Association, 77(380), 831–834.
Fischer, E. M., U. Beyerle, and R. Knutti (2013), Robust spatially aggregated projections of climate extremes, Nature Climate Change, 3(12), 1033.
Fraley, C., A. E. Raftery, and T. Gneiting (2010), Calibrating multimodel forecast ensembles with exchangeable and missing members using Bayesian model averaging, Monthly Weather Review, 138(1), 190–202.
Francq, C., and J.-M. Zakoïan (2010), Inconsistency of the MLE and inference based on weighted LS for LARCH models, Journal of Econometrics, 159(1), 151–165.
Frei, C., J. H. Christensen, M. Déqué, D. Jacob, R. G. Jones, and P. L.
Vidale (2003), Daily precipitation statistics in regional climate models: Evaluation and intercomparison for the European Alps, Journal of Geophysical Research: Atmospheres, 108(D3).
Frich, P., L. V. Alexander, P. Della-Marta, B. Gleason, M. Haylock, A. K. Tank, and T. Peterson (2002), Observed coherent changes in climatic extremes during the second half of the twentieth century, Climate Research, 19(3), 193–212.
Fu, Q., and K. Liou (1992), On the correlated k-distribution method for radiative transfer in nonhomogeneous atmospheres, Journal of the Atmospheric Sciences, 49(22), 2139–2156.
Fu, Q., and K. N. Liou (1993), Parameterization of the radiative properties of cirrus clouds, Journal of the Atmospheric Sciences, 50(13), 2008–2025.
Gan, Y., X.-Z. Liang, Q. Duan, H. I. Choi, Y. Dai, and H. Wu (2015), Stepwise sensitivity analysis from qualitative to quantitative: Application to the terrestrial hydrological modeling of a Conjunctive Surface-Subsurface Process (CSSP) land surface model, Journal of Advances in Modeling Earth Systems, 7(2), 648–669.
Gandin, L. S., and A. H. Murphy (1992), Equitable skill scores for categorical forecasts, Monthly Weather Review, 120(2), 361–370.
Ghil, M., P. Yiou, S. Hallegatte, B. Malamud, P. Naveau, A. Soloviev, P. Friederichs, V. Keilis-Borok, D. Kondrashov, and V. Kossobokov (2011), Extreme events: dynamics, statistics and prediction, Nonlinear Processes in Geophysics, 18(3), 295–350.
Giorgi, F., and L. O. Mearns (2002), Calculation of average, uncertainty range, and reliability of regional climate changes from AOGCM simulations via the "reliability ensemble averaging" (REA) method, Journal of Climate, 15(10), 1141–1158.
Giorgi, F., and C. Shields (1999), Tests of precipitation parameterizations available in latest version of NCAR regional climate model (RegCM) over continental United States, Journal of Geophysical Research: Atmospheres, 104(D6), 6353–6375.
Giorgi, F., C. Jones, and G. R.
Asrar (2009), Addressing climate information needs at the regional level: the CORDEX framework, World Meteorological Organization (WMO) Bulletin, 58(3), 175.
Giorgi, F., E. Coppola, F. Solmon, L. Mariotti, M. Sylla, X. Bi, N. Elguindi, G. Diro, V. Nair, and G. Giuliani (2012), RegCM4: model description and preliminary tests over multiple CORDEX domains, Climate Research, 52, 7–29.
Gleckler, P. J., K. E. Taylor, and C. Doutriaux (2008), Performance metrics for climate models, Journal of Geophysical Research: Atmospheres, 113(D6).
Gneiting, T., and A. E. Raftery (2007), Strictly proper scoring rules, prediction, and estimation, Journal of the American Statistical Association, 102(477), 359–378.
Gregory, D., J.-J. Morcrette, C. Jakob, A. Beljaars, and T. Stockdale (2000), Revision of convection, radiation and cloud schemes in the ECMWF Integrated Forecasting System, Quarterly Journal of the Royal Meteorological Society, 126(566), 1685–1710.
Grell, G. A., and D. Dévényi (2002), A generalized approach to parameterizing convection combining ensemble and data assimilation techniques, Geophysical Research Letters, 29(14), 38-1.
Groisman, P. Y., T. R. Karl, D. R. Easterling, R. W. Knight, P. F. Jamason, K. J. Hennessy, R. Suppiah, C. M. Page, J. Wibig, and K. Fortuniak (1999), Changes in the probability of heavy precipitation: important indicators of climatic change, Climatic Change, 42(1), 243–283.
Groisman, P. Y., R. W. Knight, D. R. Easterling, T. R. Karl, G. C. Hegerl, and V. N. Razuvaev (2005), Trends in intense precipitation in the climate record, Journal of Climate, 18(9), 1326–1350.
Han, J., and H.-L. Pan (2011), Revision of convection and vertical diffusion schemes in the NCEP global forecast system, Weather and Forecasting, 26(4), 520–533.
Harding, K. J., and P. K. Snyder (2015), The relationship between the Pacific–North American teleconnection pattern, the Great Plains low-level jet, and North Central US heavy rainfall events, Journal of Climate, 28(17), 6729–6742.
Hastings, W. K. (1970), Monte Carlo sampling methods using Markov chains and their applications.
Haylock, M., and N. Nicholls (2000), Trends in extreme rainfall indices for an updated high quality data set for Australia, 1910–1998, International Journal of Climatology, 20(13), 1533–1541.
Hegerl, G. C., E. Black, R. P. Allan, W. J. Ingram, D. Polson, K. E. Trenberth, R. S. Chadwick, P. A. Arkin, B. B. Sarojini, and A. Becker (2018), Challenges in quantifying changes in the global water cycle, Bulletin of the American Meteorological Society, 99(1).
Herman, G. R., and R. S. Schumacher (2016), Extreme precipitation in models: An evaluation, Weather and Forecasting, 31(6), 1853–1879.
Herold, N., L. Alexander, M. Donat, S. Contractor, and A. Becker (2016), How much does it rain over land?, Geophysical Research Letters, 43(1), 341–348.
Herold, N., A. Behrangi, and L. V. Alexander (2017), Large uncertainties in observed daily precipitation extremes over land, Journal of Geophysical Research: Atmospheres, 122(2), 668–681.
Herrera, S., L. Fita, J. Fernández, and J. M. Gutiérrez (2010), Evaluation of the mean and extreme precipitation regimes from the ENSEMBLES regional climate multimodel simulations over Spain, Journal of Geophysical Research: Atmospheres, 115(D21).
Higgins, R., Y. Yao, E. Yarosh, J. E. Janowiak, and K. Mo (1997), Influence of the Great Plains low-level jet on summertime precipitation and moisture transport over the central United States, Journal of Climate, 10(3), 481–507.
Hoffman, M. D., and A. Gelman (2014), The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo, Journal of Machine Learning Research, 15(1), 1593–1623.
Holton, J. R. (2012), An introduction to dynamic meteorology, American Journal of Physics.
Holtslag, A., and B.
Boville (1993), Local versus nonlocal boundary-layer diffusion in a global climate model, Journal of Climate, 6(10), 1825–1842.
Holtslag, A., E. De Bruijn, and H. Pan (1990), A high resolution air mass transformation model for short-range weather forecasting, Monthly Weather Review, 118(8), 1561–1575.
Hong, S.-Y., J. Dudhia, and S.-H. Chen (2004), A revised approach to ice microphysical processes for the bulk parameterization of clouds and precipitation, Monthly Weather Review, 132(1), 103–120.
Hooper, D., J. Coughlan, and M. Mullen (2008), Structural equation modelling: Guidelines for determining model fit, Articles, p. 2.
Iacono, M. J., J. S. Delamere, E. J. Mlawer, M. W. Shephard, S. A. Clough, and W. D. Collins (2008), Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models, Journal of Geophysical Research: Atmospheres, 113(D13).
Iguchi, T., W.-K. Tao, D. Wu, C. Peters-Lidard, J. A. Santanello, E. Kemp, Y. Tian, J. Case, W. Wang, and R. Ferraro (2017), Sensitivity of CONUS summer rainfall to the selection of cumulus parameterization schemes in NU-WRF seasonal simulations, Journal of Hydrometeorology, 18(6), 1689–1706.
Iorio, J., P. Duffy, B. Govindasamy, S. Thompson, M. Khairoutdinov, and D. Randall (2004), Effects of model resolution and subgrid-scale physics on the simulation of precipitation in the continental United States, Climate Dynamics, 23(3-4), 243–258.
Jamili, A. (2016), Robust job shop scheduling problem: Mathematical models, exact and heuristic algorithms, Expert Systems with Applications, 55, 341–350.
Janjić, Z. I. (1994), The step-mountain eta coordinate model: Further developments of the convection, viscous sublayer, and turbulence closure schemes, Monthly Weather Review, 122(5), 927–945.
Janjić, Z. I. (2000), Comments on "Development and evaluation of a convection scheme for use in climate models", Journal of the Atmospheric Sciences, 57(21), 3686–3686.
Kain, J. S.
(2004), The Kain–Fritsch convective parameterization: an update, Journal of Applied Meteorology, 43(1), 170–181.
Kain, J. S., and J. M. Fritsch (1993), Convective parameterization for mesoscale models: The Kain-Fritsch scheme, in The representation of cumulus convection in numerical models, pp. 165–170, Springer.
Kang, I.-S., Y.-M. Yang, and W.-K. Tao (2015), GCMs with implicit and explicit representation of cloud microphysics for simulation of extreme precipitation frequency, Climate Dynamics, 45(1-2), 325–335.
Karl, T. R., and R. W. Knight (1998), Secular trends of precipitation amount, frequency, and intensity in the United States, Bulletin of the American Meteorological Society, 79(2), 231–241.
Key, J. T., L. R. Pericchi, and A. F. Smith (1999), Bayesian model choice: what and why, Bayesian Statistics, 6, 343–370.
Kharin, V. V., and F. W. Zwiers (2002), Climate predictions with multimodel ensembles, Journal of Climate, 15(7), 793–799.
Kharin, V. V., F. W. Zwiers, X. Zhang, and G. C. Hegerl (2007), Changes in temperature and precipitation extremes in the IPCC ensemble of global coupled model simulations, Journal of Climate, 20(8), 1419–1444.
Khoei, A., and S. Gharehbaghi (2007), The superconvergence patch recovery technique and data transfer operators in 3D plasticity problems, Finite Elements in Analysis and Design, 43(8), 630–648.
Kirschbaum, D., R. Adler, D. Adler, C. Peters-Lidard, and G. Huffman (2012), Global distribution of extreme precipitation and high-impact landslides in 2010 relative to previous years, Journal of Hydrometeorology, 13(5), 1536–1551.
Kirtman, B. P., and D. Min (2009), Multimodel ensemble ENSO prediction with CCSM and CFS, Monthly Weather Review, 137(9), 2908–2930.
Kline, R. B. (2016), Principles and practice of structural equation modeling, Guilford Publications.
Krishnamurti, T., C. Kishtawal, T. E. LaRow, D. R. Bachiochi, Z. Zhang, C. E. Williford, S. Gadgil, and S.
Surendran (1999), Improved weather and seasonal climate forecasts from multimodel superensemble, Science, 285(5433), 1548–1550.
Kunkel, K. E., and X.-Z. Liang (2005), GCM simulations of the climate in the central United States, Journal of Climate, 18(7), 1016–1031.
Kunkel, K. E., K. Andsager, and D. R. Easterling (1999), Long-term trends in extreme precipitation events over the conterminous United States and Canada, Journal of Climate, 12(8), 2515–2527.
Kunkel, K. E., K. Andsager, X.-Z. Liang, R. W. Arritt, E. S. Takle, W. J. Gutowski Jr, and Z. Pan (2002), Observations and regional climate model simulations of heavy precipitation events and seasonal anomalies: A comparison, Journal of Hydrometeorology, 3(3), 322–334.
Kunkel, K. E., D. R. Easterling, D. A. Kristovich, B. Gleason, L. Stoecker, and R. Smith (2012), Meteorological causes of the secular variations in observed extreme precipitation events for the conterminous United States, Journal of Hydrometeorology, 13(3), 1131–1141.
Kunkel, K. E., T. R. Karl, H. Brooks, J. Kossin, J. H. Lawrimore, D. Arndt, L. Bosart, D. Changnon, S. L. Cutter, and N. Doesken (2013), Monitoring and understanding trends in extreme storms: State of knowledge, Bulletin of the American Meteorological Society, 94(4), 499–514.
Lesk, C., P. Rowhani, and N. Ramankutty (2016), Influence of extreme weather disasters on global crop production, Nature, 529(7584), 84–87.
Leung, L. R., Y. Qian, X. Bian, W. M. Washington, J. Han, and J. O. Roads (2004), Mid-century ensemble regional climate change scenarios for the western United States, Climatic Change, 62(1-3), 75–113.
Li, F., W. D. Collins, M. F. Wehner, D. L. Williamson, J. G. Olson, and C. Algieri (2011), Impact of horizontal resolution on simulation of precipitation extremes in an aqua-planet version of Community Atmospheric Model (CAM3), Tellus A: Dynamic Meteorology and Oceanography, 63(5), 884–892.
Li, J., and H.
Barker (2005), A radiation algorithm with correlated-k distribution. Part I: Local thermal equilibrium, Journal of the Atmospheric Sciences, 62(2), 286–309.
Li, J., and K. Shibata (2006), On the effective solar pathlength, Journal of the Atmospheric Sciences, 63(4), 1365–1373.
Li, Y., K. Guan, G. D. Schnitkey, E. DeLucia, and B. Peng (2019), Excessive rainfall leads to maize yield loss of a comparable magnitude to extreme drought in the United States, Global Change Biology.
Liang, X.-Z., and F. Zhang (2013), The cloud–aerosol–radiation (CAR) ensemble modeling system, Atmospheric Chemistry and Physics, 13(16), 8335–8364.
Liang, X.-Z., K. E. Kunkel, and A. N. Samel (2001), Development of a regional climate model for US Midwest applications. Part I: Sensitivity to buffer zone treatment, Journal of Climate, 14(23), 4363–4378.
Liang, X.-Z., L. Li, K. E. Kunkel, M. Ting, and J. X. Wang (2004a), Regional climate model simulation of US precipitation during 1982–2002. Part I: Annual cycle, Journal of Climate, 17(18), 3510–3529.
Liang, X.-Z., L. Li, A. Dai, and K. E. Kunkel (2004b), Regional climate model simulation of summer precipitation diurnal cycle over the United States, Geophysical Research Letters, 31(24).
Liang, X.-Z., M. Xu, W. Gao, K. Kunkel, J. Slusser, Y. Dai, Q. Min, P. R. Houser, M. Rodell, and C. B. Schaaf (2005a), Development of land surface albedo parameterization based on Moderate Resolution Imaging Spectroradiometer (MODIS) data, Journal of Geophysical Research: Atmospheres, 110(D11).
Liang, X.-Z., H. I. Choi, K. E. Kunkel, Y. Dai, E. Joseph, J. X. Wang, and P. Kumar (2005b), Surface boundary conditions for mesoscale regional climate models, Earth Interactions, 9(18), 1–28.
Liang, X.-Z., M. Xu, H. Choi, K. Kunkel, L. Rontu, J.-F. Geleyn, M. D. Müller, E. Joseph, and J. X.
Wang (2006), Development of the regional Climate-Weather Research and Forecasting model (CWRF): Treatment of subgrid topography effects, in Proceedings of the 7th Annual WRF User's Workshop, Boulder, CO, pp. 19–22.
Liang, X.-Z., M. Xu, K. E. Kunkel, G. A. Grell, and J. S. Kain (2007), Regional climate model simulation of US–Mexico summer precipitation using the optimal ensemble of two cumulus parameterizations, Journal of Climate, 20(20), 5201–5207.
Liang, X.-Z., K. E. Kunkel, G. A. Meehl, R. G. Jones, and J. X. Wang (2008), Regional climate models downscaling analysis of general circulation models present climate biases propagation into future change projections, Geophysical Research Letters, 35(8).
Liang, X.-Z., M. Xu, X. Yuan, T. Ling, H. I. Choi, F. Zhang, L. Chen, S. Liu, S. Su, and F. Qiao (2012a), Regional climate-weather research and forecasting model, Bulletin of the American Meteorological Society, 93(9), 1363–1387.
Liang, X.-Z., M. Xu, W. Gao, K. R. Reddy, K. Kunkel, D. L. Schmoldt, and A. N. Samel (2012b), A distributed cotton growth model developed from GOSSYM and its parameter determination, Agronomy Journal, 104(3), 661–674.
Liang, X.-Z., C. Sun, X. Zheng, Y. Dai, M. Xu, H. I. Choi, T. Ling, F. Qiao, X. Kong, and X. Bi (2018), CWRF performance at downscaling China climate characteristics, Climate Dynamics, pp. 1–26.
Lim, K.-S. S., and S.-Y. Hong (2010), Development of an effective double-moment cloud microphysics scheme with prognostic cloud condensation nuclei (CCN) for weather and climate models, Monthly Weather Review, 138(5), 1587–1612.
Ling, T., X.-Z. Liang, M. Xu, Z. Wang, and B. Wang (2011), A multilevel ocean mixed-layer model for 2-dimension applications, Acta Oceanol. Sin., 33(3), 1–10.
Ling, T., M. Xu, X.-Z. Liang, J. X. Wang, and Y. Noh (2015), A multilevel ocean mixed layer model resolving the diurnal cycle: Development and validation, Journal of Advances in Modeling Earth Systems, 7(4), 1680–1692.
Louis, J.-F.
(1979), A parametric model of vertical eddy fluxes in the atmosphere, Boundary-Layer Meteorology, 17(2), 187–202.
Mahoney, K., M. Alexander, J. D. Scott, and J. Barsugli (2013), High-resolution downscaled simulations of warm-season extreme precipitation events in the Colorado Front Range under past and future climates, Journal of Climate, 26(21), 8671–8689.
Martynov, A., R. Laprise, L. Sushama, K. Winger, L. Šeparović, and B. Dugas (2013), Reanalysis-driven climate simulation over CORDEX North America domain using the Canadian Regional Climate Model, version 5: model performance evaluation, Climate Dynamics, 41(11-12), 2973–3005.
May, W. (2004), Simulation of the variability and extremes of daily rainfall during the Indian summer monsoon for present and future times in a global time-slice experiment, Climate Dynamics, 22(2-3), 183–204.
McInnes, K., K. Walsh, G. Hubbert, and T. Beer (2003), Impact of sea-level rise and storm surges on a coastal community, Natural Hazards, 30(2), 187–207.
Meehl, G. A., L. Goddard, J. Murphy, R. J. Stouffer, G. Boer, G. Danabasoglu, K. Dixon, M. A. Giorgetta, A. M. Greene, and E. Hawkins (2009), Decadal prediction: can it be skillful?, Bulletin of the American Meteorological Society, 90(10), 1467.
Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston (2012), An overview of the Global Historical Climatology Network-Daily database, Journal of Atmospheric and Oceanic Technology, 29(7), 897–910.
Min, S.-K., X. Zhang, F. W. Zwiers, and G. C. Hegerl (2011), Human contribution to more-intense precipitation extremes, Nature, 470(7334), 378–381.
Moradkhani, H., C. M. DeChant, and S. Sorooshian (2012), Evolution of ensemble data assimilation for uncertainty quantification using the particle filter-Markov chain Monte Carlo method, Water Resources Research, 48(12).
Morrison, H., and J.
Milbrandt (2010), Comparison of two-moment bulk microphysics schemes in idealized supercell thunderstorm simulations, Monthly Weather Review.
Morrison, H., and J. A. Milbrandt (2015), Parameterization of cloud microphysics based on the prediction of bulk ice particle properties. Part I: Scheme description and idealized tests, Journal of the Atmospheric Sciences, 72(1), 287–311.
Morrison, H., G. Thompson, and V. Tatarskii (2009), Impact of cloud microphysics on the development of trailing stratiform precipitation in a simulated squall line: Comparison of one- and two-moment schemes, Monthly Weather Review, 137(3), 991–1007.
Nakanishi, M., and H. Niino (2006), An improved Mellor–Yamada level-3 model: Its numerical stability and application to a regional prediction of advection fog, Boundary-Layer Meteorology, 119(2), 397–407.
Nakanishi, M., and H. Niino (2009), Development of an improved turbulence closure model for the atmospheric boundary layer, Journal of the Meteorological Society of Japan. Ser. II, 87(5), 895–912.
Niu, G.-Y., Z.-L. Yang, K. E. Mitchell, F. Chen, M. B. Ek, M. Barlage, A. Kumar, K. Manning, D. Niyogi, and E. Rosero (2011), The community Noah land surface model with multiparameterization options (Noah-MP): 1. Model description and evaluation with local-scale measurements, Journal of Geophysical Research: Atmospheres, 116(D12).
NOAA (2018), NOAA National Centers for Environmental Information (NCEI) U.S. Billion-Dollar Weather and Climate Disasters.
NOAA (2019), NOAA National Centers for Environmental Information (NCEI) U.S. Billion-Dollar Weather and Climate Disasters.
Nordeng, T. E. (1994), Extended versions of the convective parametrization scheme at ECMWF and their impact on the mean and transient activity of the model in the tropics, Research Department Technical Memorandum, 206, 1–41.
Oleson, K., G.-Y. Niu, Z.-L. Yang, D. Lawrence, P. Thornton, P. Lawrence, R. Stöckli, R. Dickinson, G. Bonan, and S.
Levis (2008), Improvements to the Community Land Model and their impact on the hydrological cycle, Journal of Geophysical Research: Biogeosciences, 113(G1).
Pal, J. S., E. E. Small, and E. A. Eltahir (2000), Simulation of regional-scale water and energy budgets: Representation of subgrid cloud and precipitation processes within RegCM, Journal of Geophysical Research: Atmospheres, 105(D24), 29,579–29,594.
Palmer, T. N. (2000), Predicting uncertainty in forecasts of weather and climate, Reports on Progress in Physics, 63(2), 71.
Park, S., and C. S. Bretherton (2009), The University of Washington shallow convection and moist turbulence schemes and their impact on climate simulations with the Community Atmosphere Model, Journal of Climate, 22(12), 3449–3469.
Pathirana, A., H. B. Denekew, W. Veerbeek, C. Zevenbergen, and A. T. Banda (2014), Impact of urban growth-driven landuse change on microclimate and extreme precipitation: A sensitivity study, Atmospheric Research, 138, 59–72.
Pendergrass, A. G., and D. L. Hartmann (2014), The atmospheric energy constraint on global-mean precipitation change, Journal of Climate, 27(2), 757–768.
Pfahl, S., P. A. O'Gorman, and E. M. Fischer (2017), Understanding the regional pattern of projected future changes in extreme precipitation, Nature Climate Change, 7(6), 423.
Pielke, R., and C. Landsea (1998), Normalized hurricane damages in the United States: 1925–95, Weather and Forecasting, 13, 621.
Pielke, R. A. (1999), Nine fallacies of floods, Climatic Change, 42(2), 413–438.
Pihur, V., S. Datta, and S. Datta (2009), RankAggreg, an R package for weighted rank aggregation, BMC Bioinformatics, 10(1), 62.
Platnick, S., M. D. King, K. G. Meyer, G. Wind, N. Amarasinghe, B. Marchant, G. T. Arnold, Z. Zhang, P. A. Hubanks, and B. Ridgway (2015), MODIS cloud optical properties: User guide for the Collection 6 Level-2 MOD06/MYD06 product and associated Level-3 Datasets, Version, 1, 145.
Pleim, J. E.
(2007), A combined local and nonlocal closure model for the atmospheric boundary layer. Part I: Model description and testing, Journal of Applied Meteorology and Climatology, 46(9), 1383–1395.
Potts, J., C. Folland, I. Jolliffe, and D. Sexton (1996), Revised "LEPS" scores for assessing climate model simulations and long-range forecasts, Journal of Climate, 9(1), 34–53.
Preacher, K. J., and A. F. Hayes (2008), Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models, Behavior Research Methods, 40(3), 879–891.
Prein, A. F., R. M. Rasmussen, K. Ikeda, C. Liu, M. P. Clark, and G. J. Holland (2017), The future intensification of hourly precipitation extremes, Nature Climate Change, 7(1), 48.
Qian, Y., H. Yan, Z. Hou, G. Johannesson, S. Klein, D. Lucas, R. Neale, P. Rasch, L. Swiler, and J. Tannahill (2015), Parametric sensitivity analysis of precipitation at global and local scales in the Community Atmosphere Model CAM5, Journal of Advances in Modeling Earth Systems, 7(2), 382–411.
Qiao, F., and X.-Z. Liang (2015), Effects of cumulus parameterizations on predictions of summer flood in the Central United States, Climate Dynamics, 45(3-4), 727–744.
Qiao, F., and X.-Z. Liang (2016), Effects of cumulus parameterization closures on simulations of summer precipitation over the United States coastal oceans, Journal of Advances in Modeling Earth Systems, 8(2), 764–785.
Qiao, F., and X.-Z. Liang (2017), Effects of cumulus parameterization closures on simulations of summer precipitation over the continental United States, Climate Dynamics, 49(1-2), 225–247.
Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski (2005), Using Bayesian model averaging to calibrate forecast ensembles, Monthly Weather Review, 133(5), 1155–1174.
Randles, C., A. Da Silva, V. Buchard, P. Colarco, A. Darmenov, R. Govindaraju, A. Smirnov, B. Holben, R. Ferrare, and J. Hair (2017), The MERRA-2 aerosol reanalysis, 1980 onward.
Part I: System description and data assimilation evaluation, Journal of Climate, 30(17), 6823–6850.
Rasch, P., and J. Kristjánsson (1998), A comparison of the CCM3 model climate using diagnosed and predicted condensate parameterizations, Journal of Climate, 11(7), 1587–1614.
Reynolds, R. W., T. M. Smith, C. Liu, D. B. Chelton, K. S. Casey, and M. G. Schlax (2007), Daily high-resolution-blended analyses for sea surface temperature, Journal of Climate, 20(22), 5473–5496.
Roeckner, E., K. Arpe, L. Bengtsson, M. Christoph, M. Claussen, L. Dümenil, M. Esch, M. A. Giorgetta, U. Schlese, and U. Schulzweida (1996), The atmospheric general circulation model ECHAM-4: Model description and simulation of present-day climate.
Roeckner, E., G. Bäuml, L. Bonaventura, R. Brokopf, M. Esch, M. Giorgetta, S. Hagemann, I. Kirchner, L. Kornblueh, and E. Manzini (2003), The atmospheric general circulation model ECHAM 5. Part I: Model description.
Rosseel, Y. (2012), lavaan: An R package for structural equation modeling and more. Version 0.5-12 (BETA), Journal of Statistical Software, 48(2), 1–36.
Rubin, D. B. (1981), The Bayesian bootstrap, The Annals of Statistics, pp. 130–134.
Samuelsson, P., S. Gollvik, and A. Ullerstig (2006), The land-surface scheme of the Rossby Centre regional atmospheric climate model (RCA3), SMHI.
Samuelsson, P., C. G. Jones, U. Willén, A. Ullerstig, S. Gollvik, U. Hansson, E. Jansson, C. Kjellström, G. Nikulin, and K. Wyser (2011), The Rossby Centre Regional Climate model RCA3: model description and performance, Tellus A: Dynamic Meteorology and Oceanography, 63(1), 4–23.
Schär, C., N. Ban, E. M. Fischer, J. Rajczak, J. Schmidli, C. Frei, F. Giorgi, T. R. Karl, E. J. Kendon, and A. M. K. Tank (2016), Percentile indices for assessing changes in heavy precipitation events, Climatic Change, pp. 1–16.
Schulz, J., P. Albert, H.-D. Behr, D. Caprion, H. Deneke, S. Dewitte, B. Dürr, P. Fuchs, A. Gratzki, and P.
Hechler (2009), Operational climate monitoring from space: the EUMETSAT Satellite Application Facility on Climate Monitoring (CM-SAF), Atmospheric Chemistry and Physics, 9(5), 1687–1709.
Scinocca, J., V. Kharin, Y. Jiao, M. Qian, M. Lazare, L. Solheim, G. Flato, S. Biner, M. Desgagne, and B. Dugas (2016), Coordinated global and regional climate modeling, Journal of Climate, 29(1), 17–35.
Semmler, T., and D. Jacob (2004), Modeling extreme precipitation events: a climate change simulation for Europe, Global and Planetary Change, 44(1), 119–127.
Siegert, S., C. A. Ferro, and D. B. Stephenson (2014), Evaluating ensemble forecasts by the Ignorance score: correcting the finite-ensemble bias, arXiv preprint arXiv:1410.8249.
Siler, N., and G. Roe (2014), How will orographic precipitation respond to surface warming? An idealized thermodynamic perspective, Geophysical Research Letters, 41(7), 2606–2613.
Sillmann, J., V. Kharin, X. Zhang, F. Zwiers, and D. Bronaugh (2013), Climate extremes indices in the CMIP5 multimodel ensemble: Part 1. Model evaluation in the present climate, Journal of Geophysical Research: Atmospheres, 118(4), 1716–1733.
Sillmann, J., T. Thorarinsdottir, N. Keenlyside, N. Schaller, L. V. Alexander, G. Hegerl, S. I. Seneviratne, R. Vautard, X. Zhang, and F. W. Zwiers (2017), Understanding, modeling and predicting weather and climate extremes: Challenges and opportunities, Weather and Climate Extremes, 18, 65–74.
Skamarock, W. C., J. B. Klemp, J. Dudhia, D. O. Gill, D. M. Barker, W. Wang, and J. G. Powers (2005), A description of the Advanced Research WRF version 2, Tech. rep., National Center for Atmospheric Research, Boulder, CO.
Sloughter, J. M. L., A. E. Raftery, T. Gneiting, and C. Fraley (2007), Probabilistic quantitative precipitation forecasting using Bayesian model averaging, Monthly Weather Review, 135(9), 3209–3220.
Smith, A. B., and J. L.
Matthews (2015), Quantifying uncertainty and variable sensitivity within the US billion-dollar weather and climate disaster cost estimates, Natural Hazards, 77(3), 1829–1851.
Smith, L. A. (2002), What might we learn from climate forecasts?, Proceedings of the National Academy of Sciences, 99, 2487–2492.
Stephens, G. L., T. L'Ecuyer, R. Forbes, A. Gettelman, J.-C. Golaz, A. Bodas-Salcedo, K. Suzuki, P. Gabriel, and J. Haynes (2010), Dreary state of precipitation in global models, Journal of Geophysical Research: Atmospheres, 115(D24).
Subin, Z. M., W. J. Riley, and D. Mironov (2012), An improved lake model for climate simulations: Model structure, evaluation, and sensitivity analyses in CESM1, Journal of Advances in Modeling Earth Systems, 4(1).
Sun, C., and X.-Z. Liang (submitted a), Improving U.S. extreme precipitation simulation: Sensitivity to physics parameterizations, Journal of Climate.
Sun, C., and X.-Z. Liang (submitted b), Improving U.S. extreme precipitation simulation: Dependence on cumulus parameterization and underlying mechanism, Journal of Climate.
Sun, D.-Z., J. Fasullo, T. Zhang, and A. Roubicek (2003), On the radiative and dynamical feedbacks over the equatorial Pacific cold tongue, Journal of Climate, 16(14), 2425–2432.
Sun, Y., S. Solomon, A. Dai, and R. W. Portmann (2006), How often does it rain?, Journal of Climate, 19(6), 916–934.
Sundqvist, H. (1978), A parameterization scheme for non-convective condensation including prediction of cloud water content, Quarterly Journal of the Royal Meteorological Society, 104(441), 677–690.
Tao, W.-K., J. Simpson, and M. McCumber (1989), An ice-water saturation adjustment, Monthly Weather Review, 117(1), 231–235.
Tao, W.-K., J. Simpson, D. Baker, S. Braun, M.-D. Chou, B. Ferrier, D. Johnson, A. Khain, S. Lang, and B. Lynn (2003), Microphysics, radiation and surface processes in the Goddard Cumulus Ensemble (GCE) model, Meteorology and Atmospheric Physics, 82(1), 97–137.
Taylor, K. E.
(2001), Summarizing multiple aspects of model performance in a single diagram, Journal of Geophysical Research: Atmospheres, 106(D7), 7183–7192.
Team, S. S. (2008), SRB Data, Hampton, VA, USA: NASA Atmospheric Science Data Center (ASDC).
Thompson, G., and T. Eidhammer (2014), A study of aerosol impacts on clouds and precipitation development in a large winter cyclone, Journal of the Atmospheric Sciences, 71(10), 3636–3658.
Thompson, G., R. M. Rasmussen, and K. Manning (2004), Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part I: Description and sensitivity analysis, Monthly Weather Review, 132(2), 519–542.
Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall (2008), Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part II: Implementation of a new snow parameterization, Monthly Weather Review, 136(12), 5095–5115.
Tiedtke, M. (1989), A comprehensive mass flux scheme for cumulus parameterization in large-scale models, Monthly Weather Review, 117(8), 1779–1800.
Tran, M.-N. (2011), A criterion for optimal predictive model selection, Communications in Statistics – Theory and Methods, 40(5), 893–906.
Trenberth, K. E., A. Dai, R. M. Rasmussen, and D. B. Parsons (2003), The changing character of precipitation, Bulletin of the American Meteorological Society, 84(9), 1205–1217.
Tripathi, O. P., and F. Dominguez (2013), Effects of spatial resolution in the simulation of daily and subdaily precipitation in the southwestern US, Journal of Geophysical Research: Atmospheres, 118(14), 7591–7605.
Turner, B. M., and P. B. Sederberg (2012), Approximate Bayesian computation with differential evolution, Journal of Mathematical Psychology, 56(5), 375–385.
USGCRP (2017), Climate Science Special Report: Fourth National Climate Assessment, Volume I, US Global Change Research Program, Washington, DC, USA.
Vehtari, A., A. Gelman, and J.
Gabry (2017), Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC, Statistics and Computing, 27(5), 1413–1432.
Verseghy, D., N. McFarlane, and M. Lazare (1993), CLASS – A Canadian land surface scheme for GCMs, II. Vegetation model and coupled runs, International Journal of Climatology, 13(4), 347–370.
Verseghy, D. L. (1991), CLASS – A Canadian land surface scheme for GCMs. I. Soil model, International Journal of Climatology, 11(2), 111–133.
Vrugt, J. A., C. Ter Braak, C. Diks, B. A. Robinson, J. M. Hyman, and D. Higdon (2009), Accelerating Markov chain Monte Carlo simulation by differential evolution with self-adaptive randomized subspace sampling, International Journal of Nonlinear Sciences and Numerical Simulation, 10(3), 273–290.
Wang, J., and V. R. Kotamarthi (2015), High-resolution dynamically downscaled projections of precipitation in the mid and late 21st century over North America, Earth's Future, 3(7), 268–288.
Wang, W., and N. L. Seaman (1997), A comparison study of convective parameterization schemes in a mesoscale model, Monthly Weather Review, 125(2), 252–278.
Watanabe, M., H. Shiogama, T. Yokohata, Y. Kamae, M. Yoshimori, T. Ogura, J. D. Annan, J. C. Hargreaves, S. Emori, and M. Kimoto (2012), Using a multiphysics ensemble for exploring diversity in cloud–shortwave feedback in GCMs, Journal of Climate, 25(15), 5416–5431.
Wehner, M. F., R. L. Smith, G. Bala, and P. Duffy (2010), The effect of horizontal resolution on simulation of very extreme US precipitation events in a global atmosphere model, Climate Dynamics, 34(2-3), 241–247.
Wilcox, E. M., and L. J. Donner (2007), The frequency of extreme rain events in satellite rain-rate estimates and an atmospheric general circulation model, Journal of Climate, 20(1), 53–69.
Wilson, D. R., A. Bushell, A. M. Kerr-Munslow, J. D. Price, C. J. Morcrette, and A. Bodas-Salcedo (2008), PC2: A prognostic cloud fraction and condensation scheme.
II: Climate model simulations, Quarterly Journal of the Royal Meteorological Society, 134(637), 2109–2125.
Wuebbles, D. J., K. Kunkel, M. Wehner, and Z. Zobel (2014), Severe weather in United States under a changing climate, Eos, Transactions American Geophysical Union, 95(18), 149–150.
Wuebbles, D. J., D. W. Fahey, and K. A. Hibbard (2017), Climate science special report: fourth national climate assessment, volume I.
Xie, J., and M. Zhang (2017), Role of internal atmospheric variability in the 2015 extreme winter climate over the North American continent, Geophysical Research Letters, 44(5), 2464–2471.
Xie, S., M. Zhang, J. S. Boyle, R. T. Cederwall, G. L. Potter, and W. Lin (2004), Impact of a revised convective triggering mechanism on Community Atmosphere Model, Version 2, simulations: Results from short-range weather forecasts, Journal of Geophysical Research: Atmospheres, 109(D14).
Xie, S.-P., C. Deser, G. A. Vecchi, M. Collins, T. L. Delworth, A. Hall, E. Hawkins, N. C. Johnson, C. Cassou, and A. Giannini (2015), Towards predictive understanding of regional climate change, Nature Climate Change.
Xu, K.-M., and D. A. Randall (1996), A semiempirical cloudiness parameterization for use in climate models, Journal of the Atmospheric Sciences, 53(21), 3084–3102.
Xu, M., X.-Z. Liang, A. Samel, and W. Gao (2014), MODIS consistent vegetation parameter specifications and their impacts on regional climate simulations, Journal of Climate, 27(22), 8578–8596.
Yang, Z.-L., G.-Y. Niu, K. E. Mitchell, F. Chen, M. B. Ek, M. Barlage, L. Longuevergne, K. Manning, D. Niyogi, and M. Tewari (2011), The community Noah land surface model with multiparameterization options (Noah-MP): 2. Evaluation over global river basins, Journal of Geophysical Research: Atmospheres, 116(D12).
Yao, Y., A. Vehtari, D. Simpson, and A. Gelman (2018), Using stacking to average Bayesian predictive distributions (with discussion), Bayesian Analysis, 13(3), 917–1003.
Yuan, X., and X.-Z.
Liang (2011), Evaluation of a Conjunctive Surface-Subsurface Process Model (CSSP) over the contiguous United States at regional-local scales, Journal of Hydrometeorology, 12(4), 579–599.
Yuan, X., X.-Z. Liang, and E. F. Wood (2012), WRF ensemble downscaling seasonal forecasts of China winter precipitation during 1982–2008, Climate Dynamics, 39(7-8), 2041–2058.
Zhang, C., Y. Wang, and K. Hamilton (2011a), Improved representation of boundary layer clouds over the southeast Pacific in ARW-WRF using a modified Tiedtke cumulus parameterization scheme, Monthly Weather Review, 139(11), 3489–3513.
Zhang, X., F. W. Zwiers, G. C. Hegerl, F. H. Lambert, N. P. Gillett, S. Solomon, P. A. Stott, and T. Nozawa (2007), Detection of human influence on twentieth-century precipitation trends, Nature, 448(7152), 461–465.
Zhang, X., L. Alexander, G. C. Hegerl, P. Jones, A. K. Tank, T. C. Peterson, B. Trewin, and F. W. Zwiers (2011b), Indices for monitoring changes in extremes based on daily temperature and precipitation data, Wiley Interdisciplinary Reviews: Climate Change, 2(6), 851–870.
Zhang, X., H. Wan, F. W. Zwiers, G. C. Hegerl, and S.-K. Min (2013), Attributing intensification of precipitation extremes to human influence, Geophysical Research Letters, 40(19), 5252–5257.
Zhu, J., and X.-Z. Liang (2013), Impacts of the Bermuda high on regional climate and ozone over the United States, Journal of Climate, 26(3), 1018–1032.
Zobel, Z., J. Wang, D. J. Wuebbles, and V. R. Kotamarthi (2018), Evaluations of high-resolution dynamically downscaled ensembles over the contiguous United States, Climate Dynamics, 50(3-4), 863–884.