ABSTRACT Title of Dissertation: ASSIMILATION OF PRECIPITATION AND NONLOCAL OBSERVATIONS IN THE LETKF, AND COMPARISON OF COUPLED DATA ASSIMILATION STRATEGIES WITH A COUPLED QUASI-GEOSTROPHIC ATMOSPHERE-OCEAN MODEL Cheng Da Doctor of Philosophy, 2022 Dissertation Directed by: Professor Eugenia Kalnay Department of Atmospheric and Oceanic Science Among the data assimilation methods, the Ensemble Kalman Filter (EnKF) has gained popularity due to its ease of implementation and incorporation of the ?errors of the day? [Kalnay, 2003]. While the EnKF can successfully assimilate a wide range of observations, it encounters difficulty handling two types of observations: a) observations with non-Gaussian errors such as hydrometeors and precipitation, and b) nonlocal (i.e., path-integrated) observations such as radiance, both of which are vital for weather monitoring and forecasting, since non-Gaussian observations are often associated with severe weather, and nonlocal observations contribute the most to the improved weather forecast skill in the modern assimilation systems. The satellite mission, the Global Precipitation Measurement (GPM), provides several products belonging to these two types of observations since its launch in 2014. Different strategies are developed in this dissertation to assimilate these two types of observations in the EnKF system. To assimilate GPM surface precipitation with non-Gaussian errors, we extended the Gaussian transformation approach developed by Lien et al. [2013, 2016a,b] to a regional model. We transformed the observed and modeled precipitation into Gaussian variables (anamorphosis), whose errors also become more Gaussian. We then allowed the transformed precipitation to adjust the dynamic variables and hydrometeors directly through the ensemble error covariance in the EnKF so that the model could ?remember? the correct dynamics. Four typhoon cases in 2015 were studied to investigate the impact of GPM precipitation assimilation on typhoon forecast. Results show that model analysis by additional precipitation assimilation agrees more favorably with various independent observations, which leads to an improved typhoon forecast up to 72 hours. Localizing nonlocal observations in the EnKF is another challenging problem. Observation localization is needed in the EnKF to reduce sampling errors caused by the small ensemble size. Unlike conventional observations with single observed locations, those nonlocal observations such as radiance are path-integrated measurements and do not have single observed locations. One common empirical single-layer vertical localization (SLVL) approach localizes nonlocal observations at their weighting function (WF) peaks with symmetric Gaussian-shape localization functions. While the SLVL approach is appropriate for observations with symmetric Gaussian-shaped WFs, it might have difficulty handling observations properly with broad asymmetric WFs or multiple WF peaks, which are typical for clear-sky radiance from sounding or trace-gas sensitive channels of hyperspectral infrared sensors. A multi-layer vertical localization (MLVL) method is developed as an extension of the SLVL, which explicitly considers the WF shape in the formulation and generates the localization value based on the cumulative influences from all components that constitute the nonlocal observations. Observing system simulation experiments assimilating 1-D and 3-D nonlocal observations show that the MLVL has comparable or better performance than the SLVL when assimilating narrow-WF observations, and superior performance than the SLVL when assimilating observations with broad WFs or multiple WF peaks. In the last part, we switch our focus to coupled data assimilation in preparation for assimilating GPM precipitation into different earth components through strongly-coupled data assimilation. Few studies have systematically compared ensemble and variational methods with different coupled data assimilation (CDA) strategies (i.e., uncoupled DA (UCDA), weakly-coupled DA (WCDA), and strongly-coupled DA (SCDA)) for coupled models, though such comparison are essential to understand different methods and have been extensively conducted for uncoupled models. We developed a coupled data assimilation testbed for a coupled quasi-geostrophic atmosphere-ocean model that allows systematic comparison between ensemble and variational methods under different CDA strategies. Results show that WCDA and SCDA improve the coupled analysis compared with UCDA for both 3D-Var and ETKF. It is found that the ocean analysis by SC ETKF is consistently better than the one by WC ETKF, a phenomenon not observed for the 3D-Var method. Different SCDA methods are then compared together under different observation networks. When both atmosphere and ocean observations are assimilated, the SC incremental 4D-Var and ETKF share a similar analysis RMSE smaller than SC 3D- Var, for both atmosphere and ocean. An ECMWF CERA-like assimilation system, which adopts the outer-loop-coupling approach instead of utilizing the coupled-state background error covariance, achieves a similar RMSE as the SC 4D-Var and ETKF. When only atmospheric observations are assimilated, all variational-based DA methods using static background error covariance fail to stabilize the RMSE for the oceanwithin the experiment periods (about 27.4 years), while the flow-dependent ETKF does stabilize the analysis after about 10 years. Among all the variational systems, CERA shows larger ocean analysis RMSE than SC 3D-Var and 4D-Var, which indicates the outer-loop-coupling alone is not enough to replace the role of a coupled-state background error covariance. ASSIMILATION OF PRECIPITATION AND NONLOCAL OBSERVATIONS IN THE LETKF, AND COMPARISON OF COUPLED DATA ASSIMILATION STRATEGIES WITH A COUPLED QUASI-GEOSTROPHIC ATMOSPHERE-OCEAN MODEL by Cheng Da Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2022 Advisory Committee: Professor Eugenia Kalnay, Chair/Advisor Dr. Tse-chun Chen, Co-Chair/Co-Advisor Professor Brian Hunt Professor James Carton Professor Jonathan Poterjoy Dr. Safa Mote ? Copyright by Cheng Da 2022 Dedication This dissertation is dedicated to my hero Professor Eugenia Kalnay. ii Acknowledgments First, I would like to express my deepest gratitude to my advisor, Prof. Eugenia Kalnay. Eugenia saved me out during the darkest period in my life. I still remember the moment vividly when she said that she would be willing to be my advisor near the corridor of the 4th floor in the Atlantic building. My life has changed entirely since then. During my Ph.D., Eugenia gave me infinite freedom to explore different DA topics. Without meeting her, my understanding of DA would be limited to observation impact assessment. Neither would I get the chance to learn other fascinating and important topics such as EFSO and bred vectors. Eugenia has taught me more than knowledge. For many NWP and DA problems, Eugenia always has simple solutions different from others, which later proved to be very effective. From Eugenia, I learned that I should not be worried about proposing different methods in my research as long as it works. Eugenia is not only a great scientist with a tremendous passion for research but also the best advisor that I can ever work with. She is highly supportive of all her students. More importantly, she sees the good parts of every student and always trusts and encourages her students. I had lost faith in myself and my research several times. But every time after discussion with Eugenia, I regained my confidence to keep going. I am also fortunate to meet Jorge Rivas, the son of Eugenia. Both Jorge and Eugenia are very talented people who are knowledgeable and insightful, but they always have enormous patience to listen to others, and then share their iii opinions without judging others. Before meeting them, my life experience taught me that the arrogance of a person grows with her/his knowledge. Jorge and Eugenia completely changed my biased view and let me realize what kind of person I want to become. I would like to thank my co-advisor Dr. Tse-chun Chen, who is always happy to listen to my ideas on DA and everything else. Besides, he helps me realize that many research topics are far more interesting than observation impact assessment. When I joined the lab, I was very technical and only cared about DA applications. Through the discussion with him, I gradually shifted my interests to DA methods and started to gain knowledge from other disciplines. I am thankful to Profs. Brian Hunt, James Carton, Jonathan Poterjoy and Dr. Safa Mote for being my committee members. In addition, I would like to thank Prof. Brian Hunt for his elegant LETKF paper that helps a VarDA infidel transits smoothly to the ensemble DA, Prof. James Carton for his guidance on assimilating altimeter observations, Prof. Jonathan Poterjoy for all his inspiring discussions on my work, and Dr. Safa Mote for his introduction of Julia and Zotero. I am grateful to Prof. Takemasa Miyoshi for providingmewith the precious internship opportunity at RIKEN, working on precipitation assimilation on typhoon forecasts. Without the starting work done at RIKEN, I can never get the NASANESSF fellowship. I am also grateful to Prof. Steve Penny for his wonderful class AOSC658E about coupled data assimilation and his numerous suggestions for my research and career. I would like to thank Professor Kayo Ide for organizing the Weather- chaos events, which are essentially my DA happy hours every semester. My sincere thanks are also to Prof. Fuqing Zhang for all his advice on my research and career. It is a privilege to work with talented players, especially those with good tempers. iv I would like to thank my talented unofficial mentors in Eugenia?s group, Drs Guo-yuan Lien (precipitation DA, coding), Travis Sluka (ocean DA, coding), Takuma Yoshida (error growth, LV, SV, BV), Kriti Bhargava (model bias, IAU), and Eviatar Bach (computer graphics), who all shared their DA knowledge with me. I also thank Dr. Luyu Sun and other members in the Weather-chaos group for their exciting presentations and stimulating discussions on DA. I would like to thank Lei Zhang, Chu-chun Chang, and Drs. Xiaoxu Tian, William Miller and Sarah Benish at AOSC, and my former lab members at Florida State University. Out of campus, I am grateful to Drs. Ji Zhang, Cong Zhang, andMengtao Yin and Ms. Lu Ren for their friendships in my life. I know that all my happy years at UMD mentioned above could not be possible if not for the help of Prof. James Carton, Tamara Hendershot at AOSC, and Linda Carter at ESSIC in 2015. I?m very grateful to them. Near the end, I would like to express my deepest gratitude to my parents Feng Da and Xiaobing Cheng, and my wife Qing Zhu, for their unconditional love and support. Finally, I would like to thank all my silent peers macbook 13/15, skywork, K- computer, Deepthought2, Aditya for running my experiments day and night without any complaints. And thank you, my lucky satellite the Global Precipitation Measurement. May you live long and prosper. v Table of Contents Dedication ii Acknowledgements iii Table of Contents vi List of Tables ix List of Figures x List of Abbreviations xvi Chapter 1: Introduction 1 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1.1 Non-Gaussian data assimilation . . . . . . . . . . . . . . . . . . 2 1.1.2 Precipitation assimilation with Gaussian transformation . . . . . 4 1.1.3 Vertical localization for nonlocal observations . . . . . . . . . . . 5 1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Chapter 2: Improved Tropical Cyclone Predictions by Assimilation of Satellite- retrieved Surface Precipitation with Gaussian Transformation 9 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2 Overview of four typhoon cases . . . . . . . . . . . . . . . . . . . . . . 13 2.2.1 Typhoon Chan-hom . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.2.2 Typhoon Nangka . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2.3 Typhoon Soudelor . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2.4 Typhoon Goni . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.3 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.3.1 Model configuration . . . . . . . . . . . . . . . . . . . . . . . . 16 2.3.2 Observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3.3 Precipitation assimilation with Gaussian transformation . . . . . 20 2.3.4 Experiment setup . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.4.1 Impact of precipitation assimilation on analysis and 6-hour forecast 23 2.4.2 Overall track and intensity forecast error . . . . . . . . . . . . . . 33 2.4.3 Importance of precipitation retrievals over typhoon regions . . . . 39 vi 2.4.4 Verification against GPS-RO retrieved wet profiles over typhoon regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2.4.5 Impact of precipitation CDF on typhoon forecast errors . . . . . . 47 2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Chapter 3: Multi-layer Observation Localization for Nonlocal Observations in the Local Ensemble Transform Kalman Filter. 53 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.2.1 The equivalent form of the LETKF without vertical localization . 59 3.2.2 Interpretation of existing methods and the MLVL . . . . . . . . . 64 3.2.3 Implementation of MLVL in the LETKF . . . . . . . . . . . . . . 67 3.2.4 Extension to other sequential-observation-processing EnSRFs . . 69 3.3 MLVL applied to the selected AMSU-A and CrIS channels . . . . . . . . 71 3.3.1 AMSU-A observations . . . . . . . . . . . . . . . . . . . . . . . 71 3.3.2 CrIS observations . . . . . . . . . . . . . . . . . . . . . . . . . . 72 3.4 Experiment design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.4.1 The three-layer QG Model . . . . . . . . . . . . . . . . . . . . . 76 3.4.2 Observation network . . . . . . . . . . . . . . . . . . . . . . . . 77 3.4.3 Data assimilation settings . . . . . . . . . . . . . . . . . . . . . 78 3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 3.5.1 8-member experimentswhen assimilating observationswith narrow WFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 3.5.2 8-member experimentswhen assimilating observationswith broad WFs & multiple WF peaks . . . . . . . . . . . . . . . . . . . . . 84 3.5.3 Comparison of the best MLVL and SLVL results . . . . . . . . . 86 3.5.4 Sensitivity to the ensemble size and assigned weights in MLVL . 89 3.5.5 Experiments with 3-dimensional integrated observations . . . . . 92 3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 Chapter 4: Systematic Comparison of Coupled Data Assimilation Strategies for a Coupled Quasi-geostrophic Atmosphere-Ocean Model 99 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 4.1.1 EnKF-based CDA . . . . . . . . . . . . . . . . . . . . . . . . . 100 4.1.2 Variational-based CDA . . . . . . . . . . . . . . . . . . . . . . . 103 4.1.3 Objectives of this study . . . . . . . . . . . . . . . . . . . . . . . 106 4.2 Method: 3D-Var, 4D-Var and ETKF . . . . . . . . . . . . . . . . . . . . 108 4.2.1 3D-Var . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 4.2.2 Strong-constraint incremental 4D-Var . . . . . . . . . . . . . . . 110 4.2.3 ETKF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 4.3 Experiment design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 4.3.1 Coupled atmosphere-ocean model . . . . . . . . . . . . . . . . . 113 4.3.2 Data Assimilation settings . . . . . . . . . . . . . . . . . . . . . 114 4.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 vii 4.4.1 Background error covariance and correlationmatrix for the coupled states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 4.4.2 Comparison of CDA strategies for 3D-Var . . . . . . . . . . . . . 118 4.4.3 Comparison of CDA strategies for ETKF . . . . . . . . . . . . . 119 4.4.4 Comparison of SCDA methods for full observation network . . . 124 4.4.5 Comparison of SCDAmethodswhen only assimilating atmosphere observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 Chapter 5: Summary and Future Directions 137 5.1 Summary of the dissertation . . . . . . . . . . . . . . . . . . . . . . . . 137 5.2 Future directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 Appendix A: Statement of Originality 143 Bibliography 144 viii List of Tables 2.1 Settings for each typhoon case study in the experiment NoGSMaP . . . . 23 3.1 Observation error and operator for the local and nonlocal observations in the experiment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 3.2 Experiment sets included in different experiment groups. . . . . . . . . . 81 4.1 Average analysis RMSE (Units: 10?5) of 3D-VarOver the last 4?104??? (~11.0 Years) and of the ETKF after 2,500 MTU (for the remaining ~26.7 years) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 ix List of Figures 2.1 Best track of typhoons Chan-hom, Nangka, Soudelor and Goni over Western pacific in 2015. The white area shows the model domain adopted in this study. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.2 MSLP analysis at 00UTC on July 9 (1st row) and 06UTC on July 11 (2nd row) for the experiment NoGSMaP (1st column) and GSMaP (2nd column). The location of typhoons center from the RMSC best track is marked by the red plus sign. . . . . . . . . . . . . . . . . . . . . . . . . 25 2.3 10-mwind analysis at 00UTCon July 9 from the experiment (a)NoGSMaP, (b) GSMaP, and the wind retrievals from the ASCAT onboard Metop-B, and the 6-hour forecast validated at 12UTC on July 11 from the experiment (d) NoGSMaP, (b) GSMaP, and the wind retrievals from the ASCAT onboard Metop-B. The thin black box shows the swath covered by ASCAT. 26 2.4 The hourly precipitation at 00UTC on July 9 from the experiment (a) NoGSMaP, (b)GSMaP, and (e) the satellite-retrieved surface precipitation. Panels (b)(d)(f) are similar to panels (a)(c)(e) except they are for for the analysis at 06UC on July 11. . . . . . . . . . . . . . . . . . . . . . . . . 28 2.5 The hourly precipitation validated at 06UTCon July 9 from the experiment (a)NoGSMaP, (b)GSMaP, and (e) the satellite-retrieved surface precipitation. Panels (b)(d)(f) are similar to panels (a)(c)(e) except they are for for the hourly precipitation forecast validated at 12UTCon July 11. Both forecasts are initialized 6 hours ago from the ensemble analysis mean. . . . . . . . 30 2.6 Analysis of (a)-(b) precipitable water (unit: mm), (d)-(e) liquid water path, and (g)-(h) solid water path at 1800UTC on July 12 from the experiment NoGSMaP ((a),(d),(g)) andGSMaP ((b)(e)(h)). (c) shows the precipitation water based on ERA-Interim analysis. (f) and (i) shows the retrieved liquid water path and ice water path (??/?2) from GMI at around 1731UTC on July 12 in 2015. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 2.7 Mean absolute track error (unit: km) as a function of forecast lead time up to 72-hours for each individual typhoon in the experiment w/ (red) and w/o (black) assimilation of GSMaP surface precipitation. The number of 6-hour cycles used to calculate the error at each forecast lead time is indicated by the gray bar and the right axis. . . . . . . . . . . . . . . . . 34 x 2.8 Mean MSLP error (unit: hPa) as a function of forecast lead time up to 72-hours for each individual typhoon in the experiment w/ (red) and w/o (black) assimilation of GSMaP surface precipitation. The number of 6-hour cycles used to calculate the error at each forecast lead time is indicated by the gray bar and the right axis. . . . . . . . . . . . . . . . . 36 2.9 Mean absolute track error (left) and MSLP error (right) as a function of forecast lead time up to 72-hours for all four typhoons in the experiment w/ (red) and w/o (black) assimilation of GSMaP surface precipitation. The number of 6-hour cycles used to calculate the error at each forecast lead time is indicated by the gray bar and the right axis.. . . . . . . . . . . . . 37 2.10 Time series of the analysis MSLP (hPa) of typhoon (a) Nangka and (b) Soudelor for the experiment w/o (black) and w/ (red) surface precipitation assimilation. The 6-hour MSLP analysis are shown by the thick colored lines with open circles. The 72-hour forecast initialized from the ensemble mean analysis for the first 10 6-hour cycles in the figures are also shown in the figure thin colored lines. The RMSC best track data is shown by the thick gray lines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.11 120-hour track forecast initialized from the analysis at 00UTConAugust 2, 2015, for the experimentGSMaP (denoted asALL, purple), GSMaP_CORE (red) and GSMaP_ENV (blue). The best track is shown in black. The circles represent the forecasted typhoon location every 6 hours. Forecasts after 72-hours are in light colors in each track. . . . . . . . . . . . . . . . 40 2.12 120-hour track forecast in the experiment ALL (purple), CORE (red) and ENV (blue). The best track is shown in black. The circles represent the forecasted typhoon location every 6 hours. Forecasts after 72-hours are in light colors in each track. . . . . . . . . . . . . . . . . . . . . . . . . . . 42 2.13 (a) Horizontal distribution of collocated GPS-RO retrieved wet profiles for typhoons Chan-hom (red), Nangka (orange), Soudelor (blue) and Goni (green) surround theRMSCbest tracks, (b)Number of collocatedGPS-RO profile variables as a function of the vertical pressure and radial distance from the typhoon center . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 2.14 Temperature and humidity RMSE and bias of 6-hour forecasts validated against the collocated GPS-RO wet profiles for the experiments GSMaP (red) andNoGSMaP (black) for typhoons Chan-hom (1st row) andNangka (2nd row). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.15 Same as Fig. 2.14 except for typhoons Soudelor (1st row) and Goni (2nd row). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.16 Temperature and humidity RMSE and bias of 6-hour forecasts validated against the collocated GPS-RO wet profiles for the experiments GSMaP (red) and NoGSMaP (black) for all four typhoons. . . . . . . . . . . . . . 47 2.17 Mean absolute track error (left) and MSLP error (right) as a function of forecast lead time up to 72-hours for typhoon Soudelor in the experiments NoGSMaP (balck), GSMaP (red), GSMaP30 (pink), GSMaP30_OBS (blue), and GSMaPyr2014_OBS (cyan). . . . . . . . . . . . . . . . . . . 49 xi 2.18 CDFs used for precipitation transformation at the model grid 140.13oE, 15.65oN in the experiments NoGSMaP (balck), GSMaP (red), GSMaP30 (pink), GSMaP30_OBS (blue), and GSMaPyr2014_OBS (cyan). . . . . . 50 3.1 WFs of selected channels from theAMSU-AandCrIS 431-channel subsets that are routinely assimilated in the operational data assimilation systems. Channels in group (a) are AMSU-A channels with Gaussian-shape WFs. Groups (b)-(e) show selected channels from CrIS. Channels in group (b) have narrow and symmetricWFswhile those in group (c) have long-tailed, asymmetric WFs. Group (d) presents two subgroups of surface-sensitive channels but with very different half-widths. Channels in group (e) have more than one WF peaks. . . . . . . . . . . . . . . . . . . . . . . . . . . 57 3.2 WF of one CrIS channel with double WF peaks. The largest WF peak is located at vertical grid 4, while the secondary WF is located at surface (grid 6). We are now trying to update the analysis at grid 5. . . . . . . . . 64 3.3 (a) SLVL, (b) MLVL, and (c) normalized MLVL functions for NOAA-19 AMSU-A channel 1 with vertical localization scales ranging from 0.2 to 0.8. ? = 1 is used in Eq.(3.13b) for the MLVL functions. The WF and the boxcar localization function from the F07 method (threshold: 0.25) are respectively indicated by the black and gray lines. Panels (d)-(f) are the same as (a)-(c) except for channel 7. . . . . . . . . . . . . . . . . . . . . 73 3.4 (a) SLVL, (b-d) MLVL, and (e-g) normalized MLVL functions for the 240th channel (channel 604 with the wavelength of 9.7382 ?m) of the NOAA-20CrIS 431-channel subsetswith vertical localization scales ranging from0.2 to 0.8. ? = 1,2,3 are respectively used in Eq.(3.13b) for theMLVL functions in (b)(e), (c)(f) and (d)(g). The WF and the boxcar localization function from the F07 method (threshold: 0.25) are respectively indicated by the black and gray lines. . . . . . . . . . . . . . . . . . . . . . . . . 75 3.5 Spatial distribution of (a) ?radiosonde-like? observations (blue circle) and 1-dimensional integrated observations (gray plus), and (b) ?radiosonde- like? observations (blue circle) and 3-dimensional integrated observations (gray). The slanted gray lines in (b) show the spatial projection of the 3-D integrated observations while the gray dots indicate the locations of their observation components at 500hPa. . . . . . . . . . . . . . . . . . . . . . 79 3.6 Time-averaged analysis RMSE for the streamfunction (1st row, unit: 106?2/?), zonal wind (2nd row, unit: ?/?) and meridional wind (3rd row, unit: ?/?) in the Southern Hemisphere with the SLVL (blue) and MLVL (red) methods for EXP_1D_Snp. The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. The colored plus signs show which localization method leads to statistically smaller RMSE at the 5% significance level. Amissing plus sign indicates no significant difference between two methods at this localization scale. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 xii 3.7 Time-averaged analysis RMSE for the streamfunction (unit: 106?2/?) in the Southern Hemisphere with the SLVL (blue) and MLVL (red) methods for EXP_1D_Snp (1st row), EXP_1D_Sbp (2nd row), and EXP_1D_Stp (3rd row). The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. The colored plus signs show which localization method leads to statistically smaller RMSE at the 5% significance level. A missing plus sign indicates no significant differences between two methods at this localization scale. . . 85 3.8 Global-averaged analysis RMSE for the streamfunction (unit: 106?2/?) by assimilating nonlocal observations with the optimal SLVL (blue) and MLVL (red) scales, and without the vertical localization (green) for the experiment EXP_1D_Snp (1st row), EXP_1D_Sbp (2nd row), and EXP_1D_Stp (3rd row). The control experimentwithout nonlocal observation assimilation is indicated by the gray line. . . . . . . . . . . . . . . . . . . 87 3.9 Time-averaged analysis RMSE for the streamfunction (1st row, unit: 106?2/?), zonal wind (2nd row, unit: ?/?) and meridional wind (3rd row, unit: ?/?) with the optimal SLVL (blue) and MLVL (red) scales in the Southern and Northern Hemisphere for EXP_1D_Snp (1st column), EXP_1D_Sbp (2nd column), and EXP_1D_Stp (3rd column). . . . . . . 88 3.10 Time-averaged analysis RMSE difference for the (a-c) streamfunction (unit: 106?2/?) and (d-f) zonal wind (unit: ?/?) in EXP_1D_Snp. Panels (g-i) are the same as (a)-(f) except for the experiment EXP_1D_Stp. Negative (positive) difference values mean the MLVL (SLVL) leads to smaller RMSE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 3.11 Time-averaged analysisRMSE for the 500hPa streamfunction (unit: 106?2/?) in the Southern Hemisphere with the SLVL (blue) and MLVL (red) methods for (a) EXP_1D_Snp, (b) EXP_1D_Sbp and (c) EXP_1D_Stp for ensemble sizes of 10, 12 and 16. The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 3.12 Time-averaged analysisRMSE for the 500hPa streamfunction (unit: 106?2/?) in the Globe using the MLVLmethods with the order of weights ? = 1,2,4 (used in Eq. (3.13)) for EXP_1D_Stp. . . . . . . . . . . . . . . . . . . . 93 3.13 Illustration of (a) SLVL and (b) MLVL for localizing the 3-dimensional integrated observations. This nonlocal observation has its components ?1,?2,?3 located at 800hPa, 500hPa, and 200hPa and at three different horizontal grids. Its WF maxima is at ?2. M is the analysis grid to update. 94 3.14 Time-averaged analysisRMSE for the 500hPa streamfunction (unit: 106?2/?) in the Northern Hemisphere with the SLVL (blue) and MLVL (red) methods for EXP_3D_Stp (1st row), EXP_3D_Sbp (2nd row), andEXP_3D_Stp (3rd row). The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. The colored plus signs show which localization method leads to statistically smaller RMSE at the 5% significance level. A missing plus sign indicates no significant difference between two methods at this localization scale. . . . 95 xiii 4.1 The climatological background covariancematrixB (left) and its corresponding correlation matrix C (right). The atmospheric (1-20) and oceanic (21-36) state vectors are separated by the black lines. . . . . . . . . . . . . . . . . 117 4.2 The analysis RMSE of the atmosphere (top) and ocean (bottom) for the strongly coupled (gray), weakly coupled (red), and uncoupled 3D-Var with forcing provided by the climatological mean (blue), or updated forcing from truth every 30 MTU (~3 days, green) and every 10 MTU (~1 day, orange). Results are shown for the last 1,000 MTU for the atmosphere and last 5,000 MTU for the ocean. The WC RMSE for the atmosphere is nearly identical to the UC_1day RMSE. . . . . . . . . . . . . . . . . . . 120 4.3 The analysis RMSE for the atmosphere (top) and ocean (bottom) during the spin-up period for strongly-coupled 3D-Var (red) and weakly-coupled 3D-Var (gray). Note that 25 time steps correspond to ~1 days. For the atmosphere, the RMSE from the uncoupled analysis forced by the ocean climatology (blue), or with the ocean states updated every day (orange) or every 3 days (green) are also shown. . . . . . . . . . . . . . . . . . . . . 121 4.4 The analysis RMSE of the atmosphere (top) and ocean (bottom) for the weakly coupled (red), strongly coupled (gray), and uncoupled ETKF with updated forcing from truth every 30 MTU ( 3 days, green), 10 MTU (~1 day, orange), and 2.5 MTU (~6 hr, purple). A 141-step moving average filter is applied. For the atmosphere, RMSE for the WC and SC cases follows closely with the UC_6hours case but exhibits lower RMSE in the ocean. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 4.5 The analysis RMSE for the atmosphere (top) and ocean (bottom) during the spin-up period for strongly-coupled 3D-Var (blue), weakly-coupled ETKF (gray), and strongly-coupled ETKF (red). Note that 25 time steps correspond to ~1 days. For the atmosphere, all three methods finish spin-up after ~250 time steps( ~10 days.). For the ocean, it takes strongly-coupled 3D-Var ~5.5 years (~5? 104 time steps) to finish spin- up while both strongly-coupled and weakly-coupled ETKF finishes the spin-up within 40 days (~0.1?104 time steps). . . . . . . . . . . . . . . . 125 4.6 The analysis RMSE when using a full-coverage observing system for the atmosphere (top) and ocean (bottom) with SCDA 3D-Var (green), 4D- Var (blue), 4D-Var/3DFGAT CERA (cyan dash), 40-member ETKF (red), and 20-member ETKF with constant inflation of 1.01 (gray). A 61-step running average is applied for both panels. The analysis RMSE for the last 5?104 MTU (~13.7 years) is shown next to the label for each method. For the atmosphere, the RMSEs for 4D-Var and CERA are largely overlapped. 126 4.7 The analysisRMSEwhenobserving only the atmosphere for the atmosphere (top) and ocean (bottom) with strongly coupled 3D-Var (green), 4D-Var (blue), 4D-Var/3D-Var CERA (cyan), and 40-member ETKF (red). A 21-step running average filter is applied to the bottom panel. . . . . . . . 129 xiv 4.8 The analysisRMSEwhenobserving only the atmosphere for the atmosphere (top) and ocean anslysis (bottom) with strongly coupled 4D-Var, but applying different multiplicative inflation parameters to its background error covariance B: 2 (cyan), 5 (blue) and 10 (dark blue). The RMSE from the 40-member ETKF (red) is also shown as a reference. . . . . . . 130 4.9 The analysisRMSEwhenobserving only the atmosphere for the atmosphere (top) and ocean anslysis (bottom) by using strongly coupled 4D-Var with block-independent inflation parameters to its background error covariance B (2 for B??? and 10 for B???, green). The RMSE from the 4D-Var with 2?B (cyan), 5?B (blue), and 40-member ETKF (red) are also shown as references. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 xv List of Abbreviations 3D-Var Three-Dimensional Variational data assimilation 4D-Var Four-Dimensional Variational data assimilation AIRS Advanced Infrared Sounder AMSU-A Advanced Microwave Sounding Unit-A ASCAT Advanced SCATterometer CDF Cumulative Distribution Function CERA Coupled ECMWF ReAnalysis CFSR Climate Forecast System Reanalysis CrIS Cross-track Infrared Sounder DA Data Assimilation EAKF Ensemble Adjustment Kalman Filter ECMWF European Centre for Medium-range Weather Forecast ETKF Ensemble Transform Kalman Filter EnKF Ensemble Kalman Filter EnSRF Ensemble Square Root Filter GFS Global Forecasting System GIGG Gamma, Inverse-Gamma and Gaussian GPM Global Precipitation Mission GPROF The Goddard Profiling Algorithm GPSRO Global Positioning System Radio Occultation GSMaP Global Satellite Mapping of Precipitation IASI Infrared Atmospheric Sounding Interferometer IMERG Integrated Multi-satellitE Retrievals for GPM JMA Japan Meteorological Agency LETKF Local Ensemble Transform Kalman Filter MLVL Multi-Layer Vertical Localization NCEP National Centers for Environmental Prediction NICAM Nonhydrostatic Icosahedral Atmospheric Model NWP Numerical Weather Prediction RMSE Root Mean Square Error RTPP Relaxation To Prior Perturbation RTPS Relaxation to Prior Spread SCDA Strongly-Coupled Data Assimilation SCALE Scalable Computing for Advanced Library and Environment xvi SLVL Single-Layer Vertical Localization TMPA TRMMMulti-satellite Precipitation Analysis UCDA Un-Coupled Data Assimilation WCDA Weakly-Coupled Data Assimilation WF Weighting Function xvii Chapter 1: Introduction 1.1 Introduction An accurate and physically balanced estimate of the current atmosphere state is essential to predict weather since it initializes the numerical weather prediction model that generates future predictions. Data assimilation (DA) provides such atmospheric analysis by optimally merging the model simulations and observations based on Bayesian statistics [Kalnay, 2003]. Among the data assimilation methods, the EnKF [Evensen, 2009, Burgers et al., 1998, Bishop et al., 2001, Hunt et al., 2007, Whitaker and Hamill, 2002, Anderson, 2001, 2003] has gained popularity due to its ease of implementation, and incorporation of the ?errors of the day? [Kalnay, 2003]. While the EnKF can successfully assimilate a wide range of local observations with Gaussian errors, it encounters difficulty handling two types of observations properly: a) observations with non-Gaussian errors such as precipitation, and b) nonlocal (i.e., space integrated) observations. Though challenging, those two types of observations are among the most important observations forweathermonitoring and forecasting, since observationswith non-Gaussian errors (e.g., hydrometeors and precipitation) are usually associated with severe weather and nonlocal observations such as radiance acquired from satellites contribute the most to the improved weather forecast skill for the modern assimilation systems [Derber and Wu, 1 1998]. Given their importance for weather, the core observations and retrievals frommany satellite missions are these two types of observations. One such example is the NASA and JAXA joint satellite mission, the Global Precipitation Measurement (GPM). Since its launch in 2014, GPM has been continuing generating the global precipitation maps. GPM carries two main instruments, a microwave imager GMI that provides radiance observations which are nonlocal observations, and a dual-frequency precipitation radar that measures reflectivity. Observations from these two instruments are then used to retrieve two major GPM products: global surface precipitation and hydrometeor profiles that both have non-Gaussian errors. The following subsections explain why EnKF has difficulty assimilating those two types of observations, and summarize the past attempts to assimilate them within the current data assimilation framework. 1.1.1 Non-Gaussian data assimilation While the EnKF and variational methods seek different posterior estimators, they provide the same optimal analysis when the posterior of the model states follows a Gaussian distribution. This Gaussian posterior is ensured by four conditions: a) the prior follows Gaussian distribution, b) the observation likelihood is Gaussian, c) the observation operator is linear, and d) the forward model is linear. Non-Gaussianity of the posterior probability occurs when any of these conditions breaks, resulting in a suboptimal analysis from the EnKF and variational methods. In practice, suboptimality arising from the nonlinearity of the forward model or the observation operator can be alleviated with iterative methods [Courtier et al., 1994, Kalnay et al., 2007a, Yang et al., 2012, Sakov 2 et al., 2012]. However, the prerequisite of Gaussian prior and observation likelihood prevents both EnKF and variational methods from effective assimilation of nonnegative observations such as hydrometeors, which are often associated with severe weather. Differentmethods have been proposed to relax the assumptions ofGaussian prior and likelihood for both variational [Cohn, 1997, Fletcher and Zupanski, 2006, Fletcher, 2010] and EnKF systems [Anderson, 2010, 2019, 2020, Lien et al., 2013, 2016a,b, Bishop, 2016]. More advanced DA methods such as Markov chain Monte Carlo [Posselt et al., 2008, Posselt and Vukicevic, 2010] or particle filters [Poterjoy, 2016, Poterjoy and Anderson, 2016] completely remove these two assumptions if given more computing resources. Within the EnKF framework, Anderson [2010, 2019, 2020] replaced the Gaussian prior in observation space with empirical probability mass functions constructed from the background ensembles, and then calculated the posterior ensembles based on Bayesian statistics. The posterior model states are then updated through linear regression or other nonlinear methods. Lien et al. [2013, 2016a,b] transformed both modeled and observed precipitation into Gaussian variables so that their error (i.e., observation likelihood) is more Gaussian. Bishop [2016] derived ensemble solutions if the prior and observation likelihood follows either Gamma or inverse Gamma distribution, which are often used to describe the errors associated with hydrometeors and aerosols. Among all these methods, the Gaussian transformation approach have been applied to precipitation assimilation with real-observation experiments, comprehensively tested with different surface precipitation retrievals and global NWPmodels [Lien et al., 2016a,b, Kotsuki et al., 2017]. 3 1.1.2 Precipitation assimilation with Gaussian transformation Effective precipitation assimilation is challenging [Bauer et al., 2011, Lien et al., 2013] because: a) precipitation is a nonnegative variable with non-Gaussian errors that violates the basic assumption of current data assimilation methods, and b) forcing the model to rain where observed by moistening or drying the atmosphere, as done in moisture nudging, does not change the main dynamical variable (i.e., potential vorticity). As a consequence, the positive impact of precipitation assimilation is immediately forgotten as soon as the forcing of the "assimilation of moisture" ceases. Lien et al. [2013, 2016a,b] and later Kotsuki et al. [2017] developed a methodology that successfully addressed these problems: They transformed observed and modeled precipitation into variables with more Gaussian-like errors (anamorphosis). They also used the ensemble-based assimilation approach that allows precipitation to directly influence the model dynamical variables (i.e., potential vorticity) since the ensemble members that precipitate closely as observed receive higher weights in the ensemble analysis, and therefore "donate" their more correct dynamics to the analysis mean, thus creating an analysis mean that is closer to having the right dynamics [Kalnay, 2013]. As a result, both Lien et al. [2016a,b] and Kotsuki et al. [2017] showed that assimilating additional remotely sensed precipitation (NASA TMPA, and JAXA GSMaP) improves the global forecast skill for up to five days. Considering the successful assimilation of precipitation retrievals with Gaussian transformation in the global models [Lien et al., 2016a,b, Kotsuki et al., 2017], one following question is: can we extend this precipitation assimilation framework into the applications of regional models and improve the tropical cyclone predictions? 4 A recent study by Emanuel and Zhang [2017] showed that the specification of the inner-core moisture for the initial condition is as important as wind speed for reducing tropical cyclone intensity forecast error. Meanwhile, it is a common practice for the precipitation retrieval community to infer surface precipitation rates based on column hydrometeor distributions such as cloud liquidwater path and icewater path, which implies the direct linkage between precipitation and hydrometeors, which are all closely related to moisture. These inspire us to extend our precipitation assimilation framework to tropical cyclone predictions since assimilating precipitation has the potential to simultaneously improve the potential vorticity and moisture within the tropical cyclone, thus improving its forecasts. 1.1.3 Vertical localization for nonlocal observations Assimilating nonlocal observations with the EnKF is another challenging problem. Due to sampling errors caused by the limited ensemble size, the error covariance needs to be localized in the EnKF systems. Unlike conventional observations that have single well- defined locations when observed, nonlocal observations (e.g., total precipitable water, radiances, GPS RO excess phase, altimeter observations) are intrinsically integrated measurements in space, and thus have no singlewell-defined observation location [Campbell et al., 2010]. An empirical single-layer vertical localization (SLVL) approach, which is widely used to handle these nonlocal observations, is to vertically localize them at their weighting function (WF) peaks with symmetric Gaussian-shape localization functions [Houtekamer and Mitchell, 2005]. The SLVL approach is under the assumption that the 5 most contribution of the observed radiance comes from the atmospheric layer where the WF peak resides, so the SLVL approach assigns the WF peak as the vertical location for radiance observations. This simple empirical SLVL approach with the Gaussian-shaped localization function is shown towork effectively for radiance observationswith symmetric Gaussian-shaped WFs. For example, the multi-year NICAM ensemble reanalysis system [Terasaki et al., 2015, Terasaki and Miyoshi, 2017, Terasaki et al., 2019] adopts the SLVL approach to assimilate AMSU-A clear-sky radiance. Their analysis quality in the SouthernHemisphere has been significantly improved thanks to the additional assimilation of AMSU-A observations [Terasaki and Miyoshi, 2017, Terasaki et al., 2019]. However, not all radiance observations have narrow Gaussian-shaped WFs like AMSU-A. Aswewill show in Chapter 3, some radiance observations can have asymmetric WFs or show more than one WF peaks, such as clear-sky radiance from sounding or trace-gas sensitive channels from hyperspectral infrared sensors [Christophersen, 2015]. The SLVL approach might encounter difficulties localizing these observations. A new localization method is needed to better localize these radiance observations. 1.2 Objectives In this dissertation we focus on on three topics that could potentially enhance the assimilation of satellite observations or its derived products under the GPM satellite mission. In the first two parts, we try to develop different assimilation strategies to assimilate precipitation with non-Gaussian errors (Chapter 2), and nonlocal observations (Chapter 3) in the EnKF. In the last chapter, we switch our focus to coupled data 6 assimilation in preparation of assimilating GPM precipitation into other earth components other than atmosphere. In Chapter 4, we developed a coupled data assimilation testbed for a coupled quasi-geostrophic model, which allows systematic comparison of different coupled data assimilation strategies for both variational and ensemble methods. Here we summarize the objectives of each topic separately. ? Assimilation of surface precipitation with non-Gaussian errors: ? extend the precipitation assimilation framework with Gaussian transformation [Lien et al., 2013, 2016a,b] to regional models by implementing precipitation assimilation module into the regional ensemble assimilation system SCALE- LETKF; ? explore the benefits of assimilating GPM surface precipitation with Gaussian transformation on typhoon forecasts. ? Localization of nonlocal observations in the EnKF: ? develop a localization method that can better localize nonlocal observations with complex WFs such as those from hyperspectral infrared sounders; ? compare the performance between this new localization method and the SLVL method in simulated experiments. ? Comparison of coupled data assimilation strategies for a quasi-coupled geostrophic model: ? implement a coupled data assimilation testbed that includes both variational and ensemble methods under different coupled data assimilation strategies; 7 ? investigate how variational methods respond to different data assimilation strategies; ? investigate howensemblemethods respond to different data assimilation strategies; ? compare the performance of different strongly-coupled data assimilation systems; 1.3 Outline This dissertation is organized as follows. In Chapter 2, we extended the precipitation assimilation framework with Gaussian Transformation to a regional ensemble assimilation model, and used it to investigate the impact of assimilating satellite-derived precipitation on typhoon forecast. In Chapter 3, we developed a multi-layer observation localization method for localize nonlocal observations in theEnKF. Idealized experiments are conducted to compare its performance with the traditional SLVL method when assimilating nonlocal observations. In Chapter 4, we developed a coupled data assimilation testbed for a coupled quasi-geostrophic model to investigate the behaviors of variational and ensemble methods under different coupled data assimilation strategies. Finally, Chapter 5 gives the the summary and future directions. 8 Chapter 2: Improved Tropical Cyclone Predictions by Assimilation of Satellite-retrievedSurface PrecipitationwithGaussianTransformation 2.1 Introduction The NASA and JAXA joint satellite mission, the Global Precipitation Measurement [GPM, Hou et al., 2014], has continuously monitored global rain and snowfall since its launch in 2014. Several new-generation surface precipitation products that fuses GPM observations [Huffman et al., 2015, Kubota et al., 2007, Aonashi et al., 2009] have been developed at NASA and JAXA that provide better rainfall estimates with higher temporal and spatial resolution than their precursors [Gebregiorgis et al., 2018]. Those surface precipitation products are utilized in various applications, such as flash flood warnings [Tang et al., 2017] and hydrologic modeling [Yong et al., 2010]. Among those applications of surface precipitation products, one crucial application for atmosphere modeling is to assimilate surface precipitation into the numerical weather models to improve the model analysis and forecasts. Many attempts have been made to assimilate precipitation [Aonashi, 1993, Tsuyuki, 1996, 1997,Hou et al., 2000,Davolio and Buzzi, 2004, Hou et al., 2004, Koizumi et al., 2005, Lin et al., 2007, Lopez, 2011, 2013]. [Bauer et al., 2011] summarized the major obstacles that prevents effective assimilation 9 of precipitation. Early assimilation attempts mainly used precipitation observations to adjust the model moisture field [Aonashi, 1993, Falkovich et al., 2000, Davolio and Buzzi, 2004]. This moisture nudging approach that forced the model to rain where observed by moistening or drying the atmosphere failed to change the model dynamical variable (i.e., potential vorticity), which is responsible for long-term model evolution [Kalnay, 2013]. Consequently, the impact of precipitation assimilation is immediately forgotten as soon as the forcing of the "assimilation of moisture" ceases. Later studies tried to assimilate precipitation observations directly with the 4D-variational method [Zou et al., 1993, ?upanski and Mesinger, 1995, Lopez, 2011, 2013]. However, the model moist process related to precipitation could be strongly nonlinear [Errico et al., 2001] and even include discontinuous switch on/off processes [Zou et al., 1993, ?upanski and Mesinger, 1995], leaving obstacles for developing its tangent linear and adjoint code as required by 4D-Var. The originally sophisticated moist process had to be simplified when constructing the tangent linear and adjoint models [Mahfouf, 2005, Lopez, 2007]. The Ensemble Kalman filter (EnKF) method [Evensen, 2009] waived the need of linearizing the moist process of the model [Miyoshi and Aranami, 2006]. However, it could not handle the non-Gaussian errors of precipitation observations [Errico et al., 2000, Koizumi et al., 2005, Lopez, 2011, 2013], since Gaussian observation error is a fundamental assumption required for both ensemble and variational methods. Lien et al. [2013, hereafter LMK13] proposed to assimilate precipitation with Gaussian transformation in the Local Ensemble Transform Kalman Filter [LETKF, Hunt et al., 2007] to solve the difficulty related to precipitation assimilation. LMK13 first transformed observed and modeled precipitation into Gaussian variables through the 10 Gaussian anamorphosis procedure [Bocquet et al., 2010], making the transformedprecipitation errormoreGaussian. They also used the ensemble-based assimilation approach that avoids the explicit linearization of the moist process. Through the flow-dependent ensemble error covariance by EnKF, the precipitation observations are allowed to directly influence the model dynamic variables (i.e., potential vorticity), so the ensemble members that precipitate closely as observed receive higher weights, and "donate" their more correct dynamics to the analysis mean, thus creating an analysis mean that is closer to having the right dynamics [Kalnay, 2013]. Through observing system simulation experiments with an intermediate atmosphere general circulation model, LMK13 showed that precipitation assimilation with Gaussian transformation improves the analysis and 5-day forecasts for all model variables, especially when the precipitation observations have large errors. Lien et al. [2016a,b, hereafter LKMH16a, and LMK16b, respectively] further tested the Gaussian transformation approach with real-observation experiments by assimilating the National Aeronautics and Space Administration (NASA) Tropical Rainfall Measuring Mision Multi-satellite Precipitation Analysis [TMPA, Huffman et al., 2007, 2010] into the low-resolution global model NCEP Global Forecasting System (GFS) with the LETKF. LKMH16a compared the prior statistics of converted precipitation through Gaussian transformation and the one through the lognormal transformation [Cohn, 1997], widely used for assimilating precipitation in previous studies. They showed that Gaussian transformation leads to a higher correlation between transformedTMPAandGFSprecipitation. Besides, the correlation of 6-hour accumulated precipitation is higher than instantaneous precipitation. LMK16b assimilated the 6-hour accumulated TMPA precipitation into the GFS and showed that the analysis and 5-day model forecasts are improved for all variables 11 compared with the control experiment that only assimilates the rawinsonde observations. Since the launch of the GPM in 2014 [Hou et al., 2014], several new satellite- retrieved global precipitation products are available from NASA and Japan Aerospace eXploration Agency (JAXA) that are of higher temporal and spatial resolution [Huffman et al., 2015, Kubota et al., 2007, Aonashi et al., 2009]. Kotsuki et al. [2017, hereafter K17] extended the precipitation assimilation framework with Gaussian transformation to the state-of-the-art global forecast model Nonhydrostatic Icosahedral Atmospheric Model [NICAM, Tomita and Satoh, 2004, Satoh et al., 2014, 2008] and the GPM- based precipitation products Global Satellite Mapping of Precipitation (GSMaP) [Kubota et al., 2007, Aonashi et al., 2009]. K17 confirmed the positive impacts of precipitation assimilation on the model analysis and 5-day forecasts reported by LMK16b, and found that precipitation assimilation also improves the precipitation forecast up to 5 days. The Gaussian transformation process requires constructing empirical precipitation Cumulative Distribution Functions (CDFs). K17 also developed an online CDF construction method based on the previous N-day precipitation samples that waived the need to archive the model past long-term precipitation history as in LMK16b, making the Gaussian transformation approach more applicable to operational forecasts. In this study, we extend the precipitation assimilation framework with Gaussian transformation developed by LKM13, LMK16b and K17 to regional models and focus on the impact of precipitation assimilation on typhoon forecasts. We conduct experiments for four typhoon cases in 2015 by assimilating the satellite-derived GSMaP surface precipitation [Kubota et al., 2007, Aonashi et al., 2009] into the regional model SCALE- RMincluded in the ScalableComputing forAdvancedLibrary andEnvironment [Nishizawa 12 et al., 2015, Sato et al., 2015]. The impact of precipitation assimilation is then evaluated by examining the analysis and forecast adjustment on different model variables, the typhoon track and intensity error, and verification against independent observations. Besides these, data denial experiments are conducted to help to understand which portion of precipitation contributes the most to the improved typhoon forecast. We also propose two new approaches to construct the empirical cumulative distribution functions used by Gaussian transformation for typhoon forecast and assess their impact on typhoon forecast errors. This chapter is organized as follows. Section 2.2 gives an overview of four typhoon cases examined in this study. Section 2.3 introduces themodel configuration, observations, theGaussianTransformation process for precipitation assimilation, and experiment design. Section 4.4 presents the results. Section 2.5 gives the summary. 2.2 Overview of four typhoon cases In this study, four typhoons (Chan-hom, Nangka, Soudelor and Goni) over the Western Pacific from June to August in 2015 are studied to evaluate the impact of precipitation assimilation. Figure 2.1 shows their best tracks issued by the Regional Specialized Meteorological Center (RSMC) Tokyo-Typhoon Center. The lifetime of all four typhoons varies from 14 days (Soudelor) to 17 days (Goni). We then describe each case in more details. 13 Figure 2.1: Best track of typhoons Chan-hom, Nangka, Soudelor and Goni over Western pacific in 2015. The white area shows the model domain adopted in this study. 2.2.1 Typhoon Chan-hom After formation near the Marshall Islands at 0600UTC on June 29, Chan-hom evolved to a tropical storm at 1200UTC on 30 June, and it was further intensified to a typhoon at 0000UTC on 7 July. At 1800UTC on 9 July, Chan-hom reached its peak intensity near the Miyakojima Island with its central pressure of 935 hPa. It then turned northeastward over the Eastern China Sea, and made landfall in Zhoushan, China around 0840UTCon 11 July. At 0000UTCon 13 July, Chan-hom transformed into an extratropical cyclone in the Korean Peninsula. 14 2.2.2 Typhoon Nangka Nankga formed as a tropical depression at 1800UTC on 2 July and became a tropical storm at 1800 UTC on 3 July. It was further intensified to a typhoon at 1200 UTC on 6 July near Pohnpei Island, with its maximum intensity with the center pressure of 925hPa and maximum wind of 100kt. On 8-9 July, Nangka and Chan-hom coexisted over the western Pacific. Nangka made its first landfall in Muroto City, Kouchi Prefecture in Japan at 1400 UTC on 16 July, and its second landfall in Kurashiki City, Okayama Prefecture around 21 UTC on 16 July. During its landfall in Japan, Nangka brought intense precipitation which broke local rainfall records over many cities. 2.2.3 Typhoon Soudelor Among these four typhoons in this study, Soudelor is the strongest typhoon over western North Pacific in 2015 with its central pressure as low as 900 hPa. Formed as a tropical depression near the Marshall Islands at 1800 UTC on 29 July, Soudelor kept moving westward until it was upgraded to a typhoon at 0600 UTC on 2 August. During the period between 1800UTC on August 1 to 1800 UTC on August 3, Soudelor experienced rapid intensification with its center pressure deepening by 90hPa in 2 days. This rapid intensification process is examined by many studies such as Honda et al. [2018]. 2.2.4 Typhoon Goni Goni spanned its life between 18 UTC on 13 August and 12UTC on 30 August. Originated near the Marina Island, Goni evolved to a tropical storm at 18UTC on 14 15 August, and further get intensified to be a typhoon at 12UTC on 16 August. Goni then reached its peak intensity with a central pressure of 935hPa at 06 UTC on 17 August and kept moving northwestward. On August 21, Goni turned its direction sharply, moving northwards. It made landfall on Arao City in Japan after 21UTC on 25 August. After that, it soon transformed into an extratropical cyclone. 2.3 Method 2.3.1 Model configuration The regional model included in the Scalable Computing for Advanced Library and Environment [SCALE Nishizawa et al., 2015, Sato et al., 2015], SCALE-RM version 5.2, is used in this study. It is configuredwith one 36-km-mesh domain over theWestern Pacific (Figure 2.1) with 240 ? 180 horizontal grids, and 36 vertical levels with its model top at 27.7km. We adopted this relatively lowhorizontal resolution of 36 kmafter considering the experiment length, the ensemble size deployed in this study, the available computational resources, and the related queue time for the experiments. Besides, the following model physics schemes are adopted in this study: the 6-class (water vapor, cloud water, cloud ice, rain, snow, graupel) single-moment TOMITA08 microphysics scheme [Tomita, 2008], the Model Simulation radiation TraNsfer code (MSTRN)-X based shortwave and longwave radiation scheme [Sekiguchi and Nakajima, 2008], the level 2.5 closure of the Mellor- Yamada-Nakanishi-Niino (MYNN) turbulence scheme [Nakanishi and Niino, 2004], and the Kain-Fritsch cumulus parameterization [Kain and Fritsch, 1990]. The lateral boundary conditions are taken from the 0.5? National Centers for Environmental Prediction (NCEP) 16 Global Forecast System (GFS) analysis every 6 hours. We use the ensemble data assimilation system SCALE-LETKF [Lien et al., 2017] that coupled the Local Ensemble Transform Kalman Filter [LETKF, Hunt et al., 2007] and the SCALE-RM. The analysis variables by SCALE-LETKF are pressure, temperature, zonal, meridional, and vertical winds, and the mixing ratios of water, vapor, cloud water, cloud ice, rain, snow, and graupel. For all the experiments, the ensemble size is fixed at 100. Following Lien et al. [2017] and Honda et al. [2018], the initial ensemble members are drawn randomly from the 0.5? NCEP GFS analysis during the same season of our experiments but in 2014. Since all members share the same the boundary condition, we followed Lien et al. [2017] and Honda et al. [2018], applying simultaneously the multiplicative inflation to the background error covariance with a factor of 1.25, and the relaxation to prior perturbation [RTPP, Zhang et al., 2004] with a factor of 0.8 to maintain the sufficiently large ensemble spread. 2.3.2 Observations The experiments assimilate two types of observations: the NCEP PREPBUFR observations and aggregated Level-3 GSMaP 6-hour accumulative surface precipitation. The NCEP PREPBUFR observations include conventional observations such as those from radiosondes, surface stations, ships, and aircrafts, and various remote-sensed wind observations (e.g., the atmosphere motion vectors derived from geostationary satellites and surface winds retrieved from scatterometers)[Miyoshi and Kunii, 2012]. LKM16 and K17 evaluated the benefits of precipitation assimilation on global model forecasts, and 17 they did not assimilate the remote-sensed wind observation subsets from PREPBUFR in their control experiments except in the spin-up period. Past typhoon studies by Wu et al. [2014, 2015] showed the importance of atmosphere motion vectors in constraining the atmosphere, especially over open oceans where the conventional observations are sparse. Due to this reason, we assimilate remote-sensed wind observations from PREPBUFR in all the experiments. Assimilation of these additional remote-sensed winds is expected to improve the analysis and forecast of the control experiment, and the impact of precipitation assimilation inferred over this control experiment might become smaller than the one inferred over the control experiment without wind assimilation. Nevertheless, it is crucial to understand the added value of precipitation assimilation on top of a more complete observation network. The localization scales of PREPBUFR observations follow Lien et al. [2017] and Honda et al. [2018], with 400km in horizontal and 0.3 in vertical. The Level-3 gridded JAXA GSMaP surface precipitation retrieved from the GPM observations [Kubota et al., 2007, Aonashi et al., 2009] are assimilated in the experiments to assess the impact of precipitation assimilation on typhoon forecasts. Among different versions of GSMaP precipitation, the rain-gauge calibrated GSMaP_Gauge product is adopted in this study since K17 showed that GSMaP_Gauge improved the model analysis the most. Besides, 6-hour accumulated precipitation is assimilated to reduce the timing error of precipitation. The original GSMaP precipitation products provide hourly surface precipitation with the resolution of 0.1? between 60?S and 60?N. We first map the hourly GSMaP precipitation into the SCALE model grid and then aggregate the hourly precipitation to create 6-hour accumulated precipitation ???????6 with the following?? 18 equation that is similar to equation (8) in LKMH16a, ??2 ??????? 1 = ??????? ????? 1 ????? 6?? 2 ?3 + ???? + ??2 3 (2.1) ??=?2 where ???????? is the hourly GSMaP precipitation over the SCALE grid ?? hours before? (i.e., ?? < 0) or after (i.e., ?? > 0) the analysis time. The original hourly GSMaP precipitation products also include quality control flags for each grid. If any hourly precipitation is marked with a bag quality flag, then the 6-hour precipitation at this model grid is abandoned. The SCALE-RMwrites out the 1-hour accumulated precipitation with hourly output time interval, so the model simulated 6-hour precipitation ???????6 is calculated through?? ??3 ??????? = ???????6 ?? (2.2)?? ??=?2 where ???????? is 1-hour accumulated precipitation at the output time step that is ?? hours? away from the analysis time. Following LMK16 and K17, we localize the precipitation observations at 850hPa with a vertical localization scale of 0.3. The horizontal localization scale is set as 250km, the same as K17. Precipitation is assimilated only when 75%members from the ensemble precipitate. Next section discusses how to assimilate 6-hour accumulated precipitation with Gaussian transformation [Lien et al., 2013, 2016a,b, Kotsuki et al., 2017]. 19 2.3.3 Precipitation assimilation with Gaussian transformation LKM13 and LKMH16a introduced the Gaussian transformation to alleviate the non-Gaussianity problem associated with the precipitation observations in the EnKF. The Gaussian transformation process can be summarized as a 2-step procedure. In the first preparation step, the past modeled and observed precipitation samples are collected around everymodel grid to create two separate empirical cumulative distribution functions (CDFs) ??, and ??, where subscripts ? and ? represent CDFs constructed from observed and modeled precipitation. Those CDFs ??, and ?? map input observed and modeled precipitation to their corresponding cumulative probability. At the assimilation step, we calculate the transformed precipitation ?? and ?? through ?? = ??1 [? (???????? 6 )], ? ? = ??1 [? (???????? 6 )] (2.3)?? ?? where ???????6 and ?? ????? 6 are 6-hour accumulated SCALE and GSMaP precipitation,?? ?? ? represents the CDF of the standard Gaussian distribution (? ? ? (0,1)) and ??1 is the inverse process of ?. The assimilation system then assimilates the transformed precipitation instead of the original 6-hour accumulated precipitation. Using ?2 test, LKMH16b showed that the Gaussian transformation process transformed precipitation into Gaussian variables, of which errors also become more Gaussian. K17 showed that observation innovation after Gaussian transformation (?? ? ??) is closer to Gaussian distribution than the one without transformation. The precipitation assimilation module with Gaussian transformation by (2.3) is implemented into the SCALE-LETKF. 20 TheGaussian transformation process requires the construction of empirical CDFs for modeled and observed precipitation. In LMK16b, where they assimilated the precipitation into the GFS-LETKF system, they constructed the model precipitation CDFs by collecting 6-hour accumulated precipitation forecast initialized from the archived 10-year NCEP CFSR reanalysis, which shares the same atmosphere model as GFS. However, this model CDF construction approach is not applicable to newly developed models when long- term reanalysis is not available. K17 proposed an online method to construct the model precipitation CDF that removes the need to archive modeled precipitation in the past: At each analysis cycle, they collected the past 30-day modeled precipitation within a radius of 250km around each model grid to create the new empirical CDF for model precipitation transformation. While the K17 approach removed the need to archive the past model precipitation, running the model over 30 days to collect model precipitation samples before the start of the formal experiment still requires substantial computing resources, especially if we consider the actual experiment length for each typhoon (14 to 17 days). In this study, an alternative approach is proposed to construct the precipitation CDFs for typhoon forecast. 5-day modeled and observed precipitation samples within 2 grids around each model grid are pulled together to create a shared empirical CDF that transforms both modeled and observed precipitation in the data assimilation step (i.e., ??=??=??,?). One short-period experiment showed that precipitation assimilation with the merged CDFs still leads to improved analysis and forecast compared with the one with separate modeled and observed CDFs, so we adopted this CDF construction approach in our experiments. One concern about this approach is that 5 days is relatively short. In section 2.4.5, we conducted sensitivity experiments to examine how typhoon forecast 21 errors vary with the length of the sampling period. Besides, we will show that using the CDFs solely constructed from observed precipitation samples still leads to effective precipitation, improving the model analysis and forecast. Like LMK16b and K17, the observation error of transformed precipitation is set as 0.5 in the SCALE-LETKF. 2.3.4 Experiment setup To evaluate the benefits of GSMaP precipitation on typhoon forecast, we compare results from two experiments: the experimentNoGSMaP that only assimilates PREPBUFR observations, and the experiment GSMaP which simultaneously assimilates PREPBUFR and accumulated 6-hour GSMaP precipitation. As the baseline, the experimentNoGSMaP is 6-hour cycled run from 0000UTC on 19 June to 1800UTC on 24 August in 2015 (about 2 months), covering the main life span of four typhoons (i.e., Chan-hom, Nangka, Soudelor, and Goni). Unlike NoGSMaP, the experiment GSMaP started the case study of each typhoon 4 days earlier before the JMA best track data identified it as a tropical storm. Each case study then covers the main life span of the typhoon to assess the impact of precipitation assimilation adequately. The reason that NoGSMaP does not conduct the same 2-month cycled run is that there is about an 11-day gap between the simulation of typhoon Nangka and Chan-hom. The approach we adopted here saves the computational cost. The initial ensembles for each typhoon study in GSMaP are from the NoGSMaP analysis at the same cycle, so any improvements seen in GSMaP are due to additional precipitation assimilation. The empirical CDFs used in each typhoon study are constructed by collecting the previous 5-day modeled and observed precipitation from 22 the NoGSMaP experiment before the case study starts. Table 2.1 list more details about case study settings in the experiment GSMaP. Besides the two main experiments GSMaP and NoGSMaP, sensitivity experiments are conducted for typhoon Soudelor to understand which portion of precipitation contributes most to the improved typhoon forecast (Section 2.4.3), and how the forecast error varies with the way to construct the precipitation CDFs (Section 2.4.5). Table 2.1: Settings for each typhoon case study in the experiment NoGSMaP Case Name Date when JMA issued TS Experiment periodof this case study CDF construction period 1200UTC on 26 June ? Chan-hom 1200UTC on 30 June 1200UTC on 13 July 0600UTC on 21 June ? (?17 days) 0600UTC on 26 June 1800UTC on 29 June ? Nangka 1800UTC on 3 July 1800UTC on 17 July 1200UTC on 24 June ? (?18 days) 1200UTC on 29 June 0600UTC on 28 July ? Soudelor 0600UTC on 1 August 1200UTC on 12 August 0000UTC on 23 July ? (?15 days) 0000UTC on 28 July 1800UTC on 10 August ? Goni 1800UTC on 14 August 1800UTC on 24 August 1200UTC on 5 August ? (?14 days) 1200UTC on 10 August 2.4 Results 2.4.1 Impact of precipitation assimilation on analysis and 6-hour forecast 2.4.1.1 Impact on mean sea level pressure Different model variables are compared from experiments GSMaP and NoGSMaP to understand the impact of surface precipitation assimilation. Figure 2.2(a)(b) compares the mean sea level pressure (MSLP) at 00UTC on July 9, when three typhoons Linfa 23 (at 117?E), Chan-hom (at 130?E), and Nangka (at 146?E) coexisted in the model domain. Both NoGSMaP and GSMaP show relatively accurate positions of Linfa and Chan-hom. For typhoon Chan-hom, the MSLP analysis (965.2hPa) from GSMaP agrees more favorably with the best track data (965hPa) than NoGSMaP (968.6hPa). Both experiments underestimate the intensity of typhoon Nangka (at 146?E), while GSMaP generates an analysis (983.hPa) closer to the best track (930hPa) than the NoGSMaP (1001.3hPa). It is clear in Figure 2.2(a)(b) typhoon Nangka in NoGSMaP does not have a structure as organized as the one in the GSMaP. Figure 2.2(c)(d) shows the MSLP analysis about two days later at 06UTC on July 11. Without precipitation assimilation, typhoon Nangka in NoGSMaP failed to get intensified quickly, showing a much weaker intensity when compared with the one in GSMaP. 2.4.1.2 Impact on surface wind Surface wind analysis from two experiments are then compared with the 25km surface wind retrievals from the ASCAT onboard Metop-B in Figure 2.3. The ASCAT onboard Metop-B had two passes over typhoon Nangka at approximately 00UTC on July 9, and 12UTC on July 11. The surface wind analysis for typhoon Nangka in the NoGSMaP (Figure 2.3(a)) fails to show organized wind structures, with its maximumwind magnitude (less than 12m/s) far smaller than the ASCAT retrievals (Figure 2.3(c)). By contrast, the wind structure is better represented in the GSMaP (Figure 2.3(b)), with its maximumwind closer to the ASCAT retrievals. One benefits of direct precipitation assimilation using the EnKF over the moisture nudging approach is that EnKF allows direct adjustment of 24 Figure 2.2: MSLP analysis at 00UTC on July 9 (1st row) and 06UTC on July 11 (2nd row) for the experiment NoGSMaP (1st column) and GSMaP (2nd column). The location of typhoons center from the RMSC best track is marked by the red plus sign. 25 dynamically important variables (i.e., wind field) through the ensemble error covariance, so the improvement brought by precipitation assimilation can be ?remembered?. To see this, Figure 2.3 compares the 6-hour forecast of surface wind validated at 12UTC on July 11. Those forecasts are initialized from the ensemble mean analysis at 06UTC on July 11. It is clear that the structure and magnitude of surface wind are better represented by the GSMaP forecast. However, the ASCAT wind shows the maximum wind on the northern side, while both NoGSMaP and GSMaP incorrectly place the maximum wind in the northeastern quadrant. Generally, GSMaP with precipitation assimilation shows better surface wind analysis and forecast than NoGSMaP. Figure 2.3: 10-m wind analysis at 00UTC on July 9 from the experiment (a) NoGSMaP, (b) GSMaP, and the wind retrievals from the ASCAT onboard Metop-B, and the 6-hour forecast validated at 12UTC on July 11 from the experiment (d) NoGSMaP, (b) GSMaP, and the wind retrievals from the ASCAT onboard Metop-B. The thin black box shows the swath covered by ASCAT. 26 2.4.1.3 Impact on precipitation Assimilating surface precipitation also improves the simulated precipitation. The comparison of hourly precipitation from different experiments at 00UTC on July 9 are displayed in Figure 2.4(a)(c)(e). Note that the surface precipitation is not analyzed by the SCALE-LETKF system, so the hourly precipitation validated at 00UTC on July 9 are the same as the background ensembled mean at 00UTC on July 9, which is not altered by precipitation assimilation at 00UTC on July 10. At 00UTC on July 9, the precipitation associated with typhoon Chan-hom (at 130?oE) is better represented in the GSMaP than in the NoGSMaP considering the precipitation coverage and the simulation of its rainband on the eastern side. Similar improvement due to precipitation assimilation is also seen in the case at 06UTC on July 11. Besides, NoGSMaP failed to show an adequate precipitation coverage of typhoon Nangka and its precipitation intensity is too weak when compared with observations, while the coverage and intensity of precipitation of Nangka agrees much favorably with observations in the experiment GSMaP. The following 6-hour forecast of hourly precipitation initialized from the analysis ensemble mean in Figure 2.4 is shown in Figure 2.5. Compared with observation, the precipitation coverage by typhoon Chan-hom (at 130?E) in the NoGSMaP is smaller. Besides, the orientation of the Chan-hom related precipitation coverage in the NoGSMaP is different from observation, with its maximum precipitation incorrectly placed in the eastern quadrant. NoGSMaP also underestimates severely the precipitation associated with typhoon Nangka (at 145?E). In contrast to NoGSMaP, the coverage and orientation of precipitation brought by Chan-hom in the experiment GSMaP is very similar to the 27 Figure 2.4: The hourly precipitation at 00UTC on July 9 from the experiment (a) NoGSMaP, (b) GSMaP, and (e) the satellite-retrieved surface precipitation. Panels (b)(d)(f) are similar to panels (a)(c)(e) except they are for for the analysis at 06UC on July 11. 28 observation, with their maximum precipitation located at 127?E, 24?N. Besides, the precipitation intensity of typhoon Nangka in the GSMaP is also closer to the observation, though its rainband is incorrectly placed in the northern quadrant. 2.4.1.4 Impact on moisture and hydrometeors Comparison of MSLP, surface wind and precipitation against different observations all reveals that assimilation of GSMaP surface precipitation contributes to a stronger typhoon Nangka, which is closer to the observations. Emanuel and Zhang [2017] showed that the the intensity error of tropical cyclone is strongly associated with the moisture uncertainty in the inner core. They indicated that improving moisture in the inner core of tropical cyclone could lead to better tropical cyclone forecast. A recent study by [Honda et al., 2018] confirmed this finding by showing that the stronger MSLP is associated with more abundant precipitable water, which is eseentially the total column water vapor, for Typoon Soudelor during its intensification. To see if this is the case for typhoon Nangka, we compare the precipitable water from two experiments against the Goddard PROFiling algorithm (GPROF) profiles from the Global Measurement Mission (GPM) Microwave Imager (GMI). GPROF profiles not only collocates the total column water vapor of ERA- Interim analysis on each footprint location from GMI observations, but also include retrievals of different hydrometeors. Past studies by Wu et al. [2016], Wu and Zupanski [2017] showed that the assimilation of GROF hydrometeors improves the tropical cyclone analysis, and leads to some improvements of hurricane intensity forecasts. These merits of GPROF makes it an ideal reference to verify the water vapor and hydrometeor fields 29 Figure 2.5: The hourly precipitation validated at 06UTC on July 9 from the experiment (a) NoGSMaP, (b) GSMaP, and (e) the satellite-retrieved surface precipitation. Panels (b)(d)(f) are similar to panels (a)(c)(e) except they are for for the hourly precipitation forecast validated at 12UTC on July 11. Both forecasts are initialized 6 hours ago from the ensemble analysis mean. 30 from model outputs. To enable comparison, we calculate simulated precipitable water PW by integrating the total-column water vapor, and calculate liquid water path LWP and solid water path SWP by using similar equations as Wu and Zupanski [2017] ? ??????? = ? ?? (?)?(?)?? (2.4a) ???????? = ? (?? (?) + ?? (?))?(?)?? (2.4b) ???????? = (?? (?) + ?? (?) + ??) (?)?(?)?? (2.4c) where ?(?) is air density and the ??, ??, ?? , ??, ??, ?? are the mixing ratio of water vapor, cloud water, rain, ice, snow, and graupel at different vertical levels. For GPROF, those quantities (Wu et al., 2016) are calculated as ??????? = ????????????? (2.5a) ???????? = ??????? +??????? (2.5b) ???????? = ??????? (2.5c) where ??????? , ??????? and ??????? are retrieved rainwater path, cloud water path, and ice water path fromGMI, and ????????????? is the collocated precipitable water from ERA-Interim analysis. Figure 2.6 shows one example when GMI passes over typhoon Nangka at around 1800UTC on July 12. Precipitation assimilation improves the PW analysis. Experiment GSMaP and ERA-Interim analysis shares the same location of maximum PW, though the span of maximum PW in GSMaP is greater than that in the ERA-Interim analysis. 31 Figure 2.6: Analysis of (a)-(b) precipitable water (unit: mm), (d)-(e) liquid water path, and (g)-(h) solid water path at 1800UTC on July 12 from the experiment NoGSMaP ((a),(d),(g)) and GSMaP ((b)(e)(h)). (c) shows the precipitation water based on ERA- Interim analysis. (f) and (i) shows the retrieved liquid water path and ice water path (??/?2) from GMI at around 1731UTC on July 12 in 2015. 32 The overall pattern of PW in GSMaP is also similar to the ERA-Interim analysis. By contrast, NoGSMaP without precipitation assimilation shows weaker PW than the ERA- Interim analysis. The location of maximum PW in NoGSMaP is incorrectly placed on the northeastern side of the ERA-Interim analysis. Figure 2.6(d)-(f) shows that both experiments underestimate the LWP of typhoon Nangka, though assimilating precipitation helps to increase the LWP around the typhoon center, making it closer to GMI retrievals. Figure 2.6(g)-(i) compares the SWP. Though GSMaP and noGSMaP shows smaller SWP than the GMI SWP, GSMaP presents a better structure of outer rainbands. For example, GMI shows large SWP south of 16?N, which is also observed in GSMaP though its magnitude is weaker. By contrast, the SWP in NoGSMaP is too small, with its value below 0.4??/?2. 2.4.2 Overall track and intensity forecast error Figure 2.7 shows the mean absolute track error up to 72-hour forecast against the JMA RMSC best track records for the two experiments PRECIP and NoPRECIP. For all four typhoon cases, assimilating additional GSMaP surface precipitation leads to lower mean absolute track error for all the lead time within 72-hour forecast. It is noted that for both experiments the track error at the analysis time is relatively large, with an error around 80km for the experiment PRECIP and 100km for the experiment NoPRECIP. This is because no vortex relocation is conducted after each 6-hour cycle in our experiments, neither did we assimilate the TC vital data (MSLP and TC position). We expect applying these two methods should further reduce the TC track error. 33 Figure 2.7: Mean absolute track error (unit: km) as a function of forecast lead time up to 72- hours for each individual typhoon in the experiment w/ (red) and w/o (black) assimilation of GSMaP surface precipitation. The number of 6-hour cycles used to calculate the error at each forecast lead time is indicated by the gray bar and the right axis. 34 Figure 2.8 displays the mean MSLP error for all typhoons. For Chan-hom, the experiment with additional precipitation assimilation shows improved MSLP forecast up to 36 hours, while its forecasts start to get worse than the experiment NoGSMaP after 36 hours. For Nangka, an almost constant improvement due to precipitation assimilation is seen from the analysis time to 72-hour forecast. Assimilating precipitation also lead to smaller MSLP error up to 66 hours for typhoons Soudelor and Goni. To evaluate the overall gain of precipitation assimilation on the typhoon track and MSLP forecast, we aggregated the results from all cases and showed the overall average track and MSLP error in Figure 2.9. The benefits of assimilating precipitation with Gaussian Transformation can be seen for both the typhoon track and MSLP forecast. A nearly uniform small track forecast improvement is observed from the analysis time to the 72-hour forecast lead time. For the MSLP forecast, assimilating surface precipitation leads to an approximately 10hPa error reduction up to 30-hour forecast lead time, while this error reduction disappears after about another 36 hours. In general, precipitation assimilation with Gaussian transformation leads to improved track and MSLP forecast up to 72-hours. Despite the improved MSLP forecast brought by the precipitation assimilation, both experiments present large MLSP error at the analysis time. Figure 2.10 compares the analysis MSLP from two experiments with the RMSC best track during the period when typhoon Nangka and Soudelor experienced rapid intensification. Both experiments underestimate the MSLP, though the typhoons generated in the experiment GSMaP are stronger, with their values closer to the best track data. This MSLP underestimation problem is probably due to the coarse resolution (36km) adopted in this study. Past studies 35 Figure 2.8: MeanMSLP error (unit: hPa) as a function of forecast lead time up to 72-hours for each individual typhoon in the experiment w/ (red) and w/o (black) assimilation of GSMaP surface precipitation. The number of 6-hour cycles used to calculate the error at each forecast lead time is indicated by the gray bar and the right axis. 36 Figure 2.9: Mean absolute track error (left) and MSLP error (right) as a function of forecast lead time up to 72-hours for all four typhoons in the experiment w/ (red) and w/o (black) assimilation of GSMaP surface precipitation. The number of 6-hour cycles used to calculate the error at each forecast lead time is indicated by the gray bar and the right axis.. adopting coarse resolutions also reported this underestimation issue for the intensity forecast [Miyoshi and Kunii, 2012]. We expect that increasing the model resolution would alleviate this MSLP underestimation problem and contributes to a more realistic intensity forecast. Figure 2.10 also shows the 72-hour deterministic forecasts initialized from the ensemble mean for the first 10 6-hour DA cycles during the selected period. The 72- hour forecasts issued from the experiment GSMaP generally show stronger intensification trends than those from the experiment NoGSMaP. For example, the forecast initialized between 06Z and 18Z on August 2 kept intensifying the typhoon for about 1 day, with its intensification trend matching quite well with the best track data. We also observe several forecasts for Nangka that shows stronger intensification trend, with their values closer to the best track. By constrast, such deepening trend is not seen from the forecasts issued 37 Figure 2.10: Time series of the analysis MSLP (hPa) of typhoon (a) Nangka and (b) Soudelor for the experiment w/o (black) and w/ (red) surface precipitation assimilation. The 6-hour MSLP analysis are shown by the thick colored lines with open circles. The 72-hour forecast initialized from the ensemble mean analysis for the first 10 6-hour cycles in the figures are also shown in the figure thin colored lines. The RMSC best track data is shown by the thick gray lines. 38 from the experiment NoGSMaP. 2.4.3 Importance of precipitation retrievals over typhoon regions To understand which portion of surface precipitation contributes the most to the improved typhoon forecast, data denial experiments is conducted for the typhoon Soudelor. In the experiment GSMaP_ENV, precipitation retrievals at least 500km away from the typhoon center are assimilated in addition to PrepBUFR observations. The typhoon center is identified by the best track data. The experiment GSMaP_CORE assimilates precipitation within a radius of 500km from the typhoon center, plus PrepBUFR. Both experiments are initialized from the ensemble analysis at 1200UTC on July 12 in the experiment GSMaP. Experiments show the precipitation assimilation over typhoon regions is critical for improved typhoon forecast. For example, Figure 2.11 shows the 5-day track forecasts initialized from the analysis ensemble mean at 00UTC on August 2, when Typhoon Soudelor experienced intensification. The forecasted tracks inGSMaP_COREandGSMaP_ALL are very close to the best track in the first 48 hours. By contrast, the predicted track in GSMaP_ENV is on the southwest of the best track data. Besides, compared with best track and other two experiments, the movement of the typhoon in GSMaP_ENV is too slow. After 48 hours, the predicted tracks by GSMaP_CORE and GSMaP_ALL starts to deviates from the best track, and they both appear on the northeastern side of the best track up to 120 hours. Meanwhile, the GSMaP_ENV continue showing an erroneous track on the southwest of the best track. 39 Figure 2.11: 120-hour track forecast initialized from the analysis at 00UTC on August 2, 2015, for the experiment GSMaP (denoted as ALL, purple), GSMaP_CORE (red) and GSMaP_ENV (blue). The best track is shown in black. The circles represent the forecasted typhoon location every 6 hours. Forecasts after 72-hours are in light colors in each track. 40 Figure 2.12 compares the forecast track error and intensity error measured byMSLP for the same cycle. For the track forecast, GSMaP_CORE shows comparable track forecast error as GSMaP_ALL in the first 52 hours, and lower error up to 120 hours. The track error in GSMaP_ENV is the largest among three experiments almost in every forecast lead times. Figure 2.12 indicates all three experiments underestimate the MSLP, which could be partially due to the coarse resolution adopted in this study. However, experiments GSMaP_CORE and GSMaP_ALL with precipitation assimilation over typhoon regions shows the rapid intensification period observed by the best track data though the estimated intensity is weaker than the best track records. GSMaP_ENV shows a much slower intensification, with its estimated intensity even weaker than that in GSMaP_CORE and GSMaP_ALL. 2.4.4 Verification against GPS-RO retrieved wet profiles over typhoon regions To test our hypothesis that assimilating additional precipitation improves the temperature and the humidity over the typhoon regions, we compare the analysis and 6-hour forecast from both experiments with the collocated the GPS Radio occultation (RO) retrieved wet profiles over typhoon regions. Unlike the infrared radiance observations that are sensitive to clouds, GPS RO observations are marginally contaminated by the clouds and precipitation so that their retrieved profiles can provide valuable information over the cloudy and precipitation regions. Besides, GPS RO retrieved profiles also provide temperature and humidity profiles with a dense vertical resolution of 100m. Given the 41 Figure 2.12: 120-hour track forecast in the experiment ALL (purple), CORE (red) and ENV (blue). The best track is shown in black. The circles represent the forecasted typhoon location every 6 hours. Forecasts after 72-hours are in light colors in each track. 42 high quality of GPS-RO profiles, they often serve as the inputs to the radiative transfer models to monitor the accuracies of the observed brightness temperature from spaceborne sensors at operational centers. Those merits of GPS-RO retrieved profiles also make them an ideal source to assess our model analysis over the typhoon regions. In this study, theLevel-2GPSROretrievedwet profiles processed by theConstellation Observation System for Meteorology, Ionosphere, and Climate [COSMIC, R. A. Anthes, 2008] Data Analysis and Archive Center (CDAAC) are used for simulation verification. Those retrieved profiles residing inside the model domain during experiment periods are collected at first from the COSMIC, TerraSAR-X (TSX), Gravity Recovery and Climate Experiment (GRACE), Korea Multi-Purpose Satellite-5 (KOMPSAT-5) and the Global Navigation Satellite System (GNSS) Receiver for Atmospheric Sounding (GRAS) onboard Metop-A and Metop-B satellites. In the second step, only those profiles within 3-hour difference at the validation time andwithin 500 km from the typhoon centers are retained to assess the analysis over the typhoon regions. The locations of those profiles are assigned as the longitude and latitude of perigee point at occultation point. For each profile, temperature or humidity analysis with missing values are also excluded. Following this procedure, 30, 43, 46, and 41 wet profiles (160 profiles in total) for typhoons Chan-hom, Nangka, Soudelor and Goni are collected, and their distribution are shown in Figure 2.13(a). In general, those collocated profiles loosely cover the whole typhoon tracks, meaning they have sampled different stages of simulated typhoons. Figure 2.13(b) shows the sampling frequency as a function of vertical levels and radial distance from the typhoon center for all four typhoons. More verification samples are available for the upper atmosphere and regions away from the typhoon centers. Besides, 43 Figure 2.13: (a) Horizontal distribution of collocated GPS-RO retrieved wet profiles for typhoons Chan-hom (red), Nangka (orange), Soudelor (blue) and Goni (green) surround the RMSC best tracks, (b) Number of collocated GPS-RO profile variables as a function of the vertical pressure and radial distance from the typhoon center very limited samples are seen below 950hPa, with some regions having no verification samples. Given this, the verification against the GPS-RO retrievals will be limited to levels between 100hPa and 950hPa. Figures 2.14 and 2.15 show the RMSE and bias of 6-hour temperature and humidity forecasts of each individual typhoon for the two experiments. They are calculated by interpolating the 6-hour forecast of model temperature and humidity fields from the experimentsGSMaP andNoGSMaP to the location ofGPS-ROprofiles. Only the variables at the vertical levels below the SCALE model top are interpolated from the model output to avoid the extrapolation. For Chan-hom and Nangka, precipitation assimilation leads to reduced RMSE and bias for both temperature and humidity below 300hPa. For Soudelor, marginal improvement is seen for the temperature below300hPa and for the humidity below 700hPa. Precipitation assimilation also contributes to a slightly reduced temperature bias. 44 For Goni, the GSMaP experiments agrees more favorably with the collocated profiles, showing smaller RMSE and bias for temperature below 250hPa and reduced RMSE for humidity below 300hPa. A larger humidity bias is observed between 700hPa and 850hPa after assimilating precipitation. Figure 2.14: Temperature and humidity RMSE and bias of 6-hour forecasts validated against the collocated GPS-RO wet profiles for the experiments GSMaP (red) and NoGSMaP (black) for typhoons Chan-hom (1st row) and Nangka (2nd row). Figure 2.16 calculates the overall temperature and humidity RMSE and bias for 45 Figure 2.15: Same as Fig. 2.14 except for typhoons Soudelor (1st row) and Goni (2nd row). 46 all four typhoons. In general, precipitation assimilation leads to reduced forecasted temperature RMSE and bias under 300hPa and smaller humidity RMSE below 300hPa. Besides 6-hour forecast, we also compared model analysis from two experiments with the collocated profiles and get a result similar to Figure 2.16. Figure 2.16: Temperature and humidity RMSE and bias of 6-hour forecasts validated against the collocated GPS-RO wet profiles for the experiments GSMaP (red) and NoGSMaP (black) for all four typhoons. 2.4.5 Impact of precipitation CDF on typhoon forecast errors Since precipitation CDFs are critical for Gaussian Transformation, we conducted sensitivity experiments for typhoon Soudelor by using different precipitation CDFs for precipitation transformation. Here we focus on two questions. In the experiment GSMaP, we merged 5-day modeled and observed precipitation samples around each model grid. A following question is whether including longer-period precipitation samples in the CDF can enhance the precipitation assimilation and lead to improved typhoon forecast. To 47 answer this question, we conducted the experiment GSMaP30 where we merged 30-day modeled and observed precipitation samples to create the precipitation CDFs. The second question is whether we can obtain comparable typhoon forecasts by adopting the observed-precipitation-basedCDFs, since including themodeled precipitation samples in CDFs requires the archives of model simulated precipitation in the past, which are not always available. It thus could be computational expensive to construct precipitation CDFs if we must run the model to collect those precipitation samples. To explore this question, we conducted the experimentGSMaP30_OBSwhere the precipitation CDFs only includes 30-day observed precipitation, and the experimentGSMaPyr2014_OBS the precipitation CDFs includes observed precipitation from the whole year 2014. Figure 2.17 presents the 72-hour mean track error and MSLP error of typhoon Soudelor due to different precipitation CDFs. GSMaP30 with 30-day samples shows comparable track forecast as GSMaP with 5-day samples, while GSMaP30 shows smaller MSLP error than GSMaP for the 48-hour to 72-hour forecast lead times. Figure 2.17 also shows that typhoon forecast comparable as GSMaP can be obtained by only using 30-day observed precipitation in the experiment GSMaP30_OBS. GSMaPyr2014_OBS shows smallest track errors for almost all lead times among four experiments, while it has similar to a slightly better MSLP forecast performance than GSMaP. We now start to investigate why including more precipitation samples in three experiments (GSMaP30, GSMaP30_OBS, GSMaPyr2014_OBS) contribute to improved forecast. Figure 2.18 shows CDFs of a randomly selected model grid from experiments GSMaP,GSMaP30, GSMaP30_OBSandGSMaPyr2014_OBS.ThoseCDFs fromdifferent experiments show quite different shapes. For example, the precipitation rate of 4mm/6hr 48 Figure 2.17: Mean absolute track error (left) and MSLP error (right) as a function of forecast lead time up to 72-hours for typhoon Soudelor in the experiments NoGSMaP (balck), GSMaP (red), GSMaP30 (pink), GSMaP30_OBS (blue), and GSMaPyr2014_OBS (cyan). corresponds to a probability about 0.99 in GSMaP and probabilities ranging from 0.78 to 0.9 in other three experiments. For the experiment GSMaP, the corresponding CDFs for precipitation rate of 4mm/6hr and 5mm/6hr are almost the same, since GSMaP only utilizes 5-day samples. By contrast, GSMaP30, GSMaP30_OBSandGSMaPyr2014_OBS still shows a clear probability variation for these two precipitation rates, thanks to more precipitation samples included in their CDFs. Nowwe analyze the impact of these different shaped CDFs on data assimilation. The Gaussian transformation process first find the corresponding CDFs values (e.g., ? (???) and ? (???)) of the observed and modeled precipitation. The transformed precipitation ?? and ?? are then calculated by finding the standard Gaussian variable value with the same CDF values, ?? = ??1 [? (???)] and ?? = ??1 [? (???]). The transformed variables ?? are then assimilated into the system. Since the 5-day sampled CDFs might fail to sample 49 Figure 2.18: CDFs used for precipitation transformation at the model grid 140.13oE, 15.65oN in the experiments NoGSMaP (balck), GSMaP (red), GSMaP30 (pink), GSMaP30_OBS (blue), and GSMaPyr2014_OBS (cyan). the large precipitation, modeled and observed precipitation rate greater than a threshold (e.g., about 4mm/6hr in the experiment GSMaP) will correspond to the same probability after transformation, leading to zero observation innovation, ?? ? ?? = ??1 [? (???)] ? ??1 [? (???)] = 0. This zero observation innovation will then lead to zero analysis increment ?x = 0, since ?x = K(?? ? ??) based on the Kalman Filter equation. Therefore, in the experiment GSMaP only small to moderate precipitation rate around typhoons essentially generate nonzero analysis increment to improve the model forecast. By contrast, precipitation CDFs for the experiments GSMaP30, GSMaP30_OBS, GSMaPyr2014_OBS that include longer-period precipitation samples have more large precipitation rates. This enables their CDFs still show a probability variation when precipitation rate exceeds 4mm/6hr, leading to a nonzero observation innovation, ??? ?? ? 0, which finally contributes to a nonzero analysis increment. Therefore, CDFs constructed over longer-periods have the advantage of including more large precipitation samples, which contribute to more effective assimilation of large precipitation observations. 50 2.5 Summary To handle the non-Gaussianity problem associated with precipitation observations, Lien et al. [2013] assimilated surface precipitation with Gaussian transformation in the EnKF and showed effective precipitation assimilation in the simulated experiments with an intermediate general circulation model. Lien et al. [2016a,b] and Kotsuki et al. [2017] further proved with real-observation experiments that precipitation assimilation with Gaussian Transformation improves the model analysis and leads to improved weather forecast up to 5 days by assimilating satellite-retrieved precipitation into the global numerical weather prediction models. In this study, we extended the precipitation assimilation framework [Lien et al., 2013, 2016a,b] into regional models, and explore the benefits of assimilating satellite-retrieved GSMaP precipitation on typhoon forecasts. Four typhoon cases over Western Pacific in 2015 are studied to understand the impact of surface precipitation with Gaussian transformation. The experiments show that additional assimilation of GSMaP precipitation improves the representation of MSLP, surfacewinds, precipitation, moisture, and hydrometeors for typhoon simulations. Generally, precipitation assimilation contributes to small track forecast improvement and lower intensity error measured by MSLP up to 72-hour forecast. It is found that precipitation assimilation helps the model to capture the typhoon intensification more quickly. Data denial experiments indicates that satellite-retrieved precipitation over typhoon regions is mainly responsible for the improved track and intensity forecast. Verification against the GPS-RO wet profiles collected over the typhoon regions confirms improved temperature and moisture analysis (forecast) introduced by precipitation assimilation. To further 51 simplify the process of preparing precipitation CDFs used for Gaussian Transformation, a purely observation-based CDF construction approach is tested to replace the original observation-and-model based CDFs, which could be valuable when the past model precipitation archive is not available. Results show that when the observation-based CDF includes enough precipitation samples, it can lead to similar improvement by precipitation assimilation, as those with the observation-and-model based CDFs. Though we showed many benefits of assimilating GSMaP precipitation on typhoon forecast in this study, it is noticed that themodel resolution adopted in our study is relatively low. It would be desirable to increase our model resolution in our future studies. Besides, given that the precipitation over typhoon regions contribute the most to the improved track and intensity forecast, the precipitation CDF can be formulated in an alternative way. Instead of creating CDFs on each model grid by collecting precipitation samples from its nearby grids, historical GSMaP precipitation samples could be collected near the typhoon centers based on the historical best track data. The CDFs constructed from these samples should be more representative of typhoon-related precipitation. Using these CDFs for Gaussian transformation to assimilate precipitation over typhoon regions might lead to further analysis and forecast improvement. 52 Chapter 3: Multi-layerObservationLocalization forNonlocalObservations in the Local Ensemble Transform Kalman Filter. 3.1 Introduction In the modern observing system, a large portion of observations belongs to nonlocal observations. Unlike conventional observationswithwell-defined locationswhen observed, nonlocal observations are intrinsically integrated measurements in space, and thus have no single well-defined observation location [Campbell et al., 2010]. The most common nonlocal observations include radiance observations obtained from spaceborne sensors, and they are not the only nonlocal observations digested in modern assimilation systems. A large proportion of remote-sensed observations are also nonlocal observations. For example, besides radiance observations and column-integrated hydrometeors (e.g., liquid and ice water path) retrieved from microwave sensors, there are also 3-dimensional spatial integrated measurements such as GPS RO excess phase observations [Kuo et al., 2000, Ma et al., 2009]. For ocean data assimilation, altimeter observations are nonlocal observations that provide information on sea surface height [Mogensen et al., 2012]. Nonlocal observations are important to numerical weather prediction. In particular, satellite radiance observations have been playing a critical role in improving global 53 weather forecast skill since the direct assimilation of radiance [Derber and Wu, 1998, Geer et al., 2017]. Now numerical weather prediction relies heavily on satellite radiance observations according to the impact assessments with the tool of Forecast Sensitivity to Observations [FSO, Langland and Baker, 2004, Gelaro and Zhu, 2009, Kalnay et al., 2012,Buehner et al., 2018]. For example, evaluatedwith theEnsemble Forecast Sensitivity to Observations [EFSO, Kalnay et al., 2012], spaceborne microwave and hyperspectral infrared clear-sky radiance observations have a total influence that overweighs the impact from all conventional observations in the modern data assimilation system [Ota et al., 2013]. Recent research advances in all-sky radiance assimilation further show the benefits of assimilating clouds/precipitation-contaminated radiances in improving global forecast skill and tropical cyclone predictions [Geer and Bauer, 2011, Zhang et al., 2016, Honda et al., 2018, Geer et al., 2018, Bonavita et al., 2020]. These all stress the importance of proper assimilation of nonlocal observations in modern assimilation systems. However, proper assimilation of nonlocal observations with the Ensemble Kalman Filter (EnKF) system is known to be a nontrivial problem. Due to sampling errors caused by the limited ensemble size, the error covariance needs to be localized in the EnKF systems. There are generally two methods to localize error covariance in the ensemble systems: model-space localization and observation-space localization. Using a 1-dimensional model, Campbell et al. [2010] show that model-space localization achieves smaller mean analyses error than observation-space localization for radiance assimilation. Model-space localization is initially applied to the hybrid ensemble-variational systems [Houtekamer andMitchell, 1998, Hamill et al., 2001,Wang et al., 2008] and then extended to pure ensemble-based systems such as the Ensemble Transform Kalman Filter [Bishop 54 et al., 2017, Lei et al., 2018] and the Ensemble Square Root Filter [Shlyaeva et al., 2019]. Parallel to the development of themodel-space localization, Lei andWhitaker [2015] show that observation-space localization can have better performance than model-space localization when a negative correlation exists between ensemble state and simulated observation perturbations. In addition to this advantage, further development of the observation-space localization techniques will also benefit the advances of local particle filters, since observation-space localization is utilized by those local particle filters [e.g., Poterjoy, 2016, Potthast et al., 2019]. A recent study by Feng et al. [2020] shows that the structure of observation localization functions significantly influences the final analysis accuracy of particle filters. However, one difficulty for the observation-space localization rises from the vertical localization of nonlocal observations, since they have no single well-defined observation location as conventional observations [Campbell et al., 2010]. An empirical single-layer vertical localization (SLVL) approach widely used to handle these nonlocal observations, is to vertically localize them at their weighting function (WF) peaks with symmetric Gaussian-shaped localization functions [Houtekamer and Mitchell, 2005, HM05]. This simple empirical SLVL approach with the Gaussian-shaped localization function works effectively for radiance observations with symmetric Gaussian-shapedWFs. For example, the multi-year NICAM ensemble reanalysis system [Terasaki et al., 2015, Terasaki and Miyoshi, 2017, Terasaki et al., 2019] adopts the SLVL approach to assimilate AMSU- A clear-sky radiance. Their analysis quality in the Southern Hemisphere has been significantly improved thanks to the assimilation of AMSU-A observations [Terasaki and Miyoshi, 2017, Terasaki et al., 2019]. Miyoshi and Sato [2007, MS07] improve the SLVL 55 approach by replacing the fixed Gaussian-shape vertical localization functions with the flow-dependentWFs calculated from the ensemble background. Instead of setting radiance observations at the WF peaks, Fertig et al. [2007, F07] assimilate nonlocal observations at all levels where the level-dependent WF value exceeds a predefined threshold. Besides these vertical localizationmethods designed specifically for radiance assimilation, several generic methods exist that can be used to acquire localization functions for any observations, such as hierarchical ensemble filter [Anderson, 2007], empirical localization function [Anderson and Lei, 2013], and correlation-cutoff method [Yoshida and Kalnay, 2018, Yoshida, 2019]. Real-observation or observation system simulation experiments (OSSEs) with the localization functions acquired from these methods show improved analyses when compared to the results with the traditional Gaussian-shaped localization functions [Lei et al., 2016, 2020, Yoshida, 2019]. Interestingly, the localization functions derived from these generic methods usually have structures distinct from Gaussian shapes, which indicates localizing nonlocal observations usingGaussian-shape functions is suboptimal. The only burden of these generic methods is that these localization functions are calculated based on samples gathered fromoffline experiments orOSSEs, which could be computationally expensive. While the empirical SLVL approach can properly assimilate nonlocal observations with narrow or symmetric Gaussian-shape WFs (e.g., AMSU-A observations), it has difficulties handling observations with broad asymmetricWFs or with multipleWF peaks, which are typical for clear-sky radiance from sounding or trace-gas sensitive channels onboard hyperspectral infrared sensors [Christophersen, 2015]. As a comparison, we present in Fig.3.1 the WFs of selected channels from the 56 Figure 3.1: WFs of selected channels from the AMSU-A and CrIS 431-channel subsets that are routinely assimilated in the operational data assimilation systems. Channels in group (a) are AMSU-A channels with Gaussian-shapeWFs. Groups (b)-(e) show selected channels from CrIS. Channels in group (b) have narrow and symmetric WFs while those in group (c) have long-tailed, asymmetric WFs. Group (d) presents two subgroups of surface-sensitive channels but with very different half-widths. Channels in group (e) have more than one WF peaks. microwave temperature sounder AMSU-A and hyperspectral infrared sensor CrIS Full- Spectral-Resolution (FSR) 431-channel subsets[Jung et al., 2018]. For AMSU-A (Fig. 3.1(a)), except four surface-sensitive channels (channels 1, 2, 3, 15), all sounding channels share nearly symmetric Gaussian-shaped WFs with similar halfwidths. Among its four surface channels, channels 1-3 have almost identical halfwidthswhile channel 15 is slightly broader. Those common features shared by the AMSU-A channels partially explain why the SLVL approach is so effective in localizing AMSU-A observations. By contrast, channels from CrIS show far more complicated structures. Among its four groups (b)-(e), the SLVL approach would work well for channels in group (b), because 57 they share symmetric Gaussian-shaped WFs similar to AMSU-A channels. The SLVL approach, however, would encounter difficulty dealing with the remaining three groups (c)-(e). In group (c), Channels have broad asymmetric WFs that differ from Gaussian shapes with additional long fat tails. In group (d), two subsets of surface-sensitive channels are presented here but with very different halfwidths, which implies that setting a constant localization value for all channels would be suboptimal when assimilating them. The most problematic ones inconsistent with the SLVL approach are presented in group (e): those channels have more than one WF peak. Vertically localizing these observations at their maximum WF peaks in the EnKF will force the assimilation system to neglect the substantial contributions from other WF peaks. Note that this multi-peak feature is not limited to hyperspectral sensors. It is also shared by many other infrared sensors, such as the infrared imagers onboard geostationary satellites (e.g., the USGOESABI and Himawari 8/9 AHI). Those infrared imagers usually have one channel with its central wavelength around 9.6 ?m, giving their capability to retrieve ozone information. For those channels, there is usually a secondary weak WF peak near the tropopause above its surface WF peak. Clearly, for channels with broad asymmetric WFs or multiple WF peaks, the SLVL approach does not adequately account for contributions from different atmospheric layers and hence is not a satisfying solution. In this chapter, we seek to develop a new localization method that will consider the entire shape of WF, instead of just its peak, when performing vertical localization so that nonlocal observations with complicated WF shapes such as CrIS observations can be properly assimilated in the EnKF. Besides, the new localization method should inherit the merits of the SLVL approach when dealing with local observations, given their 58 effectiveness in localizing point observations. At last, it will be desirable if the newmethod can generate localization functions that share similar features as those acquired from other generic tuning methods [Anderson, 2007, Anderson and Lei, 2013, Yoshida and Kalnay, 2018, Yoshida, 2019]. With these three points in mind, we describe the formulation of our multi-layer observation vertical localization (MLVL) method in Section 3.2. Section 3.3 compares the SLVL and MLVL functions for selected AMSU-A and CrIS channels. Cycled assimilation experiments comparing the SLVL and MLVL methods with a multi- layer QG model are described in Section 3.4, and results are discussed in Section 3.5. Section 3.6 gives the summary of this study. 3.2 Methodology 3.2.1 The equivalent form of the LETKF without vertical localization TheLETKF [Hunt et al., 2007] is a deterministic filter that seeks a local transformation matrix W? so that the background perturbations X? = [x? ? ? ?1 ?x , ...,x ?x ] can be directly? transformed to the analysis perturbation X? = X?W? while satisfying the requirement for its posterior error covariance posed by the Kalman filter equations. Without observation localization, the analysis mean x? and ?-member analysis perturbations X? are updated by x? = x? +X?P?? (Y?)?R?1(y? ?y?) (3.1) X? = X?W? = X? [(? ?1)P??]1/2 (3.2) P?? = [(? ?1)I + (Y? ???? ) R?1Y?]?1 (3.3) 59 where P?? is a ? ? ? matrix. If assuming no correlated observation error, applying the observation localization in the LETKF causes the diagonal matrix R in Eq.(3.3) being replaced by R??? ????(R?1 ?1 ?2 ?? ? ???) = [ , , ..., ] (3.4)?1 ?2 ?? where ? ? , ? = 1,2, ...? is the localization function with its bounded value ? [0,1]. The SLVL approach treats nonlocal observations the same as the local observations in terms (Y?)?R?1Y? and (y? ?y?): each observation occupies only one row in Y? and (y? ?y?). With this formulation, the nonlocal properties of integrated observations cannot be explicitly presented. Given this difficulty, we are motivated to develop an alternative equivalent formulation of the LETKF that can explicitly expand the nonlocal observations at multiple levels while keeping its solution the same as the original LETKF when localization is not imposed. From that new formulation we can then localize contributions from different levels with more flexibility. From Eqs. (3.1)-(3.3), we notice that the observation localization alters the final analysis x? and X? through the term (Y?)?R?1Y?. When localization is not imposed, the final LETKF analysis will not change as long as the value of (Y?)?R?1Y? keeps the same. Our problem then becomes clear: can we find an equivalent matrix Z, which has the same value as (Y?)?R?1Y? when no localization is performed, while the formula for Z can express the nonlocal observations at all vertical levels, instead of a point observation as in the original LETKF? After numerous attempts, we present here one such matrix Z that satisfies our 60 requirement. Considering ? members and ? observations (??? nonlocal observations, and ?? local observations, and no correlated errors bewtween these two batches), it can be proved that if duplicating the nonlocal observation batch Y? ? times while each ?? ? duplicate has the observation errors inflated by ? times (i.e., ?R), the resulting new matrix Y? will lead to the same value as (Y?)?R?1Y?. ???? (Y? )?R?1 Y? = (Y?)?R?1 ????? ???? ???? Y (3.5) To see why Eq. (3.5) holds, the new matrix Y? and R?1 can be written as ???? ???? ???? ?? ?? ??Y? 1 R?1 ??? ??,1 ?? ? ? ???? ??,1 ??? ?Y? ? ? 1 ?1 ???,2 ??? Y? = ???? .. ???? ???? R? ? ??,1 ?? ?1 ???? ? .? ? ,R???? = ? ???? ? . . . ??? , (3.6) ? ?? 1 ?1 ???? Y ??,???? R? ???? ? ??,1 ?? ? ? ? ?Y? R?1?? ? ?? 61 Then we have [ ] (Y? )?R?1? ???? ????Y ? ? ???? = ((Y )? )??? . . . ((Y? )?? ? )?? ?? ?? ?? ((Y ? )?? )?? ? ? ? ??? ???? ????( 1 R?1 ) (Y? ) ? ?? ??????? ?? ?? ?? ?????? ? ? ?? . ?? ?. ? ?.. ? ???? . ??? ??? . ??? 1 ? ( R ?1 ) ? ?? ? ?? ???? ????(Y )? ?? ?? ?? ?? ???????? ? ?? (R?1) ? ? ? ? ?? ? ??? ?? ((Y )? ?? ?? ? ? ?? ?? ? [ ] ????(Y )?? ??????? ?. ?.. ?? = ((Y? )? 1 R?1 ) ? . . . ((Y? )? 1 R?1 ) ? ((Y? )?R?1) ? ??? ? ?? ? ??? ?? ? ?? ? ??? ? ? ???? ???? ? ??(Y ? ) ? ? ?? ??????? ?? ? ? ((Y? ) ? ? ???? ?? ? = [(Y? )? 1 R?1 ? ? ? ?1 ??? ??Y??]??? + [(Y? ?) R? Y?]??? ?=1 = [(Y? ???) R?1??Y? ] ? ? ?1 ??? ??? + [(Y?) R? Y?]??? (3.7) Meanwhile, the matrix without performing duplication of nonlocal observation batches is [ ] ???? ? ? ??(R?1 ) ? ? ? ?(Y?)?R?1 ?? ?? ?????? ?Y = ((Y? )? ) ? ((Y? )? ) ??? ? ??? ? ???? ??? ?? (R??1) ??? ????(Y )?? ????????? [ ] ? ?? ???? ? ? ((Y? )? ? ???? ? ? ? ?(Y? )?? ????? = ((Y? )?R?1 ) ? ? ?1 ? ? ?? ?? ??? ((Y ) R )?? ? ? ???? ???? ?((Y? ) ?? ???? ?? = [(Y? ? ?1 ???) R??Y??]??? + [(Y ? ? ?) R ?1 ? ? Y?]??? (3.8) 62 Comparing Eqs. (3.7) and (3.8) leads to (Y? )?R?1 Y? = (Y?)?R?1Y?. ???? ???? ???? Now consider the more relaxed form of R?1 : If different weights are assigned to ???? each duplicate of nonlocal observation batch (i.e., c??1 = [?1, ?2, ..., ? ]?? ) The condition that c??1 needs to satisfy to hold Eq.(3.5) is ??? [(Y? )?? R?1 Y? ? ??? ? ?? ??]??? = [(Y??) R ?1 ? ??Y??]??? (3.9) ?=1 which requires ??? ?? = 1 (3.10) ?=1 This means as long as we make duplicates of the nonlocal observation batch with their inflated observation errors satisfying Eq.(3.10), the same solution as the original LETKF is guaranteed. This has one substantial implication: If letting the number of duplicates to be equal to the number of model vertical levels for the nonlocal observation simulations (i.e., ? = ?), we are essentially decomposing one nonlocal observations into ? local observations with inflated errors that reside at different vertical levels. Since now the decomposed observations are local observations, we can utilize the traditional localization method to localize them in the LETKF. Note that though the example shown here is for vertical decomposition, thismethod can also be used to expand other nonlocal observations that transverse the model grid in 3-dimensional space (e.g., radiance simulation following slant path, GPS RO excess phase) if we know which model grids they travel across. 63 3.2.2 Interpretation of existing methods and the MLVL The multi-layer LETKF developed in the previous section makes it possible to express each nonlocal observation as a sum of multiple local observations. With this decompositionmethod, several existing observation localizationmethods can be interpreted in a unified way. 1 2 3 4 5 6 Normalized WF Figure 3.2: WF of one CrIS channel with double WF peaks. The largest WF peak is located at vertical grid 4, while the secondary WF is located at surface (grid 6). We are now trying to update the analysis at grid 5. To better illustrating the MLVL method, we use a CrIS channel as an example and compare the localization functions of different methods. Figure 3.2 shows the weighting function of one CrIS channel selected from the 431 subsets. In this schematic figure, there are 6 model grids in vertical, and we are now trying to update model grid 5 with this CrIS observation. Note that unlike the single-peak WFs from AMSU-A channels, the WF 64 Vertical grids shown here has its largest WF peak at vertical grid 4, and a secondary WF peak at surface. With the multilayer LETKF formulation, assimilation of this CrIS observation without vertical localization, or with HM05 (SLVL), MS07, F07 methods can be respectively expressed as [ ]? ????(R?1 1 1 1 1 1 1????,???) = [ ?11 ?21 ?31 ?41 ?51 ?61 ] (3.11a)? ? ? ? ? ? ? ????(R?1 05 1 1 1 1 1 1?? ,???) = ?1?? 5,4 ?2?5,4 ?3?? ? 5,4 ?? 4?5,4 ?5?? 5,4 ? ?? 6 5,4 [ (3.11b) ]? ????(R?1 ) = 1 ?5 1 ?5 1 ?5 1 ?5 1 ?5 1 ?5??07,??? ?? 1 ???( ??) ? ? ? ? ?? 2 ???( ? 3 4?) ? ???( ??) ? ???( ??) ? 5 ???( ? 6?) ? ???( ??) [ (3.11c) ]? ????(R?1?07 1,???) = ?1? ( ? 15) ?2? ( ? ) 15 ?3? ( ?5) 1? 14? ( ?5) ?5? ( ?5) 1?6? ( ?5)? ? ? ? ? ? (3.11d) where ??, ? ???? is the Gaussian-shaped localization value based on the distance between t????he grid 5 and the WF peak at grid 4, ?? is the WF value at vertical level ?, and ? ( ??) =?????1 ?? > ???? . After decomposition, there are two valid interpretations for these methods. If?0 ???? they treat this nonlocal CrIS observation as one local observation (i.e., ?5 = 1 while other ?? = 0), then these methods essentially only consider the contribution from one vertical level. If these me?thods indeed treat the CrIS observation as one nonlocal observation (i.e., ?? ? 0 while ? ?? = 1), we can observe in each method a constant localization factor (i.e., 1, , ?? 55,4 ( ) , and ? ( ?5)) is shared for localizing fractional (local) observations??? ?? at different levels. From this perspective all these methods belong to the same family of 65 localization functions. There are two improvements that we can make upon these methods for the MLVL. First, using a constant factor to localize observations fromdifferent distances is inconsistent with how we localize point observations. The common practice is to calculate the localization value based on the distance between model grids and observation locations (i.e., ?5,?, ? = 1,2, ...,6). After finishing revising the localization factors, there is still freedom to determine the weighting matrix c??1, since they only need to satisfy Eq.(3.10). Inspired by the MS07 and F07 method, we relate this weighting matrix in the MLVL with the entire WF of each radiance observation. The consideration of the entire WF shape is important when dealing with observations with multiple WF peaks. For example, if we try to assimilate the CrIS observation to update model grid 6 in our 1-d example (Fig.3.2), HM05 (SLVL) method will fail to account for the contribution from the model grid 6 by assigning it a small localization value, which actually contributes the second most to the simulated radiance. By contrast, MS07 and F07 will assign grid 6 large localization value given the secondary largest WF value here, avoiding the pitfalls of the HM05 method. Formally, the final form of our MLVL method to update targeted model grid ? is [ ]? ????(R?1 1 1 1????,???) = ?1?? ?,1 ? ?? 2 ?,2 . . . ? ? (3.12a)? ? ?,? | ? |?? ?? = ? ( ? = 1,2, ..., ?) (3.12b)? ? ?=1 | ? ? | where ? is the total number of model levels that this observation transverses. Applied to 66 our 1-D model, the decomposition is then [ ]? ????(R?1 1 1 1 1 1 1????,???) = ?1?5,1 ?2?5,2 ? ? ? ? ? ? ? ? (3.13)? ? ? 3 5,3 ? 4 5,4 ? 5 5,5 ? 6 5,6 In Eq.(3.13), the weighting term controls the maximum impact from different levels while the localization term ??,? damps the impact of each level on the target model grid by distance. Though there might exist other weighting matrix c??1 satisfying the constraint of Eq.(3.10), our choice of c??1 has one desirable property: it ensures that the MLVL method goes back to the traditional SLVL exactly when treating local observations by only allowing the weight at the observation level to be unity while keeping other levels zero. Alternatively, if some empirical localization functions are already acquired through generic tuning methods [Anderson, 2007, Anderson and Lei, 2013, Yoshida and Kalnay, 2018], the calculation of c??1 can be formulated as a constrainted optimization problem that minimizes the shape difference between the final form of the MLVL and the empirical localization functions. 3.2.3 Implementation of MLVL in the LETKF Based on Eqs.(3.13) and (3.5), the direct application of MLVL only replaces the calculation of (Y?)?R?1Y? for term P?? in Eq.(3.3) while keeping the other two equations (3.1) and (3.2) unchanged. Alternatively, we can further decompose (Y?)?R?1(y? ? y?) 67 in eq.(3.1) as ???? ?? (y? ?y?? ) ? ??,1 ?? ??? ?? (y ? ?y?) ???,2 ??? (Y?)?R?1(y? ?y?) = (Y? )?R?1 .. ????? ???? ???? . ?? ????(y? ?y?) ? (3.14) ? ??,??? ? ?? (y? ?y?) ?? ?? With Eq.(3.14), all nonlocal observations are now explicitly decomposed into local observations with inflated observation errors. Compared with the direct application of MLVL that only modifies Eq.(3.3), the explicit decomposition of nonlocal observations requires minimal code modifications on the existing LETKF systems. For example, if following Miyoshi [2005]?s implementation framework of LETKF (https://github. com/takemasa-miyoshi/letkf), this decomposition is conducted immediately after completing all observation simulations. Codes that run after the observation operator remain unchanged. The explicit decomposition of nonlocal observations also benefits the implementation of parallel LETKF algorithms. While some parallel LETKF algorithms keep a copy of global observations on each computing node (e.g., those from Miyoshi?s LETKF framework except SCALE-LETKF), other LETKF implementations might adopt a different partition strategy: on each computing node, only the local batch of observations around the model grids that are assigned to the current node are accessible [Hamrud et al., 2015]. For Hamrud et al. [2015] approach, additional considerations and coding efforts are required if the LETKF system wants to localize nonlocal observations with MLVL 68 that transverse the model grids scattered at several computing nodes. With (3.14), we can simply retain those transformed fractional observations within the cutoff radius on each computing node. Unlike sequential EnKFs (e.g., ensemble square root filter and ensemble adjustment Kalman filter) of which computational cost is proportional to the number of observations, the most computational demanding step in the LETKF is to acquire the eigenvectors of P??, which is of dimension ? ? ? . Since the MLVL does not increase the ensemble size ? , the total computational cost of the LETKF with the MLVL is not substantially increased. 3.2.4 Extension to other sequential-observation-processing EnSRFs Implementations of theMLVL introduced in previous subsection requires expansion of the matrix (Y?)?R?1Y?. The calculation of this matrix, however, is only required by the LETKF algorithm, but it is not included in other sequential EnKF algorithms (e.g., ensemble square root filter and ensemble adjustment Kalman filter). To extend the MLVL to these sequential EnKFs, we note for the 1-D example that after bringing the localized observation error term R?1 (Eq.(3.13)) back to (Y? )?R?1 Y? , we ????,??? ???? ????,??? ???? have ??? ( ) (Y? )?R?1 ? ? ????? ????,???Y???? = (Y??) ??? R ?1 Y? ??( ?,? ) ?? ???=1 (3.15)? = (Y? )? ? ? R?1 Y??? ? ?,? ?? ?? ?=1 where the final product is consisted of three independent components: the original nonlocal observation perturbation Y?, observation error R?1, and the other term related to 69 MLVL. This implies we are essentially localizing each nonlocal observation with original observation error but with a new localization function ????? defined as ??? ( ) ??? ( ??? ) ? ? ?????,? = ????,? = | ?? | ??,?/ | ? ? | (3.16) ?=1 ?=1 ?=1 where ?????,? is no smaller than 0 and converges to 1 when a large localization scale is applied. The structure of the MLVL localization function ????? also justifies the usage of non-Gaussian-shaped localization functions for observation localization: their shapes can be regarded as a weighted average of basis localization functions (e.g., exponential or Gaspari Cohn function). With form (3.16), the MLVL no longer limits its usage in the LETKF, and it can be applied to other serial-observation-processing EnKFs as a new kind of localization functions. For the sequential EnKF, the computational cost is proportional to the the number of observations. Unlike the explicit observation decomposition in the LETKF (section 3.2c) which increase the total number of observations, using the MLVL as a localization function keeps the same observation number, and thus does not increase the computational cost for sequential EnKFs. Note this form can also be used in the LETKF. For example, Miysohi?s (2005) parallel strategy for the LETKF does not decompose model grids in vertical and the information of each single column are stored in the same computing node. In this case it will also be convenient to use the MLVL as a localization function in the LETKF when localizing column integrated observations. 70 3.3 MLVL applied to the selected AMSU-A and CrIS channels 3.3.1 AMSU-A observations We then examine how SLVL and MLVL localization functions differ when applied to the selected AMSU-A and CrIS channels. The WFs required by both methods are calculated by the Community Radiative Transfer Model (CRTM, Han et al., 2007) with the US standard atmosphere as the input. Note that in real-observation experiments, the flow-dependent WFs are calculated with the vertical atmosphere profile and surface variables interpolated from the NWP background. Figure 3.3 shows the SLVL and MLVL functions applied to the AMSU-A surface channel 1 and troposphere channel 7. The SLVL localization functions for these two channels have their levels of maxima located at the channel WF peaks with the value of 1 and both share similar bell shapes as the AMSU-A WFs. By contrast, the MLVL localization functions show several features distinct from the SLVL. First, the MLVL functions can have maxima less than 1, with a small value corresponding to a small localization scale. Interestingly, this feature is consistent with those localization functions acquired using other independent methods [,i.e., Anderson and Lei, 2013, Lei et al., 2016, 2020] , where the maxima of their localization functions are also shown to be less than 1 for both ideal nonlocal [Anderson and Lei, 2013] and real AMSU-A/MHS observations [Lei et al., 2016, 2020]. For the MLVL, this is because the localization scale in the formula is responsible for limiting the impacts of all observational components at different levels on the target model grid. A grid receives minor integrated influence from distant levels when a small localization scale is adopted, thus causing the 71 localization value at this grid less than 1. Another interesting observation is that the MLVL peak level is above the WF peak (e.g., surface) for the AMSU-A surface channel 1. In the MLVL, the localization value at each grid quantifies the integrated impacts from all other levels. Though the WF peak for channel 1 resides at the surface, the surface level only gets impacts from the atmospheric layers above but not from below. This feature distinguishes the MLVL from the MS07 and F07 methods, which will set the center of the localization function at the surface in this situation. Interestingly, this finding is also consistent with the recent study by Lei et al. [2020], who showed that the vertical localization position is above surface for AMSU-A surface sensitive channels, but using an independent method. Besides these differences, the MLVL functions share some common properties to the SLVL functions. Like functions from the SLVL, the normalized MLVL functions show greater half-widths as the localization scale increases. Besides, the MLVL method in this example still generates Gaussian-shape localization functions thanks to the Gaussian- shape WFs of AMSU-A channels and the usage of exponential localization functions as the basis functions in the MLVL. 3.3.2 CrIS observations The localization of the CrIS observations tells a different story. Figure 3.4 shows the WF of one ozone-sensitive channel from CrIS. This channel has two WF peaks, with its maximum but narrow WF peak at the surface and the other broad one near 50hPa. Since the SLVLmethod only cares about the largestWF peak, it puts the bell-shaped localization 72 (a) SLVL: ch1 (b) MLVL: ch1 (c) MLVL_N: ch1 10 10 10 v0.2 v0.3 v0.4 v0.5 50 v0.6 50 50 v0.7 100 v0.8 100 100 200 200 200 300 300 300 500 500 500 750 750 750 1100 1100 1100 0 0.5 1 0 0.5 1 0 0.5 1 (d) SLVL: ch7 (e) MLVL: ch7 (f) MLVL_N: ch7 10 10 10 50 50 50 100 100 100 200 200 200 300 300 300 500 500 500 750 750 750 1100 1100 1100 0 0.5 1 0 0.5 1 0 0.5 1 Figure 3.3: (a) SLVL, (b) MLVL, and (c) normalized MLVL functions for NOAA-19 AMSU-A channel 1 with vertical localization scales ranging from 0.2 to 0.8. ? = 1 is used in Eq.(3.13b) for the MLVL functions. The WF and the boxcar localization function from the F07 method (threshold: 0.25) are respectively indicated by the black and gray lines. Panels (d)-(f) are the same as (a)-(c) except for channel 7. 73 Pressure (hPa) Pressure (hPa) function centered at the surface while completely neglecting the contributions from upper layers that show largeWF values. By contrast, the MLVL localization functions have their structures deviated from Gaussian shapes. They have two peaks, one near 50hPa and the other slightly above the surface, similar to the localization functions acquired from the MS07 and F07 methods. In the MLVL formulation, the weighting term takes care of the influence half-widths of different WF peaks, while the localization parameter controls the influence radius of observational components at different levels. Their combined effects enable MLVL to switch its largest peak from 800hPa to 50hPa when using a larger vertical localization value. One prominent feature of the MLVL method different from the MS07 and F07 methods is how the MLVL localizes the regions (e.g., 300-600hPa) between WF peaks. With MS07 and F07 methods, layers with small WF values between WF peaks receive small localization values while the MLVLmethod can still assign large localization values to the same regions given the large integrated impacts from the layers above and below them, especially when a large localization scale is adopted. The order ? of weighting is another parameter included in theMLVLmethod. Based on Fig. 3.4, a higher weighting order ? = 2,4 will emphasize all regions around the WF peaks while ignoring other levels, making the peaks of localization functions shaper with larger localization values. Note that for the MLVL functions shown here, we choose the normalizedWF for the weighting terms in the MLVL since it quantifies the vertical transmittance variation and reflects the combined effects of both temperature and gas concentrations on transmittance, and serves as a convenient shortcut to localize all state variables. However, this is not the 74 (a) SLVL (b) MLVL_w1 (c) MLVL_w2 (d) MLVL_w4 1 1 1 1 10 10 10 10 50 50 50 50 100 100 100 100 200 200 200 200 300 300 300 300 500 500 500 500 750 750 750 750 1100 1100 1100 1100 0 0.5 1 0 0.5 1 0 0.5 1 0 0.5 1 (e) MLVL_w1_N (f) MLVL_w2_N (g) MLVL_w4_N 1 1 1 10 10 10 50 50 50 100 100 100 200 200 200 300 300 300 500 500 500 750 750 750 1100 1100 1100 0 0.5 1 0 0.5 1 0 0.5 1 Figure 3.4: (a) SLVL, (b-d) MLVL, and (e-g) normalized MLVL functions for the 240th channel (channel 604 with the wavelength of 9.7382 ?m) of the NOAA-20 CrIS 431- channel subsets with vertical localization scales ranging from 0.2 to 0.8. ? = 1,2,3 are respectively used in Eq.(3.13b) for the MLVL functions in (b)(e), (c)(f) and (d)(g). The WF and the boxcar localization function from the F07 method (threshold: 0.25) are respectively indicated by the black and gray lines. 75 Pressure (hPa) Pressure (hPa) only choice nor the optimal one. Other candidates include temperature and gas Jacobians, which can be adopted to localize different model state variables. 3.4 Experiment design 3.4.1 The three-layer QG Model To compare the performance between the SLVL andMLVLmethods, the three-layer global QG model [Marshall and Molteni, 1993] is adopted in this study to conduct cycled assimilation experiments. The prognostic equations of the potential vorticity ?? (? = 1,2,3) for this model are ??1 = ?? (?1, ?1) ??1(?1, ?2) + ?1 (3.17a) ?? ??2 = ?? (?2, ?2) ??2(?1,?2, ?3) + ?2 (3.17b) ?? ??3 = ?? (?3, ?3) ??3(?2,?3) + ?3 (3.17c) ?? ?1 = ?2? ???21 1 (?1??2) + ? (3.17d) ? 22 = ? ?2 +??21 (?1??2) ?? ?2 2 (?2??3) + ? (3.17e) ?3 = ?2?3???2 ? 2 (?2??3) + ? (1+ ) (3.17f)?0 where subscripts ? = 1,2,3 corresponds to the vertical level of 200, 500, 800hPa; ?? is the streamfunction; ? (??, ??) is the Jacobian that represents nonlinear advection; ?? is the Rossby radius of deformation for each layer; and ?/?0 is the ratio of topographic height over the scale height. Terms ?1, ?2, ?3 represent the effects of Newtonian relaxation of temperature, drag on the 800-hPa wind, and horizontal diffusion of vorticity and 76 temperature. Terms ?1, ?2, ?3 are location-dependent sources of potential vorticity. Themodel is numerically solvedwith the spectralmethod at T21 resolution (approximately 5.625?). All forcings are tuned so that its large-scale circulation mimics the boreal winter climatology. The model, however, lacks the convection process therefore its tropical circulation is not well represented. Thanks to its ease of use and high (i.e., 6144) total degrees of freedom [Feng et al., 2017], this model has been used a lot in studies about predictability and data assimilation [e.g., Houtekamer and Mitchell, 1998, Feng et al., 2017]. The initial condition of the nature run is acquired by integrating this model for 500 days from one operational analysis randomly selected in the boreal winter. A 4-month nature run is then created with this initial condition. 3.4.2 Observation network Both local (e.g., radiosonde-like) and nonlocal (e.g., 1-d integrated and 3-d integrated) observations are included in the experiments. Their distributions are shown in Fig.3.5. The locations of radiosonde stations are modified based on Miyoshi [2005]. Radiosonde observations include vertical profiles of zonalwinds, meridionalwinds and streamfunction. They are created by adding the uncorrelated Gaussian errors (Table 3.1) to the observation simulations from the nature run. In the first type of observation network (Fig.3.5a), the nonlocal observations are?column-integrated streamfunction, of which observation operator is defined as ? (r) = 3?=1???? (r) ??wh?ere r is the localization of the horizontal grid, with the observation errors defined as 2? ? ? ?2(??)) (Table 3.1). In the second? 77 Table 3.1: Observation error and operator for the local and nonlocal observations in the experiment. Local observations Obs. type ?Obs. e?rror ?(800/500/200 hPa) ??/?? ?4.1/ ?5.4/?7.0 ? ?? 30/ 80/ 260? Nonlocal obervations Scenario ??Ob?s. error Weight ?? (800/500/200 hPa) narrow WF ? (?? ? ? (??))2 0.1//0.8/0.1 broad WF 0.15/0.45/0.4 multiple WF peaks 0.6/0.1/0.3 type of observation network (Fig.3.5b), the nonlocal observations are 3d-integrated streamfunction, which mimics the GPS RO excess phase observations or slant-path radiance simulation for observations with large sensor zenith angles. More details about the experiments assimilating 3-d integrated observations are discussed in Section 3.5e. For both observation networks, nonlocal observations with three different WF shapes (i.e., narrow WFs, broad WFs, or WFs with multiple-peaks) are created by assigning different weights ?? to each levels. Unlike radiosonde stations which are mainly located over the continents in theNorthernHemisphere, nonlocal observations covers the regions uniformly within the latitudes of 60 degrees, which is similar to the satellite-retrieved Level-3 grid products. Both types of observations are generated at 0000UTC and 1200UTC every day when data assimilation is performed. 3.4.3 Data assimilation settings Following Miyoshi [2005]?s LETKF framework, a QG-LETKF system is coupled with the three-layer QG model. The LETKF system performs data assimilation every 12 78 Figure 3.5: Spatial distribution of (a) ?radiosonde-like? observations (blue circle) and 1-dimensional integrated observations (gray plus), and (b) ?radiosonde-like? observations (blue circle) and 3-dimensional integrated observations (gray). The slanted gray lines in (b) show the spatial projection of the 3-D integrated observations while the gray dots indicate the locations of their observation components at 500hPa. 79 hours at 0000UTC and 1200 UTC for 240 cycles (about 4 months). The relaxation to the prior perturbation [Zhang et al., 2004] is adopted to prevent filter divergence due to small ensemble size, with ? = 0.8 for the 8-member experiments. The localization functions with the weighting order ? = 1 are used in the MLVL for these three experiment groups unless notified. Here we focus on comparing the performance between the SLVL and MLVL methods under three scenarios: assimilating nonlocal observations with narrow WFs (group EXP_1D_Snp), with broad WFs (group EXP_1D_Sbp) and with multiple WF peaks (group EXP_1D_Stp). In each experiment group, 7 experiments with vertical localization scale that varies from 0.2 to 0.8 are conducted to find the optimal localization scale for the SLVL and MLVL methods respectively. In addition, two more experiments are conducted in each group as the references: the CTRL experiment that assimilates no nonlocal observations and the noVL experiment that assimilates additional nonlocal observations but without vertical localization. All the experiments in the same experiment group shares the same nature run, observations, and initial conditions which are drawn from climatology. Other configurations are summarized in Table 3.2. Only the last 1.5-month analysis cycles are used for performance comparison. 3.5 Results 3.5.1 8-member experimentswhen assimilating observationswith narrow WFs Figure 3.6 compares the time-averaged analysis RMSE between the SLVL and MLVL methods in the Southern Hemisphere when assimilating nonlocal observations 80 81 Table 3.2: Experiment sets included in different experiment groups. Exp. group name Exp. purpose Exp. included Observations includedradiosonde 1-d 3-d Localization scale obs. nonlocal obs. nonlocal obs. EXP_1D_Snp Compare MLVL and SLVL for CRTL (1 exp) Yes radiosonde H.L.: 2000km (16 exps in total) assimilating 1-d nonlocal noVL (1 exp) Yes Yes obs. V.L.: 0.4 obs with narrow WFs SLVL (7 exps) Yes Yes nonlocal H.L.: 2000kmMLVL (7 exps) Yes Yes obs. V.L.: 0.2-0.8 EXP_1D_Sbp Compare MLVL and SLVL for (16 exps in total) assimilating 1-d nonlocal Same as EXP_1D_Snpobs with broad WFs EXP_1D_Stp Compare MLVL and SLVL for (16 exps in total) assimilating 1-d nonlocal Same as EXP_1D_Snpobs with two WF peaks EXP_3D_Snp Compare MLVL and SLVL CRTL (1 exp) Yes (15 exps in total) for assimilating 3-dnonlocal obs with narrow WFs SLVL (7 exps) Yes Yes Same as EXP_1D_Snp MLVL (7 exps) Yes Yes EXP_3D_Sbp Compare MLVL and SLVL (15 exps in total) for assimilating 3-d Same as EXP_3D_Snpnonlocal obs with broad WFs EXP_3D_Stp Compare MLVL and SLVL (15 exps in total) for assimilating 3-d Same as EXP_3D_Snpnonlocal obs with two WF peaks with narrow WFs (EXP_1D_Snp). Given the limited radiosonde observations in the Southern Hemisphere, the analysis error reduction mainly comes from the nonlocal observations, assimilation of which are strongly influenced by the vertical localization method. The SLVL method requires a small vertical localization scale (0.4) to achieve the smallest RMSE. A larger localization parameter causes increased RMSE. This is because nonlocal observations with narrow WFs behave similarly to local observations. With the small ensemble size, a small vertical localization scale is necessary to achieve the best performance. The increasing RMSE due to the larger vertical localization scale also stresses the necessity of vertical localization even in this simple three-layer model. According to Fig. 3.6, MLVL has similar performance as SLVL over several localization scales. For example, the analysis error of streamfunction at 800hPa is not statistically different at the 5% significance level for these two methods when the localization scale is 0.3 or 0.5. Note that based on our formulation, the MLVL is expected to be comparable to the SLVL when assimilating observations with narrowWFs. Fig. 3.6 also shows that the MLVL with the optimal vertical localization scale generally leads to a smaller RMSE than the SLVL for streamfunction at all three levels. This is also true for the two wind components. A practical issue to achieve the optimal performance with the EnKF system for real- observation experiments is about tuning of localization scales for different observations, which can requires considerable laborious work. Though the MLVL formulation is more flexible, it includes one more tuning parameters (order ? of weighting in Eq.3.13)) than the SLVL. Iterating over a wide range of localization scales as shown here is impractical for real-observation experiments. Therefore it is important to ensure the MLVL method does 82 Figure 3.6: Time-averaged analysis RMSE for the streamfunction (1st row, unit: 106?2/?), zonal wind (2nd row, unit: ?/?) and meridional wind (3rd row, unit: ?/?) in the Southern Hemisphere with the SLVL (blue) and MLVL (red) methods for EXP_1D_Snp. The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. The colored plus signs show which localization method leads to statistically smaller RMSE at the 5% significance level. A missing plus sign indicates no significant difference between two methods at this localization scale. 83 not require significantly more tuning efforts to compete with the SLVL method, otherwise it is less practical for real-observation applications . From Fig.3.6, MLVL always achieves comparable or smaller RMSE than the SLVL at each vertical localization scale except 0.4. 3.5.2 8-member experiments when assimilating observations with broad WFs & multiple WF peaks When assimilating nonlocal observations with broad WFs, the optimal SLVL experiment now requires a large vertical localization scale (0.7) as expected (Fig.3.7). MLVL, however, still achieves a smaller analysis RMSE than the SLVL with a localization scale of 0.6. Unlike the SLVL that causes large analysis errors for the streamfunction when using a smaller localization scale, MLVL seems not to be very sensitive to the localization scale: the analysis error experiences weak fluctuations when the localization scale varies from 0.2 to 0.6. This is because in the MLVL the vertical localization scale is only responsible for determining the local impact of different observational components that constitute the integrated observations. Meanwhile, the whole shape of WF is taken care of by the weighting terms in Eq. (3.13). By contrast, the SLVL does not have separate parameters that explicitly account for the shape of WFs. Once the WF peak is located, its localization scale is set to ensure its symmetric Gaussian-shape function covers the whole WFs. The most interesting case occur when assimilating observations with multiple WF peaks (EXP_1D_Stp). Under this scenario, the optimal SLVL requires a large localization scale (0.7) as it did in the previous experiment group EXP_1D_Sbp. This is because a 84 Figure 3.7: Time-averaged analysis RMSE for the streamfunction (unit: 106?2/?) in the Southern Hemisphere with the SLVL (blue) and MLVL (red) methods for EXP_1D_Snp (1st row), EXP_1D_Sbp (2nd row), and EXP_1D_Stp (3rd row). The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. The colored plus signs show which localization method leads to statistically smaller RMSE at the 5% significance level. A missing plus sign indicates no significant differences between two methods at this localization scale. 85 small scale will force the SLVL to neglect the information from its distant secondary WF peak. Only with a large localization scale can it extract the most information from both peaks and lead to the smallest RMSE for the streamfunction. It is seen that the optimal MLVL is still better than the SLVL. The analysis error for streamfunction at all three levels are about half of the error with the SLVL method. Besides, the MLVL method shows considerably lower errors than the SLVL method over a wide range of vertical localization scales (0.2 to 0.6).Consistent with our hypothesis, the MLVL method shows its advantage over the traditional SLVL method when assimilating observations with broad WFs or multiple WF peaks. 3.5.3 Comparison of the best MLVL and SLVL results Figure 3.8 shows the analysis RMSE for streamfunction evolves for the last 1.5 months when using the optimal SLVL and MLVL vertical localization scales for all three scenarios. For all cases, the CTRL experiment without assimilating nonlocal observations shows the most significant analysis error. Assimilating additional nonlocal observations without vertical localization reduces the error. However, performing SLVL or MLVL can further improve the streamfunction analysis, indicate the necessity of vertical localization. For all three scenarios, the MLVL is consistently better than the SLVL for almost all the assimilation cycles. Figure 3.9 summarizes the statistics for different model variables from the best SLVL and MLVL experiments. In general, the Northern Hemisphere presents smaller RMSE due to the assimilation of denser radiosonde observations. In each hemisphere, the 86 Figure 3.8: Global-averaged analysis RMSE for the streamfunction (unit: 106?2/?) by assimilating nonlocal observations with the optimal SLVL (blue) and MLVL (red) scales, and without the vertical localization (green) for the experiment EXP_1D_Snp (1st row), EXP_1D_Sbp (2nd row), and EXP_1D_Stp (3rd row). The control experiment without nonlocal observation assimilation is indicated by the gray line. 87 optimal MLVL contributes to greater analysis error reduction than the SLVL approach for both streamfunction and wind components. Among the three scenarios, the percentage of error reduction by switching from the SLVL to the MLVL is greatest when assimilating observations withmultipleWF peaks, while the error reduction isminor when assimilating observations with narrow WFs. Figure 3.9: Time-averaged analysis RMSE for the streamfunction (1st row, unit: 106?2/?), zonal wind (2nd row, unit: ?/?) and meridional wind (3rd row, unit: ?/?) with the optimal SLVL (blue) and MLVL (red) scales in the Southern and Northern Hemisphere for EXP_1D_Snp (1st column), EXP_1D_Sbp (2nd column), and EXP_1D_Stp (3rd column). Figure 3.10 examines the spatial locationswhereMLVL shows superior performance than the SLVL. The analysis performance over the continents in the Northern Hemisphere 88 does not differ much for two methods due to the dense radiosonde observations. For the experiment EXP_1D_Snp where the MLVL is expected to show a minor difference from the SLVL, several regions in the Southern Hemisphere and over the ocean exist that present smaller errors using the SLVL method. This is not the case for the experiment EXP_1D_Stp. Regions showing comparable performances are mainly limited to the continents in the Northern Hemisphere. Over the Southern Hemisphere, the MLVL is almost always better than the SLVL, responsible for smaller analysis errors for different model variables. 3.5.4 Sensitivity to the ensemble size and assigned weights in MLVL We then explore how the performance of two methods varies with the ensemble size. The former experiments under three scenarios (EXP_1D_Snp, EXP_1D_Sbp, EXP_1D_Stp) are repeated but now with the ensemble size of 10, 12, and 16. The RTPP relaxation coefficients also decrease from 0.8 to 0.7 when the ensemble size increases from 10 to 16. Figure 3.11 displays the analysis RMSE of the streamfunction at 500hPa with the different ensemble sizes. Several findings from the 8-member experiments are confirmed in those experiments with more ensemble members. For the SLVL approach, it always requires a large vertical localization scale (>0.7) when assimilating observations with broad WFs or multiple WF peaks, while it needs a small scale to localize observations with narrow WFs. Compared with this scenario, changing the vertical localization scale introduces large fluctuations to the analsysis RMSE when assimilating observations with 89 Figure 3.10: Time-averaged analysis RMSE difference for the (a-c) streamfunction (unit: 106?2/?) and (d-f) zonal wind (unit: ?/?) in EXP_1D_Snp. Panels (g-i) are the same as (a)-(f) except for the experiment EXP_1D_Stp. Negative (positive) difference values mean the MLVL (SLVL) leads to smaller RMSE. 90 broad WFs or multiple WF peaks, which indicates the laborious efforts that might be required for localizing these two kinds of nonlocal observations with the SLVL method for real-observation experiments. (a) 500_Snp (b) 500_Sbp 0.7 5 0.6 3 0.5 0.4 1.5 0.9 0.3 0.5 0.2 0.3 0.2 0.4 0.6 0.8 0.2 0.4 0.6 0.8 VL scale VL scale (c) 500_Stp 1.5 0.9 SLVL_m10 MLVL_m10 0.5 SLVL_m12 MLVL_m12 0.3 SLVL_m16 0.2 MLVL_m16 0.2 0.4 0.6 0.8 VL scale Figure 3.11: Time-averaged analysis RMSE for the 500hPa streamfunction (unit: 106?2/?) in the Southern Hemisphere with the SLVL (blue) and MLVL (red) methods for (a) EXP_1D_Snp, (b) EXP_1D_Sbp and (c) EXP_1D_Stp for ensemble sizes of 10, 12 and 16. The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. Compared with the SLVL approach, the best MLVL experiment has smaller RMSEs than the best SLVL for different ensemble sizes. The optimal localization scale for the 91 RMSE MLVL is generally smaller than the SLVL method. Another advantage of the MLVL over the SLVL method is that its results are less sensitive to the change of localization scales, which will makes its tuning easier in real-observation applications. Note that with a large ensemble size, the performance difference between these two methods becomes small when dealing with narrow-WF observations, while the performance gap is still clear when assimilating observations with multiple WF peaks. Besides those differences, the MLVL and SLVL methods share some similarities as well. First, their analysis RMSE with the optimal localization decreases as the ensemble size increases. Besides, the optimal vertical localization scale expands as the ensemble size increases. For example, the optimal scale for the SLVL method changes from 0.4 to 0.8 when then the ensemble size varies from 10 to 16 when assimilating observations with narrow WFs. In the MLVL formulation, another tuning parameter is the weight order ? in Eq.(3.13). Its impact (? = 1,2,4) on the final analysis for the multiple-WF-peak scenario (EXP_1D_Stp) is shown in Fig.3.12. Like ? = 1, MLVL with ? = 2,4 has their best results with the same localization scale of 0.3, though using high order of weights (e.g., ? = 4) leads to larger analysis error. Compared to the SLVL, the performance of the optimal MLVL scale with ? = 2,4 is still better the SLVL method. 3.5.5 Experiments with 3-dimensional integrated observations In previous sections, the observation operator for the nonlocal observations is for column-integrated quantities, which serves as the simplified model to mimic the current 92 Figure 3.12: Time-averaged analysis RMSE for the 500hPa streamfunction (unit: 106?2/?) in the Globe using the MLVL methods with the order of weights ? = 1,2,4 (used in Eq. (3.13)) for EXP_1D_Stp. radiance andwater path retrievals. Given the limited vertical levels of the NWPmodel, it is still possible in real-observation applications to assimilate column-integrated observations without vertical localization if using a large ensemble size is utilized [Zhang et al., 2016]. However, nonlocal observations also include those that are integrated signals in the 3-dimensional space (e.g., GPS excess phase [Ma et al., 2009]). Besides, more accurate simulation of radiance observations, especially those acquired with large sensor zenith angles, requires considering the slant path that signal actually transverses in 3- dimensional space [Bormann, 2017]. For 3-dimensional nonlocal observations, how to remove the spatial localization is not well-defined because these observations travel across several horizontal grids in addition to multiple vertical levels . But the SLVL and MLVL approaches are still well defined. It is of our interest to investigate how the SLVL and 93 TOA TOA O3 O3 E E O2 O2 O1 O1 M M S S (a) (b) Figure 3.13: Illustration of (a) SLVL and (b) MLVL for localizing the 3-dimensional integrated observations. This nonlocal observation has its components?1,?2,?3 located at 800hPa, 500hPa, and 200hPa and at three different horizontal grids. Its WF maxima is at ?2. M is the analysis grid to update. MLVL methods behave for this kind of observations. To mimic these 3-dimensional integrated observations, we rev?ise the observation operator of the 1-dimensional column observations as ? (r ,r ,r ) = 31 2 3 ?=1???? (r?) where now the vertical observational component at each level comes from a different horizontal grid (Fig.3.13). These 3-dimensional integrated observations are created together with radiosondes (Fig.3.5b), with the sameweights from three scenarios in Table 3.1 to simulate nonlocal observations with different WF shapes. Figure 3.14 displays the results for assimilating those 3-dimensional integrated observations. The optimal SLVLscale becomes larger (0.6)when assimilating observations with narrow WFs, compared to its optimal value of 0.4 when assimilating column- integrated observations. This is because those 3-dimensional observations now occupy a larger spatial dimension with the same weights assigned to three different observational nodes, which essentially makes the WF broader. For this scenario, the optimal SLVL 94 Figure 3.14: Time-averaged analysis RMSE for the 500hPa streamfunction (unit: 106?2/?) in the Northern Hemisphere with the SLVL (blue) and MLVL (red) methods for EXP_3D_Stp (1st row), EXP_3D_Sbp (2nd row), and EXP_3D_Stp (3rd row). The vertical localization scale that leads to the smallest RMSE for each localization method is indicated by the solid dots. The colored plus signs show which localization method leads to statistically smaller RMSE at the 5% significance level. A missing plus sign indicates no significant difference between two methods at this localization scale. 95 and MLVL scales show comparable performances. For observations with broad WF or multiple WF peaks, the SLVL method requires a large localization scale to achieve the best performance. It is noted that the analysis error with the SLVL method fluctuates significantly as the localization scale changes. By contrast, the optimal MLVL leads to more accurate analysis than the SLVL under these two scenarios. Its performance is also more stable than the SLVL when the localization scale varies, thanks to the additional weighting term in the MLVL formulation that explicitly takes care of the WF shapes. 3.6 Summary When assimilating nonlocal observations in the EnKF, the empirical SLVL approach localizes those observations at their WF peaks with Gaussian-shape functions. While this popular approach can handle well those observations with Gaussian-shape WFs (e.g., AMSU-A observations), it might encounter difficulty when localizing observations with asymmetric and non-Gaussian-shape WFs, which are typical for observations from the infrared hyperspectral sounders (e.g., AIRS, CrIS and IASI). In this study, an MLVL method is proposed to localize those observations better. Unlike the SLVL approach that only focuses on the largest WF peak, The MLVL method explicitly considers the entire shape of theWF. It then calculates the localization value at each grid based on the integrated effects from all its nearby observational components, thus better localizing observations with complex WF shapes. When assimilating point observations, the traditional SLVL approach can be regarded as a special case of the MLVL approach since the MLVL approach is mathematically equivalent to the SLVL formulation for point observations. 96 Application of theMLVL to ideal and real nonlocal observations shows that the localization functions generated by the MLVL method share similar features with those localization functions derived from other independent methods such as the Empirical Localization Function [Anderson and Lei, 2013]. Ideal experiments using amulti-layerQGmodel show that while theMLVLperforms similarly to the SLVL when assimilating column-integrated observations with narrow WFs, the MLVL clearly outperforms the SLVL in analysis error reduction when handling observation with broad WFs or multiple WF peaks. It is also straightforward to apply the MLVL method to those 3-dimensional integrated observations. Compared with the SLVL approach, the MLVL method requires little additional tuning efforts in exchange for better performance, which is desirable for real-observation applications. The MLVL approach can be further improved from two aspects. In this study, the normalized WF of radiance observations is proposed to serve as the weighting terms in the MLVL formulation, but this is not the only choice. The normalized WF quantifies the combined effects of temperature and gas concentrations on the vertical transmittance variations and hence is convenient for localizing all model variables. This shortcut is however suboptimal since different observation-state pair is likely to have different localization function. In addition, WF is not a quantity general enough to be applied to other types of nonlocal observations (e.g., altimetry observations) other than radiance. An alternative candidate serving the weighting term in the MLVL is the Jacobian of the observation operator, with which we can even create different localization functions for different observation-state pairs. It, however, brings new challenges. For example, though the state-of-the-art radiative transfer models (e.g., CRTM or RTTOV) provide routines 97 to calculate the Jacobian, it remains unclear how to generate MLVL functions for wind model states because wind profile (except those near the surface that affect the emissivity calculation) is not involved in the radiance calculation and thus having zero Jacobian. Besides, significant amounts of effort will be required for actual implementation if the tangent linear and adjoint models of the observation operator are unavailable. Another opportunity for improvement for the MLVL is whether we should use the Gaussian-shape function as the basis function in the MLVL for all variables. The Gaussian-shaped function is not an ideal model to localize wind observations, because the correlation associated with winds based on geostrophic balance is neither isotropic nor decreasing monotonically [Daley, 1993]. To further reduce the computational cost in real-observation applications, we can apply the MLVL method only to those observations with complex WFs (e.g., infrared hyperspectral sounders), given that the MLVL is most advantageous over the SLVL for these observations. Encouraged by the current results from the ideal experiments, we will further examine theMLVLmethod with those new questions in our future real-observation experiments. 98 Chapter 4: SystematicComparison ofCoupledDataAssimilationStrategies for a Coupled Quasi-geostrophic Atmosphere-Ocean Model 4.1 Introduction Interest in Coupled Data Assimilation (CDA) is accumulating given its ability to produce coherent and balanced coupled-state analyses [Zhang et al., 2007, Sugiura et al., 2008], and introduced weaker initialization shocks than separate atmosphere and ocean analysis when initializing the subsequent forecast [Mulholland et al., 2015]. Based on the definitions by Penny et al. [2017], The CDA strategies can be generally classified into four types: ? quasi-weakly coupled DA (Quasi-WCDA) where the analysis for each earth component is independently obtained through uncoupled models, ? weakly-coupled DA (WCDA) where the analysis for each earth component is still obtained independently though the trajectories of all the earth components are from the integration of a coupled model, ? strongly-coupled DA (SCDA) in which the error covariance among different earth components are explicitly included in the formulation so that the same full observation datasets can be assimilated into different earth components to generate 99 the coherent, balanced coupled-state analysis. ? quasi-strongly coupled DA (Quasi-SCDA), which is similar toWCDA in the sense that neither of them includes the cross-domain error covariance in the formulation. However, observations from one earth component in the Quasi-SCDA system can influence other earth components through pathways other than coupled-state error covariance. 4.1.1 EnKF-based CDA All CDA systems can be categorized into two classes based on formulation of the underlying assimilation methods: the variational- or EnKF-based assimilation systems. Different CDAS strategies have been examined with the ensemble assimilation systems given the intrinsic simple structure of the EnKF. As early as 2007, Zhang et al. [2007] developed an EAKF-based WCDA system for the GFDL?s fully coupled climate model CM2. Through OSSE they found that the EnKF analysis could synchronize toward the nature run through WCDA. The EAKF analyses also correctly recover the ocean heat content variability and trend globally. Singleton [2011] compared WC and SC LETKF with the SC 4D-Var for a coupled Lorenz-63 model. They concluded that SC 4D-Var shows better performance when utilizing a long assimilation window. By adopting the Quasi-outer-loop and Running-in-place [Kalnay et al., 2007b,a, Yang et al., 2012], the SC ETKF could generate comparable or better analysis than the SC 4D-Var. Liu et al. [2013] examined the impact of different covariance couplings on the EnKF analysis for a simple coupled climate model based on the atmosphere Lorenz model 100 [Lorenz, 1963] and an ENSO oscillator model [Jin, 1997]. They found that SCDA leads to smaller analysis error than the WCDA and other one-way coupled CDA. Assimilating the high-frequency synoptic atmosphere observations into the coupled model improves the atmospheric variability and reduces the error in the surface forcing for the ocean model, which in turn improves the ocean analysis and forecast. Besides, assimilating the atmosphere observations directly into the ocean improves the ocean analysis through the cross-domain background error covariance over midlatitude, where the ocean variability is mainly driven by the atmosphere. Over tropics where the ocean drives atmosphere, ocean observations play a more important role than atmosphere observations in constraining the coupled model. Han et al. [2013] coupled Lorenz-63 model with a pycnocline ocean model, and then conducted ensemble CDA experiments with this coupled model. They showed SCDA is superior to WCDA only if using a large ensemble size, which is essential for an accurate coupled-state error covariance estimation. Besides, they also found that it is difficult to constrain the fast mode (i.e., atmosphere) by assimilating slow mode (i.e., ocean) observations. Sluka et al. [2016] coupled the LETKF with the intermediate coupled general circulation model SPEEY/NEMO. They explored the benefits of SCDA over WCDA by assimilating atmosphere observations into the ocean model through their SC LETKF. Their results show SCDA leads to more accurate ocean analysis in additional to the improved near-surface atmosphere variables such as temperature, humidity and zonal wind. Following this study, Sluka [2018] developed the coupled assimilation system CFSv2-LETKF, and conduct real-observation SCDA experiments with 50 members to 101 assimilate marine surface reports into ocean and atmosphere. They found that SCDA improves the observation fitting for the ocean surface temperature over the globle, with the maximum RMS difference reduction of 13.0% in the Northern Hemisphere. A further investigation showed that while the SCDA improves the observation fitting over tropics and the upper ocean layer in the northern hemisphere, SCDA degrades the observation fitting under 20-meter depth in the northern hemisphere, which they attributed to the spurious correlation introduced by small ensemble size. To mitigate the inferior performance of SCDA due to small ensemble size [Liu et al., 2013, Han et al., 2013], Yoshida and Kalnay [2018] proposed a correlation cutoff method to determine which observations should be assimilated by the SCDA EnKF. They calculated the correlation strength between different observation-state pairs from offline experiments, and only assimilated observations that show strong error correlation with each state variable. Their experiment with the coupled Lorenz model [Pe?a and Kalnay, 2004] showed that the ensemble analysis guided by their correlation cutoff method leads to improved coupled-state analyses than both lean WCDA and SCDA even with a small ensemble size. Yoshida [2019] further proved the effectiveness of the correlation cutoff method by applying it to an intermediate general circulation model FOAM. In their latest development, a neural network is trained from offline experiments to learn the localization pattern for different state-observation pairs. This well-trained neural network is then used in independent SC EnKF experiments to automatically predict the localization values for different observation-state pairs. Compared with the look-up table approach, their neural network approach significantly simplified the procedure for preparing the localization function required by their correlation cutoff method, which is more feasible for real- 102 observation experiments. Lu et al. [2015a,b] developed a leading averaged coupled covariance method for the SCDA. Like Tardif et al. [2014, 2015], they assimilated time-averaged atmosphere observations to constrain the slow mode (i.e., ocean) of the coupled model. They also assimilated time lagged atmosphere observations when the ensemble correlation is the largest between the atmosphere and ocean state variables in time. Experiments with the simple [Lu et al., 2015a]and state-of-the-art coupled general circulation model [Lu et al., 2015b] showed that SCDA combined with their method outperforms the WCDA and SCDA. 4.1.2 Variational-based CDA While many ensemble CDA studies have explored the benefits of SCDA overWCDA and developed new techniques to improve the SCDA, such exploration is less frequent for the variational CDA, given the intrinsic complexity associated with the variational method. For example, the tangent linear and adjoint models are two compulsory components for the incremental 4D-Var formulation. Their coding developments for the uncoupledmodels are time-consuming and require tremendous human resources. The coupler included in the coupled model imposes additional technical difficulty constructing those two components. Besides, it is not uncommon for atmosphere and ocean models to adopt different meshes within the same coupled model. Due to these mesh differences, atmosphere and ocean analysis systems adopt different approaches to impose their balance operator and perform spatial filtering. For example, atmosphere analysis systems usually adopt the spectral 103 transform or the recursive filter since many atmospheric models adopt the Gaussian (e.g., GFS) or the equidistant grids. Meanwhile, ocean analysis systems tend to utilize the Laplacian operator for its background error modeling, given its ease of dealing with topographies. Therefore, the cross-domain background error modeling through different meshes is another challenging issue for variational CDA. Given those challenges, most variational CDAsystems adopted theWCDAapproach. For example, the Climate Forecast System Reanalysis [CFSR, Saha et al., 2010, 2014] developed at NOAANational Centers for Environmental Prediction (NCEP) generated the forecast by utilizing the coupled model that includes the atmosphere GFS and the ocean MOM4 models while producing individual 3D-Var based analysis for its atmosphere and ocean components. Lea et al. [2015] presented the first implementation of weakly-coupled atmosphere-ocean-land assimilation system by Met Office. In that system, the reference trajectory is first produced by a coupled model, followed by a separate analysis done for each earth component with different methods (e.g., 4D-Var for the atmosphere and 3D- FGAT for the ocean). The analysis increment is then added back to the reference trajectory to initialize the subsequent assimilation cycles. Their results show that this preliminary WCDA system without tuning generates comparable analysis as their existing operational Quasi-WCDA system. Sugiura et al. [2008] developed the tangent linear and adjoint model for a fully- coupled general circulation model so that implicit cross-domain error covariance could build gradually through the integration of tangent linear and adjoint models for the full coupled model. Besides, initial ocean conditions and the bulk adjustment factors of the momentum fluxes, sensible heat, and latent heat are used as the control variables in their 104 system. The only missing component of this system from an SCDA variational one is its lack of coupled-state error covariance. Results show their system exhibits lower model bias thanks to their inclusion of the factors controlling surface flux as the additional control variables. Besides, this new system also shows better forecast skills on events such as El Nino, the Indian Ocean Dipole, and the Asian summer monsoon. Though Sugiura et al. [2008] approach showed many benefits, the development of the tangent linear and adjoint models for the fully-coupled model imposes too many technical challenges that very few CDA systems have adopted this approach. Perhaps themost special variationalCDAsystemdesign is from theCERAdeveloped by ECMWF [Laloyaux et al., 2016, 2018]. The quasi-SCDA system CERA first used a coupledmodel to generate the reference trajectory within each outer loop. It then generates incremental 4D-Var analysis for the atmosphere and 3D-FGAT analysis for the ocean by utilizing the latest forcing from the reference trajectory in the same outer loop. Though CERAdoes not explicitly utilize the coupled-state error covariance, its outer-loop-coupling approach allows additional forcing updates between the atmosphere and ocean during the optimization procedure, which does not exist for WCDA systems. It remains unclear how much improvement this outer-loop-coupling approach can bring compared with SC variational systems. Besides studies based on those operational variational CDA systems, Smith et al. [2015, 2017, 2018, 2020] systemically compared the response of variational systems to different CDA strategies with a single-column coupled model. Smith et al. [2015] showed that both the WCDA and SCDA generate more balanced initial analysis fields that introduce less intense initial shock and thus lead to better forecast than the UCDA. By 105 studying the structure of the coupled-state error covariance calculated from an ensemble of 4D-Var analysis, Smith et al. [2017] concluded that the structure and strength of cross- domain components show significant diurnal and seasonal variabilities. Smith et al. [2018] introduced two reconditioning methods to solve the problem caused by the large condition number of the coupled-state background error covariance, which lets the minimization algorithm inside the variational system fail to converge. Through single-observation experiments, Smith et al. [2020] demonstrated the necessity to explicitly include the cross-domain error covariance for the SC 4D-Var, since the implicit cross-domain error covariance introduced by the tangent linear and adjoint model is not enough to generate a balanced coupled-state analysis. 4.1.3 Objectives of this study We can see that the SCDA for the variational method has not been explored as thoroughly as those for the ensemble method. Besides, very few comprehensive studies [e.g, Singleton, 2011] have compared the performance of these twomethods under different CDA strategies, though such comparison studies have been extensively conducted for the uncoupled model [Singleton, 2011, Yang et al., 2009, Lorenc, 2003, Lorenc et al., 2015] to understand the merits of each method. Besides the interest in seeing performance differences between variational and ensemble methods on CDA, it is also an interesting question how well CERA?s outer- loop-coupling approach is compared with the SC approach that explicitly utilizes the cross-domain error covariance. If CERA is suboptimal, under which circumstance does 106 the suboptimality of CERA occur? In this study, we developed a CDA testbed, MAOOAM-CDAS, which includes variational and ensemble methods with different CDA strategies for the coupled quasi- geostrophic atmosphere-oceanmodelMAOOAM.WithMAOOAM-CDAS,wewill pursue the answers to the following three major questions: 1. How do the variational and ensemble methods for the coupled system respond to different CDA strategies (i.e., UC, WC, SC)? 2. Is there any substantial performance difference between the variational and ensemble methods for the coupled system? If so, then under which circumstance does the difference occur? 3. How is the CERA?s outer-loop-coupling approach compared with the WCDA and SCDA for variational systems? The structure of this chapter is organized as follows. Section 4.2 reviews the formulation of different data assimilation methods used in this study. Section 4.3 describe the quasi-geostrophic coupled model MAOOAM and the coupled assimilation system MAOOAM-CDAS that we developed. Experiment designs are also discussed in this section. Section 4.4 systematically compares the results of different assimilation methods with varying CDA strategies and observation networks. Section 4.5 gives the discussion and summary of this study. 107 4.2 Method: 3D-Var, 4D-Var and ETKF We constructed a CDA test bed that includes both variational and ensemble methods under different CDA strategies. Here we briefly review the principles of the 3D-Var, 4D- Var, and ETKF. 4.2.1 3D-Var Given the background model state x? and the observations y? at time ??, we compute the analysis x = x? that minimizes the cost functional, ? (x) 1= (x?x?)?B?1(x?x?) + 1 (y? ? ?(x))?R?1(y? ? ?(x)) (4.1) 2 2 where B and R are the background and observation error covariance matrices and ?(. . . ) is the observation operator. By linearizing the observations operator ?(x) ? ?(x?) +H(x? x?), and introducing a change of variables ?x = x?x? and d = y? ? ?(x?), we can convert Eq. (4.1) into the incremental form ? ( 1?x) = ?x?B?1 x+ 1? (d?H?x)?R?1(d?H?x) (4.2) 2 2 Direct calculation of ? (?x) in this form requires computing the inverse of B. This presents challenges for high dimensional systems and particular difficulties for coupled atmosphere- ocean dynamics for which the condition number of B can be as large as 1016 due to the wide range of resolved scales. 108 We find x? using an iterative minimization algorithm. The convergence rate is related to the conditioning of the Hessian matrix with respect to the control variable ?x, S = B?1 +H?R?1?x H (4.3) To address this conditioning problem, we perform the control variables transform (CVT) as described by Bannister [2008a,b], L?u = ?x (4.4) which converts Eq. (4.3) to ? (?u) 1 1= ?u??u+ (d?HL?u)?R?1(d?HL?u) (4.5) 2 2 with the gradient, ???u = ?u?L?H?R?1(d?HL?u) (4.6) The Hessian matrix with respect to the new control variable u is S?u = I+ (B1/2)?H?R?1HB1/2 (4.7) with an upper bound for the condition number, ???? (S ?1/2 ? ?1/2??? ) = 1+ ||R HBH R | |? (4.8) 109 which is generally much smaller than cond(S) [Haben, 2011]. The use of a CVT is common in operational system [Bannister, 2008a,b]. 4.2.2 Strong-constraint incremental 4D-Var The cost function of the strong-constraint 4D-Var with initial condition x0 is (x ) 1 (x ?x?)?B?1(x ?x?) + 1 ??? ? = (y? ? ?(? (x )))?R?1 ?0 0 0 0 0 ? ? 0 ? (y? ? ?(?? (x0))) (4.9)2 2 ?=0 where?? (x0) is themodel forecast at time ??. We adopt the incremental approach [Courtier et al., 1994] to solve Eq. (4.9) where we seek a series of states x( ?)0 , ? = 0,1,2, . . . , ? ? . For each x( ?)0 , we determine the increment x ( ?) ? 0 that minimizes ( ?) [( x ) 1= x( ?) ? (x? ?x( ?) ]? ?1 [ ]? ? 0 ??? [0 0 0 ) B ]?x ( ?) ? ( ?) 2 0 ?[ (x0 ?x0 )? ] (4.10) + 1 d( ?) ?H M | ?x( ?) ?R?1 d( ?) ?H M | ( ?) 2 ? ? ? x( ?) 0 ? ? ? ? x( ?) ?x0 ?=0 0 0 where d( ?) = y? ? ( ( ?)? ?? (x0 )) and M? |x( ?) is the tangent linear model of ( ?) ?? (x0 ). We? ? 0 set the next state x( ?+1) = x( ?)0 0 + ?x ? 0 and repeat the iteration in an outer loop, until the algorithm reaches a minimum error threshold. As for 3D-Var, we apply the CVT so that the cost function becomes ( u( ?)) 1 ( )? ( ) ? ? 0 = ???u ( ?) (0 ? u ?,( ?) ( ?) ?,( ?) ? 0 ?u0 ? ?)u2 0? ( ) (4.11) + 1 ( ? d ?) ?H M | L u( ?) ?1 ( ?)? ? x( ?) ? 0 R? d ?H M | L u ( ?) ? 2 ? ? ? ( ?)0 0 ? x0 0 ?= 110 where ( ) L u?,( ?)? 0 = x ? ( ?) 0 ?x0 (4.12) and the gradient of ? is ( ) ? ( ) ? = u( ?) ( ) ?? ? u?, ? +L? M? | H? ?1 ( ?) ( ?)? u( ?) ? 0 ? 0 ? x( ?)? ? R? d ?H?M? ? |x( ?)L?u0 (4.13)0 0 0 ?=0 4.2.3 ETKF Ensemble forecasts are used to form a dynamic estimate of the background error covariance matrix for the ETKF [Bishop et al., 2001, Hunt et al., 2007], P? 1= ? ? ? ? ? X (X ) (4.14)1 where each column of X? is an ensemble perturbation and ? is the total number of ensemble members. Assuming that the analysis error covariance matrix is of the same form, we seek a transform matrix W applied to X? so that X? = X?W. Given that P? = [I+PbHTR?1H]?1P? [Hunt et al., 2007, equation (11)], we have P? 1 [ X?WW? (X?)? I+ 1 ]?1 = = ?? ? X (X ?)?H?R?1H 1? X ? (X?)? (4.15) ? 1 ? 1 ? 1 Further letting Y? = HX?, 1 [ ]? ? X I+ 1 (Y?)?R?1? Y ? WW? (Xb)? 1= ?? X (X ?)? (4.16) ? 1 ? 1 ? 1 111 We can simplify so that [ ] I+ 1 (Y?)?R?1Y?? WW ? = I (4.17) ? 1 Because (Y?)?R?1Y? is symmetric and positive semidefinite with at least one eigenvalue of 0, I+ 1 (Y)?R?1?1 Y ? is symmetric and positive definite and so is invertible, ? [ ]?1 WW? = I+ 1? (Y) ?R?1Y? (4.18) ? 1 We compute the symmetric square root, [ W I+ 1 ] = (Y)?R?1 ?1/2 ? ? Y (4.19)? 1 and then update the ensemble perturbations as X? =X?W and the analysis ensemble mean as [ ]?1 x?? = x?? +X? (? ?1)I+ (Y)?R?1Y? (Y?)?R?1(y? ? ?(x??)) (4.20) Considering the small size of the model under investigation, we do not apply localization in this study. 112 4.3 Experiment design 4.3.1 Coupled atmosphere-ocean model The coupled quasi-geostrophic atmosphere-oceanmodelMAOOAM [DeCruz et al., 2016] is adopted in the study as the forward model upon which different assimilation systems are constructed. MAOOAM couples a two-layer quasi-geostrophic atmosphere dynamically and thermo-dynamically with a shallow-water ocean layer, mimicking the coupled systems at midlatitudes [Vannitsem and Lucarini, 2016]. All the experiments in this study adopt the same low-order configuration as in Vannitsem and Lucarini [2016], with the spectral truncation at the 2? 2 atmosphere and 2? 4 ocean. The state vectors with this low-order configuration are represented by 36 nondimensionalized coefficients of different spectral modes, with the first 20 elements representing the atmospheric states and the latter 16 for the oceanic states. Besides the forward model, MAOOAM also includes analytical tangent linear and adjoint models for this coupled model, which brings tremendous convenience when building the 4D-Var system for MAOOAM. Unlike real coupled atmosphere-ocean models that utilize couplers to exchange the surface fluxes between the atmosphere and ocean components at every coupling step, MAOOAM includes no coupler, and it exchanges the cross-domain information directly at every model integration step, which can be written formally as ???? ? ? ??x(?+1)???? ??? ? ? ?x (?)??? ? ?? ?? =M? ( ? ?) (4.21)x(?+1) ? (?)?? ??x? ?? 113 Where x(?)? and x(?)? are respectively the atmospheric and oceanic states at time step ?, and M? is the nonlinear forward operator. The lack of the ability for the uncoupled MAOOAM simulation with prescribed forcing results in technical difficulty emulating the uncoupled data assimilation scenario, which requires integrating the two uncoupled models with the given forcing. We adopt the following approach to emulate the uncoupled scenario in this study. To get the atmospheric states with the uncoupled atmosphere model x(?+1)? , we integrate the following model ???? ? ? ????? x(?+1)???? ?? ? ?x (?) ? ? =M ( ???? ? ?? ? x?(?+1)? ? ?) (4.22)x?(?)?? ? ?? where x?(?)? is the prescribed ocean state that is kept unchanged unless reaching the stage for forcing update. Similarly, we can get the uncoupled ocean state x(?+1)? with the prescribed atmosphere x?(?)? . The tangent linear and adjoint model are then developed by setting the non-prescribed state variables as active variables. 4.3.2 Data Assimilation settings ACDA testbedMAOOAM-CDAS is developed for themodelMAOOAM. It includes 3D-Var, 4D-Var and ETKF assimilation systems with different coupling strategies (i.e., UCDA, WCDA, and SCDA). Note that when no integrated observations for the coupled states are assimilated, the cost function of the coupled states for the UC andWC variational systems can be decoupled into two individual cost functions, with one for atmosphere and the other for ocean. The actual implementation of UC andWC variational systems follows 114 this approach. For the variational system, the coupled-state background error covariance matrix is calculated based on ensemble perturbations from a 40-member SC ETKF over a total of 1?105 MTUs, with the first half discarded as spin-up, B0 = ? 1 ? ? X (X ?)? ? (4.23) ? 1 A scalar multiplier ? is applied to empirically tune the amplitude of the background error covariance so that it minimizes the analysis error for the variational systems. Several studies [Singleton, 2011, Laloyaux et al., 2016, Smith et al., 2017] reported that such tuning is necessary to achieve better analysis quality of the variational systems. The final coupled-state error covariance matrix used in the variational system can be explicitly expressed as ?? ??? ??B B ? ?? ??? B = ?B0 = ?? ??? (4.24)B??? B???? The UC and WC variational system only uses the single-domain background error covariance matrix (i.e., B??, and B??) to calculate individual atmospheric and oceanic analysis while the SC variational system utilizes the coupled-stateB. All those background error covariance matrix are re-calculated once the observation network is changed. Besides 3D-Var and 4D-Var, we also developed a variational system similar to the ECMWF CERA, which utilizes 4D-Var for the atmosphere and 3D-FGAT for the ocean while the coupled model is integrated to generate reference trajectory within each outer loop. With this system, we will also be able to explore how the outer-loop-coupling approach is compared with those SC variational systems. 115 The nature run is a model integration with the total length of 1 ? 105 MTUs. Observations are created by adding the nature run with the Gaussian random noise, of which magnitude is 10% of the natural variability for each state variable. The data assimilation is conducted every 2.5MTUs (~6 hours). We adopt two observation networks: observations for both atmosphere and ocean, and observations for atmosphere only. The initial conditions for 40-member ETKF are randomly drawn from the model trajectory before the nature run. Their ensemble mean is used as the initial conditions for all the variational systems. We also investigate how the flux errors influence the quality of the analysis in the UC assimilation by updating the prescribed states from the truth with different frequencies (6 hours to 3 days). 4.4 Results 4.4.1 Background error covariance and correlation matrix for the coupled states Before examining SCDA for variational methods, we must first calculate a coupled- state background error covariance B. To do this, we use a 40-member ETKF with a 6-hour analysis cycle to build the statistics for this matrix B, and the result is shown in Figure 4.1(a). We can see that there is greater variability in the short-range forecast errors in the atmospheric components than its oceanic counterpart. The magnitude of the atmospheric component of B is significantly larger than the ocean component, or the cross-domain ocean-atmosphere blocks (i.e., off-diagonal blocks of B in Figure 4.1). The matrix B and its corresponding Hessian matrix S consequently have large condition numbers on the 116 cond=7.8e+15 -8?10 cond=190 3 1 10 2 10 0.5 20 1 0 20 0 -0.5 30 30 -1 10 20 30 10 20 30 Figure 4.1: The climatological background covariance matrix B (left) and its corresponding correlation matrix C (right). The atmospheric (1-20) and oceanic (21- 36) state vectors are separated by the black lines. order of 1016. This large condition number leads to the convergence issue and breaks down the minimization algorithms that are commonly used for variational assimilation systems (e.g., Limited-memory BFGS algorithm). We have made several attempts to fix this issue, including the reconditioning method for B [Smith et al., 2018] and the control variable transform (CVT) technique [Bannister, 2008a,b]. We found that CVT is effective in solving this problem. After applying CVT, the condition number of the Hessian matrix S??? for the transformed control variables decreases to 1.0003, and the minimization algorithm can then work successfully without the convergence issue. A further decomposition ofB=D1/2CD1/2 into the diagonal error standard deviation matrix D1/2 and the error correlation matrix C uncovers the error coupling between the atmosphere and ocean states. In Figure 4.1(b), nonzero correlations are observed for the cross-domain blocks in matrix C, with their magnitudes similar to those elements from the diagonal atmosphere or ocean error blocks. Those nonzero correlations imply that with SCDA, the ocean (atmosphere) state can be adjusted by the atmospheric (oceanic) 117 observations through the information transfer by the matrix B. Meanwhile, the WCDA cannot take this advantage because those off-diagonal blocks are assumed to be zero in a WCDA system. 4.4.2 Comparison of CDA strategies for 3D-Var We first compare the performance of 3D-Var with different CDA strategies under a full observing network. For the UCDA experiment, the uncoupled forward models are driven by three kinds of surface forcing: perfect forcing updated every ?? = 10 MTU (~1 day), perfect forcing updated every 30 MTU (~3 days), and constant forcing using the climatological mean. The varying forcing interval allows us to examine the impact of increasing error in the flux fields. The analysis root-mean-square error (RMSE) for the last 103 MTU (2? 103 steps, ~100 days) out of 105 MTU is shown in Figure 4.2. All experiments produce similar analysis RMSE for the atmosphere, except the uncoupled atmosphere forced by the climatological mean ocean, which exhibits slightly larger RMSE with itsmagnitude of the same order as theRMSE from the SCDAandWCDAexperiments. For the forced ocean model scenarios, improved accuracy can be obtained for the ocean analyses with a more frequent update of the atmospheric forcing. However, the smallest analysis RMSE from the UCDA experiment is still one order larger than the RMSE from the SCDA and WCDA experiments. We then compare the temporal averaged analysis RMSE for the last 4? 104 MTU (~11.0 years, Table 4.1). In general, the RMSE of the WCDA or SCDA systems are smaller than the RMSE of the forced UCDA systems. For both atmosphere and ocean, the 118 Table 4.1: Average analysis RMSE (Units: 10?5) of 3D-Var Over the last 4? 104??? (~11.0 Years) and of the ETKF after 2,500 MTU (for the remaining ~26.7 years) 3D-Var ETKF UCDA WCDA SCDA UCDA WCDA SCDA Atmosphere 116.0 116.0 115.9 41.37 42.10 40.72 Ocean 28.72 5.516 4.915 4.510 2.784 1.318 SCDA 3D-Var achieves the minimum RMSE, slightly improved against the RMSE for the WCDA 3D-Var. Table 4.1 also shows that the atmosphere analyses from the WCDA and SCDA are only marginally improved over that from the UCDA, while the ocean analysis RMSE from these two systems are one order smaller than that from the UCDA, as shown in Figure 4.2(b). An additional benefit of applying CDA with 3D-Var is that the coupled system experiences amore rapid spin-up than the uncoupled scenarios (Figure 4.3). Approximately 16 days are required for the uncoupled assimilation system to stabilize the atmosphere RMSE,while this spin-up period extends to ~16.4 years for the uncoupled ocean assimilation system, even when the ocean model is forced daily with true atmospheric forcing. The WCDA and SCDA systems require only ~4 days for the atmosphere and ~6.8 years for the ocean to achieve a comparable spin-up, with the SCDA RMSE slightly smaller than the WCDA RMSE during the spin-up period. 4.4.3 Comparison of CDA strategies for ETKF We next investigate the ETKF under different coupling scenarios. We use an ensemble size of 40, which is larger than the degrees of freedom of the coupled model. Given this ensemble size, no covariance inflation is applied for the ETKF experiments 119 -3 ?10 4 WC SC UC_clim UC_3days UC_1day 3.5 3 2.5 2 1.5 1 0.5 0 1.98 1.984 1.988 1.992 1.996 2 Time step 5?10 -2 10 -3 10 -4 10 -5 10 1.9 1.92 1.94 1.96 1.98 2 Time step 5?10 Figure 4.2: The analysis RMSEof the atmosphere (top) and ocean (bottom) for the strongly coupled (gray), weakly coupled (red), and uncoupled 3D-Var with forcing provided by the climatological mean (blue), or updated forcing from truth every 30 MTU (~3 days, green) and every 10 MTU (~1 day, orange). Results are shown for the last 1,000 MTU for the atmosphere and last 5,000 MTU for the ocean. The WC RMSE for the atmosphere is nearly identical to the UC_1day RMSE. 120 Analysis RMS error Analysis RMS error Figure 4.3: The analysis RMSE for the atmosphere (top) and ocean (bottom) during the spin-up period for strongly-coupled 3D-Var (red) and weakly-coupled 3D-Var (gray). Note that 25 time steps correspond to ~1 days. For the atmosphere, the RMSE from the uncoupled analysis forced by the ocean climatology (blue), or with the ocean states updated every day (orange) or every 3 days (green) are also shown. 121 0 10 freerun WC SC UC_3days UC_1day UC_6hours -1 10 -2 10 -3 10 -4 10 -5 10 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Time step 5?10 -3 10 -4 10 -5 10 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Time step 5?10 Figure 4.4: The analysis RMSE of the atmosphere (top) and ocean (bottom) for the weakly coupled (red), strongly coupled (gray), and uncoupled ETKF with updated forcing from truth every 30 MTU ( 3 days, green), 10 MTU (~1 day, orange), and 2.5 MTU (~6 hr, purple). A 141-step moving average filter is applied. For the atmosphere, RMSE for the WC and SC cases follows closely with the UC_6hours case but exhibits lower RMSE in the ocean. 122 Analysis RMS error Analysis RMS error since the experiment does not encounter filter divergence. For the forced model cases, perfect forcing is updated every ?? = 2.5, 10, or 30 MTU (or about 6 hours, 1 day, and 3 days). Figure 4.4 shows the temporal-evolving analysis RMSE with different CDA methods. Unlike 3D-Var, the forced atmosphere ETKF is sensitive to the frequency of the forcing update: the UC ETKF experiences filter divergence with forcing intervals of 1 day or greater (Figure 4.4). This divergence problem of ETKF implies that ETKF is more vulnerable to surface flux errors than 3D-Var. We note that inflating the atmospheric background error covariance or perturbing the oceanic forcing at each analysis cycle helped to reduce the occurrence of divergence caused by errors in the surface forcing supplied to the forced atmosphere. The ETKF converges when using the SCDA and WCDA, and SCDA consistently produces smaller RMSE in the oceanic component than WCDA. Such consistent RMSE reduction due to the switch from WCDA to SCDA is not present when using 3D-Var. For the atmosphere analysis, the CDA ETKF systems generate atmospheric analyses with its RMSE similar to the uncoupled ETKF with frequently updated ((~6 hours) perfect forcing. Smith et al. [2015] found that for a 1-D simplified coupled column model using incremental 4D-Var, WCDA was usually comparable to UCDA experiments in which the atmosphere and ocean models were forced using accurate surface fluxes. Here we find a similar result and add that the stability is degraded in the forced system only when reducing the forcing accuracy (e.g., by extending the forcing time window). Figure 4.5 compares the spin-up speed among SC 3D-Var and WC/SC ETKF. The three CDA strategies require a similar spin-up period (~4 days) to achieve the optimal atmosphere analysis, while the RMSE fromWC/SC ETKF is smaller than the RMSE from 123 SC 3D-Var. The ocean analysis is a completely different story: WC/SC ETKF experience a spin-up period (~40 days) much shorter than the one (~5.5 years) by the SC 3D-Var due to their flow-dependent background error covariance. Table 4.1 lists the average analysis RMSE of 3D-Var and ETKF under different scenarios. The forced UC ocean ETKF provides lower accuracy analyses than the WCDA and SCDA ETKF even when the forcing is updated with perfect atmospheric conditions as frequently as ~6 hours. Unlike 3D-Var, where the ocean RMSE from SCDA is similar to that from the WCDA after the spin-up period, the SC ETKF consistently produces higher-accuracy ocean analyses with RMSE about half of the magnitude of the WC ETKF even after the spin-up period. 4.4.4 Comparison of SCDA methods for full observation network Having explored the transition from a forced to fully coupled forecast model using three CDA strategies, we now compare various DA methods strictly using SCDA. We first focus on the scenario with a full observing network (Figure 4.6). We use a 40-member ETKF without inflation. For 3D-Var and 4D-Var, the background covariance matrix B has a significant impact on the accuracy of final analysis. Through empirical tuning, we find that inflating the normalized background error covariance matrix B with a constant factor of 10 (i.e., B3???? = 10?B????) and 2 (i.e., B4???? = 2?B????) lead to the most accurate 3D-Var and 4D-Var analysis respectively. Note that the finding that 4D-Var can utilize a more accurate background error covariance matrix than 3D-Var so that a smaller inflation factor of B is needed for 4D-Var was also reported for a low-order coupled Lorenz system [Singleton, 2011]. 124 Figure 4.5: The analysis RMSE for the atmosphere (top) and ocean (bottom) during the spin-up period for strongly-coupled 3D-Var (blue), weakly-coupled ETKF (gray), and strongly-coupled ETKF (red). Note that 25 time steps correspond to ~1 days. For the atmosphere, all three methods finish spin-up after ~250 time steps( ~10 days.). For the ocean, it takes strongly-coupled 3D-Var ~5.5 years (~5?104 time steps) to finish spin-up while both strongly-coupled and weakly-coupled ETKF finishes the spin-up within 40 days (~0.1?104 time steps). 125 -3 ?10 4 3D-Var: 1.15e-03 4D-Var: 4.66e-04 3.5 ETKFm40: 4.18e-04 ETKFm20: 3.54e-04 3 CERA: 4.67e-04 2.5 2 1.5 1 0.5 0 1.8 1.84 1.88 1.92 1.96 2 Time step 5?10 -4 ?10 1.4 3D-Var: 4.897e-05 4D-Var: 2.240e-05 1.2 ETKFm40:1.464e-05 ETKFm20: 2.150e-05 1 CERA: 2.349e-05 0.8 0.6 0.4 0.2 0 1.8 1.84 1.88 1.92 1.96 2 Time step 5?10 Figure 4.6: The analysis RMSE when using a full-coverage observing system for the atmosphere (top) and ocean (bottom) with SCDA 3D-Var (green), 4D-Var (blue), 4D- Var/3DFGAT CERA (cyan dash), 40-member ETKF (red), and 20-member ETKF with constant inflation of 1.01 (gray). A 61-step running average is applied for both panels. The analysis RMSE for the last 5?104 MTU (~13.7 years) is shown next to the label for each method. For the atmosphere, the RMSEs for 4D-Var and CERA are largely overlapped. 126 Analysis RMS error Analysis RMS error For all DA methods, the analysis RMSE for the atmosphere is 1 order of magnitude greater than the RMSE for the ocean. The 3D-Var produces the largest RMSE among all three DA methods. While 4D-Var and ETKF shares comparable atmospheric RMSE, the RMSE of the ETKF for the ocean component is generally smaller than 4D-Var when the analysis window is 6 hours. The performance of 4D-Var can be further improved by increasing the length of assimilation windows [Smith et al., 2015, Kalnay et al., 2007b]. We also conducted 4D-Var experiments with a 12-hour or 24-hour assimilation window, and results show that 4D-Var with 24-hours can further reduce the analysis RMSE, with its magnitude similar to the 40-member ETKF. For real observation experiments, the ensemble size is usually significantly smaller than the degrees of freedom of the forward model. Here we also tested the performance of 20-member SC ETKF while applying constant multiplicative inflation with 1.01. The results show that the full coupled-state RMSE for the 20-member ETKF is comparable to the RMSE of the 4D-Var and 40-member ETKF. An assimilation system like the ECMWF CERA configuration [Laloyaux et al., 2016], which integrates the coupled model in two outer loops and uses an incremental 4D-Var analysis for the atmosphere and a 3D-Var using First Guess at the Appropriate Time (FGAT) analysis for the ocean, is also developed in this study. Its results are shown in Figure 4.6 as well. Interestingly, this CERA-like approach reaches accuracy similar to the SCDA 4D-Var for both atmosphere and ocean analyses under a full observation network. Though both 3D-Var and 3D-FGAT do not utilize the adjoint of ocean models, the CERA-like ocean 3D-FGAT analysis shows an analysis RMSE comparable to 4D-Var, smaller than 3D-Var. 127 4.4.5 Comparison of SCDAmethods when only assimilating atmosphere observations Following Sluka et al. [2016], we next consider a reduced observing network and evaluate the performance of SCDA by assimilating only atmospheric observations. The ETKF and 4D-Var under this scenario produce similar RMSE for the atmosphere, both smaller than the 3D-Var (Figure 4.7(a)). However, due to the use of a static climatological error covariance, 3D-Var, 4D-Var, and variational-based CERA all failed to stabilize the ocean within the length of the experiments (~27.4 years). In comparison, after about 4? 104 MTU (~10 years) the EKTF does stabilize (Figure 4.7(b)). Besides, unlike the atmosphere analysis where 4D-Var and CERA generate comparable RMSE lower than 3D- Var, CERA shows the worst performance on the ocean analysis among all three variational DA systems. Note that both SC 3D-Var and 4D-Var take advantage of nonzero cross- domain background error correlation blocks so that assimilating atmosphere observations in these systems reduces the ocean state error through the information transfer by the couple-state B. This is not the case for CERA. In CERA, only the coupled-state trajectory is updated for each outer loop. The atmosphere and ocean still use their own domain background error covariance to minimize their separate cost functions. Results in Figure 4.7(b) indicate that only the outer loop coupling is not enough to compensate for the analysis degradation due to the missing coupled-state error correlation blocks. One following question is whether we can stabilize the variational analysis by tuning the background error covariance. Figure 4.8 shows the coupled-state analysis RMSE by inflating theB4???? with different factors (i.e., ?2, ?5, and?10). Interestingly, atmosphere 128 -3 ?10 3.5 3D-Var 4D-Var ETKFm40 CERA 3 2.5 2 1.5 1 0.5 0 1.8 1.84 1.88 1.92 1.96 2 Time step 5?10 -4 ?10 4 3 2 1 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Time step 5?10 Figure 4.7: The analysis RMSE when observing only the atmosphere for the atmosphere (top) and ocean (bottom)with strongly coupled 3D-Var (green), 4D-Var (blue), 4D-Var/3D- Var CERA (cyan), and 40-member ETKF (red). A 21-step running average filter is applied to the bottom panel. 129 Analysis RMS error Analysis RMS error Figure 4.8: The analysis RMSE when observing only the atmosphere for the atmosphere (top) and ocean anslysis (bottom) with strongly coupled 4D-Var, but applying different multiplicative inflation parameters to its background error covariance B: 2 (cyan), 5 (blue) and 10 (dark blue). The RMSE from the 40-member ETKF (red) is also shown as a reference. 130 and ocean analysis showopposite sensitivity to the inflation factor: using a smaller inflation factor further reduces the analysis RMSE for the atmosphere while increasing the RMSE for the ocean. Inspired by results shown in Figure 4.8, instead of performing a uniform inflation for the coupled state (i.e., B???????,?? ? ????? = ?B???????), we adopted a block-wise inflation method to inflate the atmospheric and oceanic components separately. To maintain the structure of the original error correlations for the coupled system, we only inflate the error variance of the atmosphere and ocean model states, corresponding to the diagonal elements in the B??????? . This procedure can be explicitly written as B???????,?? ? ????? = F1/2 D1/2 C1/2 D1/2 F1/2 (4.25)??????? ??????? ??????? ??????? ??????? whereD??????? andC??????? are the diagonal variancematrix and the symmetric correlation matrix derived from B??????? , and F??????? is a diagonal matrix containing the inflation factor (????I? ?1, ????I? ?1). Figure 4.9 shows the 4D-Var results with the block??? ??? inflation. By applying a domain-dependent inflation (2 for the atmosphere and 10 for the ocean), the 4D-Var analysis can simultaneously achieve smallest analysis RMSE for both the atmosphere and ocean, though 4D-Var still fails to stabilize the analysis in the current experiment period. 4.5 Summary The interest in CDAhas grown since CDAcan provide amore coherent coupled-state analysis, which could be used to initialize the sub-seasonal or longer-range predictions. 131 Figure 4.9: The analysis RMSE when observing only the atmosphere for the atmosphere (top) and ocean anslysis (bottom) by using strongly coupled 4D-Var with block- independent inflation parameters to its background error covariance B (2 for B??? and 10 for B???, green). The RMSE from the 4D-Var with 2?B (cyan), 5?B (blue), and 40-member ETKF (red) are also shown as references. 132 Different CDA strategies have been explored for ensemble assimilation methods over a hierarchy of coupled models ranging from the simple coupled Lorenz model to the most state-of-the-art atmosphere-ocean models. Meanwhile, CDA strategies have not been investigated thoroughly for the variational methods due to the scientific and technical challenges related to variational methods. Given these difficulties, few studies have systematically compared the ensemble and variational methods in the context of CDA. It remains unclear how their performance varies with different CDA strategies. The most advanced coupled variational assimilation system CERA developed by ECMWF adopted an alternative approach that does not explicitly include the cross-domain background error correlation in its formulation: within each outer loop, CERA utilizes a coupled atmosphere-ocean model to generate the reference trajectory, while it then produces separate 4D-Var analysis for the atmosphere and 3D-FGAT analysis for the ocean by utilizing the forcing derived from the latest coupled-state reference trajectory in the same outer loop. It remains an open question about howmuch benefits this outer-loop- coupling approach can bring when compared with the SC 3D-Var and 4D-Var systems that explicitly include the coupled-state background error covariance in their formulations. In this study, aCDA testbed is developed for the coupled quasi-geostrophic atmosphere- ocean model MAOOAM. With this testbed, we then systematically compared different CDA strategies for both variational and ensemble assimilation methods. The major findings are listed as follows: 1. For 3D-Var, the RMSE of the WCDA and SCDA analysis is smaller than that of UCDA analysis. Increasing the forcing update frequency helps the UCDA system 133 reduce the analysis RMSE though its magnitude is still greater than the RMSE from the WC and SC 3D-Var. An additional benefit of applying CDA with 3D-Var is that the combined atmosphere-ocean system experiences a more rapid spin-up than the uncoupled scenarios. A slight improvement is seen for both atmosphere and ocean by switching from the WC 3D-Var to SC 3D-Var. 2. For ETKF, both SC andWC ETKF lead to smaller RMSE than the UC ETKF for the atmosphere analysis, while the ocean analysis is consistently improved by switching from theWC to SC ETKF, a feature not observed for the 3D-Var systems. Compared to SC 3D-Var with the static background error covariance, both WC and SC ETKF experience a shorter spin-up time for the ocean analysis. For the UCDA scenario, ETKF shows greater sensitivity than 3D-Var on the flux error. A more frequent flux update is necessary for ETKF to synchronize its coupled states towards the truth. Otherwise, it quickly suffers filter divergence. 3. We also compared different SCDA methods under different observation networks. When both atmosphere and ocean observations are assimilated, the SC incremental 4D-Var and ETKF share a similar analysis RMSE that is smaller than SC 3D-Var, for both the atmosphere and ocean. The ECMWF CERA, which adopts the outer- loop-coupling approach, achieves a similar RMSE as the SC 4D-Var and ETKF. 4. When only atmospheric observations are assimilated, all variational-based DA methods with static background error covariance failed to stabilize the RMSE for the ocean within the experiment periods ( 27.4 years), while the flow-dependent ETKF does stabilize the analysis after 10 years. Among all the variational systems, 134 CERA shows larger RMSE than SC 3D-Var and 4D-Var that explicitly use the coupled-state background error covariance in their formulation. This result implies the outer-loop-coupling alone is not enough to mitigate the information loss due to the lack of cross-domain background error covariance. While we have thoroughly examined different CDA strategies for variational and ensemble methods, it is worthwhile to keep in mind the limitations of this study. First, the coupled model adopted here is a coupled quasi-geostrophic model, of which internal dynamics resemble more to that at midlatitude. By nature, this quasi-geostrophic model does not include tropical convection, so all the results shown in this study might not be representative of the tropics. Second, the ocean in the midlatitude is primarily driven by the atmosphere, while the opposite is true based on Kalnay et al. [1986]. This difference might partially contribute to the results that SC ETKF can still constrain ocean analysis by only assimilating the atmosphere observations. Whether this conclusion can be extended to the tropics needs further investigation with a model that includes Tropic dynamics, such as convection. A CDA testbed based on the SPEEDY-NEMO model utilized by Sluka et al. [2016] will be an ideal candidate to revisit those problems. Besides the limitation due to the coupled model itself, there are two more points worth discussing from the perspective of data assimilation. First, observations assimilated in this study are the spectral modes covering the whole globe, which is too optimistic since the distribution of actual observations is far more irregular, and constraints on different scale motions vary with the distribution of observations. It would be desirable to re- examine all the CDAapproaches if we adopt a sparse observation network in physical space 135 instead of spectral space. Besides, our study only includes direct observations from the atmosphere or the ocean. It is crucial to remember that when integrated observations exist, the cost function for WC variational system is no longer separable. Further investigation is necessary to determine whether the WC variational systems with separate cost functions can synchronize towards the truth when integrated observations (e.g., radiance) exist, especially in the presence of model errors. 136 Chapter 5: Summary and Future Directions 5.1 Summary of the dissertation In this section, we give a summary of this dissertation. The first two chapters focus on evaluating the impact of current satellite-derived precipitation products from the NASA GPM on typhoon forecasts and developing a new localization method in the EnKF that can better assimilate radiance observations such as those from GPMMicrowave Imagers. Those two tasks correspond to two important research topics for the EnKF: assimilation of observations with non-Gaussian errors and the nonlocal observations. Different methods are developed in the dissertation to handle these two types of observations in the EnKF. After that, we switch our focus to comparing different CDA strategies for a coupled atmosphere-ocean model, in preparation for assimilating GPM precipitation into other earth components other than the atmosphere. The more detailed description of each chapter is as follows. In Chapter 2, we focus on the non-Gaussianity problem of precipitation assimilation. We extend the precipitation assimilation framework with Gaussian transformation [Lien et al., 2013, 2016a,b] to regional models and focus on typhoon forecast. We investigated the impact of assimilating satellite-derived GSMaP surface precipitation into the regional model SCALE-RM on typhoon forecast by conducting four typhoon case studies over 137 the Western Pacific in 2015. Results show that assimilation of surface precipitation improves the model representation of MSLP, surface winds, moisture, hydrometeors, and precipitation for typhoons. The overall typhoon forecast statistics show that precipitation assimilation reduces the track forecast error and the intensity error measured by MSLP up to 72 hours. Data denial experiments indicate that the precipitation observations over typhoon regions are mainly responsible for the improved typhoon forecast by improving the model analysis near these regions, confirmed by the verification against independent GPS-RO wet profiles. The Gaussian transformation process requires the construction of empirical precipitationCDFs. This study proposes an observation-basedCDFconstruction approach to remove the need to archive the past modeled precipitation required in [Lien et al., 2016b]. Results show that precipitation assimilation using the observation-based CDFs for Gaussian transformation still generates improved track and intensity forecasts compared with the experiments without precipitation assimilation. Chapter 3 tackles the localization problem for nonlocal observations in the EnKF. Assimilating nonlocal observations within the Ensemble Kalman Filter is a nontrivial problem. Nonlocal observations (e.g., total precipitable water, radiances, and altimeter observations) are intrinsically integrated measurements in space and thus have no single well-defined observation location. Nevertheless, an SLVL approach widely used to handle these nonlocal observations vertically localizes them at their WF peaks with symmetric Gaussian-shape localization functions. While the SLVL approach can properly localize nonlocal observations with narrow WFs, it has difficulties handling observations with broad asymmetric WFs or with multiple WF peaks, which are common for clear-sky radiance from sounding or trace-gas sensitive channels onboard hyperspectral infrared 138 sensors (e.g., CrIS), or for all-sky radiances when clouds are present. In this chapter, we developed an MLVL method for nonlocal observations in the LETKF. Unlike the SLVL approach that focuses only on the largest WF peak, the MLVL explicitly considers the wholeWF shapes in its formulation and generates the localization value at the target model grid based on the integrated influences from all components that constitute the nonlocal observations. UsingMLVL as a generic framework, the traditional SLVL can be viewed as a special case when localizing point observations. Comparisons of localization functions between SLVL and MLVL show that both methods generate Gaussian-shaped localization functions for AMSU-A observations of which WF is more Gaussian-shaped, while only the MLVL can generate non-Gaussian-shaped localization functions for selected CrIS channels of which WF shows multiple peaks. Observing system simulation experiments with a multi-layer quasi-geostrophic model are conducted to compare the performance of these twomethods when assimilating nonlocal observations with narrowWFs, broadWFs, or multipleWF peaks. The results show that theMLVL has comparable performance to the SLVL when assimilating narrow-WF observations, and superior performance to the SLVL when assimilating observations with broad WFs or with multiple WF peaks. Besides the LETKF, we also discuss how to implement the MLVL in other serial-observation- processing EnKFs such as EnSRF and EAKF. In Chapter 4, we switch our focus to coupled data assimilation in the preparation of assimilating precipitation into other earth components other than the atmosphere. It is noticed that there are few studies that systematically compare different CDA strategies (i.e., UCDA, WCDA, SCDA) of both ensemble and variational methods. In this chapter a CDA testbed is developed for a coupled geostrophic atmosphere-ocean model. This CDA 139 testbed includes both ensemble and variational methods under different CDA strategies that allow us to understand their difference systematically. Results show that for 3D-Var, WCDAand SCDA show comparable performance for atmosphere and ocean analysis better than UCDA. Besides, CDA significantly reduce the spin-up time of the coupled system when compared with UCDA. For ETKF, both SC and WC ETKF lead to smaller RMSE than the UC ETKF for the atmosphere analysis, while the ocean analysis is consistently improved by switching from the WC to SC ETKF, a feature not observed for the 3D-Var systems. Compared to SC 3D-Var with the static background error covariance, both WC and SC ETKF experience a shorter spin-up time for the ocean analysis. For the UCDA scenario, ETKF shows greater sensitivity than 3D-Var on the flux error. A more frequent flux update is necessary for ETKF to synchronize its coupled states towards the truth. Otherwise, it quickly suffers filter divergence. We also compared different SCDAmethods under different observation networks. When both atmosphere and ocean observations are assimilated, the SC incremental 4D-Var and ETKF share a similar analysis RMSE smaller than SC 3D-Var, for both the atmosphere and ocean. The ECMWF CERA, which adopts the outer-loop-coupling approach, achieves a similar RMSE as the SC 4D-Var and ETKF. When only atmospheric observations are assimilated, all variational-based DA methods with static background error covariance failed to stabilize the RMSE for the ocean within the experiment periods (about 27.4 years), while the flow-dependent ETKF does stabilize the analysis after about 10 years. Among all the variational systems, CERA shows larger RMSE than SC 3D-Var and 4D-Var that explicitly use the coupled-state background error covariance in their formulation. This result implies the outer-loop-coupling alone is not enough to mitigate the information loss due to the lack of cross-domain background error 140 covariance. 5.2 Future directions Here we list several future directions that we would like to explore later. Chapter 3 showed the MLVL is better than the SLVL when assimilating observations with broadWF or with multiple WF peaks in idealized experiments with a quasi-geostrophic model. It would be interesting to implement the MLVL approach into the GFS-LETKF and conduct real-observation experiments to investigate the impact of MLVL on radiance assimilation for infrared hyperspectral sounder observations. Besides the follow-on research onMLVL,we list here another two exciting directions related to precipitation assimilation. In Chapter 2, the Gaussian transformation process was proposed to make the error of transformed precipitation more Gaussian, thus being effectively assimilated by the EnKF that requires Gaussian likelihood. Many studies have reported that error related to precipitation, clouds, and aerosols are more closely related to Gamma or inverse Gamma distributions. Given this, an alternative approach is to develop a filter that can directly handle Gamma or inverse Gamma distributions. For example, the GIGG filter developed by Bishop [2016] seeks the minimum variance posterior under the Gamma (inverse Gamma) prior and inverse Gamma (Gamma) likelihood. We recently implemented this filter to a quasi-geostrophic model and an intermediate atmosphere general circulation model SPEEDY, and assessed the effectiveness of GIGG filter on precipitation assimilation. Our preliminary results show that precipitation assimilation with GIGG filter improves the model analysis. 141 Finally, most data assimilation studies focus on assimilating precipitation observations to improve the atmosphere weather analysis and forecast. For other earth component models such as land surface or ocean model, precipitation is often used as forcing data. But with SC ensemble assimilation system for coupled earth system models, it becomes possible to assimilate precipitation directly into other earth components through SCDA. An interesting topic is to explore if assimilating precipitation into other earth components through SCDA improve their analysis and forecast considering their interaction with precipitation (e.g., soil moisture for land, and sea surface temperature and salinity for ocean). 142 Appendix A: Statement of Originality Chapter 4 is based on a paper (Penny et al., 2019) in which I am the co-author. My contributions are developing the CDA testbed MAOOAM-CDAS that includes both variational (3D-Var, 4D-Var, CERA) and ensemble methods with different CDA strategies, comparing UC/WC/SC 3D-Var and ETKF, and comparing SCDAmethods under different observation networks. Penny, S. G., E. Bach, K. Bhargava,C.-C. Chang, C. Da., L. Sun, & T. Yoshida (2019). Strongly coupled data assimilation in multiscale media: Experiments using a quasi- geostrophic coupled model. Journal of Advances in Modeling Earth Systems, 11, 1803? 1829. https://doi.org/10.1029/2019MS001652. Chapter 3 is based on a paper (Da et al., 2022). Da, C., E. Kalnay, and T.-C. Chen (2022). Multi-layer observation localization for nonlocal observations in the Local Ensemble Transform Kalman Filter. (Submitted to Monthly Weather Review, in revision) 143 Bibliography Eugenia Kalnay. Atmospheric modeling, data assimilation and predictability. Cambridge university press, 2003. Guo-Yuan Lien, Eugenia Kalnay, and Takemasa Miyoshi. Effective assimilation of global precipitation: simulation experiments. Tellus A: Dynamic Meteorology and Oceanography, 65(1):19915, December 2013. ISSN null. doi: 10.3402/tellusa.v65i0. 19915. URL https://doi.org/10.3402/tellusa.v65i0.19915. Publisher: Taylor & Francis. Guo-Yuan Lien, Eugenia Kalnay, Takemasa Miyoshi, and George J. Huffman. Statistical Properties of Global Precipitation in the NCEPGFSModel and TMPAObservations for Data Assimilation. Monthly Weather Review, 144(2):663?679, February 2016a. ISSN 0027-0644. doi: 10.1175/MWR-D-15-0150.1. URL https://doi.org/10.1175/ MWR-D-15-0150.1. Guo-Yuan Lien, Takemasa Miyoshi, and Eugenia Kalnay. Assimilation of TRMM Multisatellite Precipitation Analysis with a Low-Resolution NCEP Global Forecast System. Monthly Weather Review, 144(2):643?661, February 2016b. ISSN 0027- 0644. doi: 10.1175/MWR-D-15-0149.1. URL https://doi.org/10.1175/ MWR-D-15-0149.1. Geir Evensen. Data assimilation: the ensemble Kalman filter. Springer Science & Business Media, 2009. Gerrit Burgers, Peter Jan van Leeuwen, and Geir Evensen. Analysis Scheme in the Ensemble Kalman Filter. Monthly Weather Review, 126(6):1719? 1724, June 1998. doi: 10.1175/1520-0493(1998)126<1719:ASITEK>2.0. CO;2. URL https://journals.ametsoc.org/view/journals/mwre/126/6/ 1520-0493_1998_126_1719_asitek_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Craig H. Bishop, Brian J. Etherton, and Sharanya J. Majumdar. Adaptive Sampling with the Ensemble Transform Kalman Filter. Part I: Theoretical Aspects. Monthly Weather Review, 129(3):420?436, March 2001. ISSN 0027-0644. 144 doi: 10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2. URL https: //journals.ametsoc.org/doi/abs/10.1175/1520-0493(2001)129%3C0420: ASWTET%3E2.0.CO;2. Brian R. Hunt, Eric J. Kostelich, and Istvan Szunyogh. Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. Data Assimilation, 230(1):112?126, June 2007. ISSN 0167-2789. doi: 10.1016/j.physd. 2006.11.008. URL https://www.sciencedirect.com/science/article/pii/ S0167278906004647. Jeffrey S. Whitaker and Thomas M. Hamill. Ensemble Data Assimilation without Perturbed Observations. Monthly Weather Review, 130(7):1913?1924, July 2002. ISSN 0027-0644. doi: 10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2. URL https://journals.ametsoc.org/doi/abs/10.1175/1520-0493(2002) 130%3C1913:EDAWPO%3E2.0.CO%3B2. Jeffrey L. Anderson. An ensemble adjustment Kalman filter for data assimilation. Monthly weather review, 129(12):2884?2903, 2001. URL http://journals.ametsoc.org/ doi/abs/10.1175/1520-0493(2001)129%3C2884:AEAKFF%3E2.0.CO%3B2. Jeffrey L. Anderson. A Local Least Squares Framework for Ensemble Filtering. Monthly Weather Review, 131(4):634?642, April 2003. doi: 10.1175/1520-0493(2003) 131<0634:ALLSFF>2.0.CO;2. URL https://journals.ametsoc.org/view/ journals/mwre/131/4/1520-0493_2003_131_0634_allsff_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. John C. Derber and Wan-Shu Wu. The Use of TOVS Cloud-Cleared Radiances in the NCEP SSI Analysis System. Monthly Weather Review, 126(8):2287? 2299, August 1998. doi: 10.1175/1520-0493(1998)126<2287:TUOTCC>2.0. CO;2. URL https://journals.ametsoc.org/view/journals/mwre/126/8/ 1520-0493_1998_126_2287_tuotcc_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. P. Courtier, J.-N. Th?paut, and A. Hollingsworth. A strategy for operational implementation of 4D-Var, using an incremental approach. Quarterly Journal of the Royal Meteorological Society, 120(519):1367?1387, July 1994. ISSN 0035-9009. doi: 10.1002/qj.49712051912. URL https://doi.org/10.1002/qj.49712051912. Publisher: John Wiley & Sons, Ltd. Eugenia Kalnay, Hong Li, Takemasa Miyoshi, Shu-Chih Yang, and Joaquim Ballabrera- Poy. Response to the discussion on ?4-D-Var or EnKF?? by Nils Gustafsson. Tellus A: Dynamic Meteorology and Oceanography, 59(5):778?780, January 2007a. ISSN null. doi: 10.1111/j.1600-0870.2007.00263.x. URL https://doi.org/10.1111/ j.1600-0870.2007.00263.x. Publisher: Taylor & Francis. Shu-Chih Yang, Eugenia Kalnay, and Brian Hunt. Handling Nonlinearity in an Ensemble Kalman Filter: Experiments with the Three-Variable Lorenz Model. 145 Monthly Weather Review, 140(8):2628?2646, August 2012. doi: 10.1175/ MWR-D-11-00313.1. URL https://journals.ametsoc.org/view/journals/ mwre/140/8/mwr-d-11-00313.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Pavel Sakov, Dean S. Oliver, and Laurent Bertino. An Iterative EnKF for Strongly Nonlinear Systems. Monthly Weather Review, 140(6):1988?2004, June 2012. doi: 10.1175/MWR-D-11-00176.1. URL https://journals.ametsoc.org/view/ journals/mwre/140/6/mwr-d-11-00176.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Stephen E. Cohn. An Introduction to Estimation Theory (gtSpecial IssueltData Assimilation in Meteology and Oceanography: Theory and Practice). Journal of the Meteorological Society of Japan. Ser. II, 75(1B):257?288, 1997. doi: 10.2151/ jmsj1965.75.1B_257. S. J. Fletcher and M. Zupanski. A data assimilation method for log-normally distributed observational errors. Quarterly Journal of the Royal Meteorological Society, 132 (621):2505?2519, October 2006. ISSN 0035-9009. doi: 10.1256/qj.05.222. URL https://doi.org/10.1256/qj.05.222. Publisher: John Wiley & Sons, Ltd. S.J. Fletcher. Mixed Gaussian-lognormal four-dimensional data assimilation. Tellus A: Dynamic Meteorology and Oceanography, 62(3):266?287, January 2010. ISSN null. doi: 10.1111/j.1600-0870.2009.00439.x. URL https://www.tandfonline.com/ doi/abs/10.1111/j.1600-0870.2009.00439.x. Publisher: Taylor & Francis. Jeffrey L. Anderson. A Non-Gaussian Ensemble Filter Update for Data Assimilation. Monthly Weather Review, 138(11):4186?4198, November 2010. doi: 10.1175/ 2010MWR3253.1. URL https://journals.ametsoc.org/view/journals/ mwre/138/11/2010mwr3253.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Jeffrey L. Anderson. A Nonlinear Rank Regression Method for Ensemble Kalman Filter Data Assimilation. Monthly Weather Review, 147(8):2847?2860, August 2019. doi: 10.1175/MWR-D-18-0448.1. URL https://journals.ametsoc.org/ view/journals/mwre/147/8/mwr-d-18-0448.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Jeffrey L. Anderson. A Marginal Adjustment Rank Histogram Filter for Non-Gaussian Ensemble Data Assimilation. Monthly Weather Review, 148(8):3361?3378, August 2020. doi: 10.1175/MWR-D-19-0307.1. URL https://journals.ametsoc. org/view/journals/mwre/148/8/mwrD190307.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Craig H. Bishop. The GIGG-EnKF: ensemble Kalman filtering for highly skewed non- negative uncertainty distributions. Quarterly Journal of the Royal Meteorological Society, 142(696):1395?1412, April 2016. ISSN 0035-9009. doi: 10.1002/qj.2742. URL https://doi.org/10.1002/qj.2742. Publisher: John Wiley & Sons, Ltd. 146 D. J. Posselt, T. S. L?Ecuyer, and G. L. Stephens. Exploring the error characteristics of thin ice cloud property retrievals using a Markov chain Monte Carlo algorithm. Journal of Geophysical Research: Atmospheres, 113(D24), December 2008. ISSN 0148-0227. doi: 10.1029/2008JD010832. URL https://doi.org/10.1029/2008JD010832. Publisher: John Wiley & Sons, Ltd. Derek J. Posselt and Tomislava Vukicevic. Robust Characterization of Model Physics Uncertainty for Simulations of Deep Moist Convection. Monthly Weather Review, 138(5):1513?1535, May 2010. doi: 10.1175/2009MWR3094. 1. URL https://journals.ametsoc.org/view/journals/mwre/138/5/ 2009mwr3094.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Jonathan Poterjoy. A Localized Particle Filter for High-Dimensional Nonlinear Systems. Monthly Weather Review, 144(1):59?76, January 2016. doi: 10.1175/ MWR-D-15-0163.1. URL https://journals.ametsoc.org/view/journals/ mwre/144/1/mwr-d-15-0163.1.xml. Place: BostonMA,USAPublisher: American Meteorological Society. Jonathan Poterjoy and Jeffrey L. Anderson. Efficient Assimilation of Simulated Observations in a High-Dimensional Geophysical System Using a Localized Particle Filter. Monthly Weather Review, 144(5):2007?2020, May 2016. doi: 10.1175/ MWR-D-15-0322.1. URL https://journals.ametsoc.org/view/journals/ mwre/144/5/mwr-d-15-0322.1.xml. Place: BostonMA,USAPublisher: American Meteorological Society. Shunji Kotsuki, Takemasa Miyoshi, Koji Terasaki, Guo-Yuan Lien, and Eugenia Kalnay. Assimilating the global satellite mapping of precipitation data with the Nonhydrostatic Icosahedral Atmospheric Model (NICAM). Journal of Geophysical Research: Atmospheres, 122(2):631?650, January 2017. ISSN 2169-897X. doi: 10. 1002/2016JD025355. URL https://doi.org/10.1002/2016JD025355. Publisher: John Wiley & Sons, Ltd. Peter Bauer, Thomas Aulign?, William Bell, Alan Geer, Vincent Guidard, Sylvain Heilliette, Masahiro Kazumori, Min-Jeong Kim, Emily H.-C. Liu, Anthony P. McNally, Bruce Macpherson, Kozo Okamoto, Richard Renshaw, and Lars-Peter Riish?jgaard. Satellite cloud and precipitation assimilation at operational NWP centres. Quarterly Journal of the Royal Meteorological Society, 137(661):1934?1951, October 2011. ISSN 0035-9009. doi: 10.1002/qj.905. URL https://doi.org/10.1002/qj.905. Publisher: John Wiley & Sons, Ltd. Eugenia Kalnay. Recent Advances in EnKF: Running in Place, Assimilation of Rain, Ensemble Forecast Sensitivity to Observations, Coupled Data Assimilation, 2013. Kerry Emanuel and Fuqing Zhang. The Role of Inner-Core Moisture in Tropical Cyclone Predictability and Practical Forecast Skill. Journal of the Atmospheric Sciences, 74(7):2315?2324, July 2017. doi: 10.1175/ 147 JAS-D-17-0008.1. URL https://journals.ametsoc.org/view/journals/ atsc/74/7/jas-d-17-0008.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. William F. Campbell, Craig H. Bishop, and Daniel Hodyss. Vertical Covariance Localization for Satellite Radiances in Ensemble Kalman Filters. Monthly Weather Review, 138(1):282?290, January 2010. doi: 10.1175/ 2009MWR3017.1. URL https://journals.ametsoc.org/view/journals/ mwre/138/1/2009mwr3017.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. P. L. Houtekamer and Herschel L. Mitchell. Ensemble Kalman filtering. Quarterly Journal of the RoyalMeteorological Society, 131(613):3269?3289, October 2005. ISSN 0035-9009. doi: 10.1256/qj.05.135. URL https://doi.org/10.1256/qj.05.135. Publisher: John Wiley & Sons, Ltd. Koji Terasaki, Masahiro Sawada, and Takemasa Miyoshi. Local Ensemble Transform Kalman Filter Experiments with the Nonhydrostatic Icosahedral Atmospheric Model NICAM. SOLA, 11:23?26, 2015. doi: 10.2151/sola.2015-006. Koji Terasaki andTakemasaMiyoshi. AssimilatingAMSU-ARadianceswith theNICAM- LETKF. Journal of the Meteorological Society of Japan. Ser. II, advpub, 2017. doi: 10.2151/jmsj.2017-028. Koji Terasaki, Shunji Kotsuki, and Takemasa Miyoshi. Multi-year analysis using the NICAM-LETKF data assimilation system. SOLA, advpub, 2019. doi: 10.2151/sola. 2019-009. Hui Christophersen. Use of multiple satellite total ozone observations within and around tropical cyclones. PhD thesis, Florida StateUniversity, 2015. URLhttp://diginole. lib.fsu.edu/islandora/object/fsu%3A253058. Publisher: The Florida State University. Arthur Y. Hou, Ramesh K. Kakar, Steven Neeck, Ardeshir A. Azarbarzin, Christian D. Kummerow, Masahiro Kojima, Riko Oki, Kenji Nakamura, and Toshio Iguchi. The Global Precipitation Measurement Mission. Bulletin of the American Meteorological Society, 95(5):701?722, May 2014. doi: 10.1175/ BAMS-D-13-00164.1. URL https://journals.ametsoc.org/view/journals/ bams/95/5/bams-d-13-00164.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. George J Huffman, David T Bolvin, Eric J Nelkin, and Jackson Tan. Integrated Multi- satellitE Retrievals for GPM (IMERG) technical documentation. Nasa/Gsfc Code, 612 (47):2019, 2015. Takuji Kubota, Shoichi Shige, Hiroshi Hashizume, Kazumasa Aonashi, Nobuhiro Takahashi, Shinta Seto, Masafumi Hirose, Yukari Takayabu, Tomoo Ushio, Katsuhiro Nakagawa, Koyuru Iwanami, Misako Kachi, and Ken?ichi Okamoto. Global 148 Precipitation Map Using Satellite-Borne Microwave Radiometers by the GSMaP Project: Production and Validation. IEEE Transactions on Geoscience and Remote Sensing, 45(7):2259?2275, July 2007. ISSN 1558-0644. doi: 10.1109/TGRS.2007. 895337. Kazumasa Aonashi, Jun Awaka, Masafumi Hirose, Toshikaki Kozu, Takuji Kubota, Guosheng Liu, Shoichi Shige, Satoshi Kida, Sinta Seto, Nobuhiro Takahashi, and Yukari N. Takayabu. GSMaP Passive Microwave Precipitation Retrieval Algorithm : Algorithm Description and Validation. Journal of the Meteorological Society of Japan. Ser. II, 87A:119?136, 2009. doi: 10.2151/jmsj.87A.119. Abebe S. Gebregiorgis, Pierre-Emmanuel Kirstetter, Yang E. Hong, Jonathan J. Gourley, George J. Huffman, Walter A. Petersen, Xianwu Xue, and Mathew R. Schwaller. To What Extent is the Day 1 GPM IMERG Satellite Precipitation Estimate Improved as Compared to TRMMTMPA-RT? Journal of Geophysical Research: Atmospheres, 123 (3):1694?1707, February 2018. ISSN 2169-897X. doi: 10.1002/2017JD027606. URL https://doi.org/10.1002/2017JD027606. Publisher: John Wiley & Sons, Ltd. Guoqiang Tang, Ziyue Zeng, Meihong Ma, Ronghua Liu, Yixin Wen, and Yang Hong. Can near-real-time satellite precipitation products capture rainstorms and guide flood warning for the 2016 summer in South China? IEEE Geoscience and remote sensing letters, 14(8):1208?1212, 2017. ISSN 1545-598X. Publisher: IEEE. Bin Yong, Li-Liang Ren, Yang Hong, Jia-Hu Wang, Jonathan J. Gourley, Shan-Hu Jiang, Xi Chen, andWenWang. Hydrologic evaluation ofMultisatellite Precipitation Analysis standard precipitation products in basins beyond its inclined latitude band: A case study in Laohahe basin, China.Water Resources Research, 46(7), July 2010. ISSN0043-1397. doi: 10.1029/2009WR008965. URL https://doi.org/10.1029/2009WR008965. Publisher: John Wiley & Sons, Ltd. Kazumasa Aonashi. An initialization method to incorporate precipitation data into a mesoscale numerical weather prediction model. Journal of the Meteorological Society of Japan. Ser. II, 71(3):393?406, 1993. ISSN 0026-1165. Publisher: Meteorological Society of Japan. Tadashi Tsuyuki. Variational Data Assimilation in the Tropics Using Precipitation Data. Part II: 3D Model. Monthly Weather Review, 124(11):2545?2561, November 1996. doi: 10.1175/1520-0493(1996)124<2545:VDAITT>2.0.CO; 2. URL https://journals.ametsoc.org/view/journals/mwre/124/11/ 1520-0493_1996_124_2545_vdaitt_2_0_co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Tadashi Tsuyuki. Variational Data Assimilation in the Tropics Using Precipitation Data. Part III: Assimilation of SSM/I Precipitation Rates. Monthly Weather Review, 125 (7):1447?1464, July 1997. doi: 10.1175/1520-0493(1997)125<1447:VDAITT>2. 0.CO;2. URL https://journals.ametsoc.org/view/journals/mwre/125/7/ 1520-0493_1997_125_1447_vdaitt_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. 149 Arthur Y. Hou, David V. Ledvina, Arlindo M. da Silva, Sara Q. Zhang, Joanna Joiner, Robert M. Atlas, George J. Huffman, and Christian D. Kummerow. Assimilation of SSM/I-Derived Surface Rainfall and Total Precipitable Water for Improving the GEOS Analysis for Climate Studies. Monthly Weather Review, 128 (3):509?537, March 2000. doi: 10.1175/1520-0493(2000)128<0509:AOSIDS>2. 0.CO;2. URL https://journals.ametsoc.org/view/journals/mwre/128/3/ 1520-0493_2000_128_0509_aosids_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. S. Davolio and A. Buzzi. A Nudging Scheme for the Assimilation of Precipitation Data into a Mesoscale Model. Weather and Forecasting, 19(5): 855?871, October 2004. doi: 10.1175/1520-0434(2004)019<0855:ANSFTA>2. 0.CO;2. URL https://journals.ametsoc.org/view/journals/wefo/19/5/ 1520-0434_2004_019_0855_ansfta_2_0_co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Arthur Y. Hou, Sara Q. Zhang, and Oreste Reale. Variational Continuous Assimilation of TMI and SSM/I Rain Rates: Impact on GEOS-3 Hurricane Analyses and Forecasts. Monthly Weather Review, 132(8):2094?2109, August 2004. doi: 10.1175/1520-0493(2004)132<2094:VCAOTA>2.0.CO; 2. URL https://journals.ametsoc.org/view/journals/mwre/132/8/ 1520-0493_2004_132_2094_vcaota_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Ko Koizumi, Yoshihiro Ishikawa, and Tadashi Tsuyuki. Assimilation of precipitation data to the JMA mesoscale model with a four-dimensional variational method and its impact on precipitation forecasts. Sola, 1:45?48, 2005. ISSN 1349-6476. Publisher: Meteorological Society of Japan. Xin Lin, Sara Q. Zhang, and Arthur Y. Hou. Variational Assimilation of Global Microwave Rainfall Retrievals: Physical and Dynamical Impact on GEOS Analyses. Monthly Weather Review, 135(8):2931?2957, August 2007. doi: 10.1175/MWR3434.1. URL https://journals.ametsoc.org/view/journals/ mwre/135/8/mwr3434.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Philippe Lopez. Direct 4D-Var assimilation of NCEP stage IV radar and gauge precipitation data at ECMWF. Monthly Weather Review, 139(7):2098?2116, 2011. ISSN 0027-0644. Philippe Lopez. Experimental 4D-Var assimilation of SYNOP rain gauge data at ECMWF. Monthly weather review, 141(5):1527?1544, 2013. ISSN 0027-0644. Aleksandr Falkovich, Eugenia Kalnay, Stephen Lord, and Mukut B. Mathur. A New Method of Observed Rainfall Assimilation in Forecast Models. Journal of Applied Meteorology, 39(8):1282?1298, August 2000. doi: 10.1175/1520-0450(2000)039<1282:ANMOOR>2.0.CO;2. URL https: 150 //journals.ametsoc.org/view/journals/apme/39/8/1520-0450_2000_ 039_1282_anmoor_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Xiaolei Zou, I. M. Navon, and J. G. Sela. Variational data assimilation with moist threshold processes using the NMC spectral model. Tellus A: DynamicMeteorology and Oceanography, 45(5):370?387, January 1993. ISSN null. doi: 10.3402/tellusa.v45i5. 14900. URL https://doi.org/10.3402/tellusa.v45i5.14900. Publisher: Taylor & Francis. Du?anka ?upanski and Fedor Mesinger. Four-Dimensional Variational Assimilation of Precipitation Data. Monthly Weather Review, 123(4):1112?1127, April 1995. doi: 10.1175/1520-0493(1995)123<1112:FDVAOP>2.0.CO;2. URL https://journals.ametsoc.org/view/journals/mwre/123/4/1520-0493_ 1995_123_1112_fdvaop_2_0_co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Ronald M Errico, David J Stensrud, and Kevin D Raeder. Estimation of the error distributions of precipitation produced by convective parametrization schemes. Quarterly Journal of the Royal Meteorological Society, 127(578):2495?2512, 2001. ISSN 0035-9009. Publisher: Wiley Online Library. Jean-Fran?ois Mahfouf. Linearization of a Simple Moist Convection Scheme for Large- Scale NWP Models. Monthly Weather Review, 133(6):1655?1670, June 2005. doi: 10.1175/MWR2942.1. URL https://journals.ametsoc.org/view/journals/ mwre/133/6/mwr2942.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Philippe Lopez. Cloud and Precipitation Parameterizations in Modeling and Variational DataAssimilation: AReview. Journal of the Atmospheric Sciences, 64(11):3766?3784, November 2007. doi: 10.1175/2006JAS2030.1. URL https://journals.ametsoc. org/view/journals/atsc/64/11/2006jas2030.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Takemasa Miyoshi and Kohei Aranami. Applying a Four-dimensional Local Ensemble Transform Kalman Filter (4D-LETKF) to the JMA Nonhydrostatic Model (NHM). SOLA, 2:128?131, 2006. doi: 10.2151/sola.2006-033. Ronald M Errico, Luc Fillion, Douglas Nychka, and Zhan-Qian Lu. Some statistical considerations associated with the data assimilation of precipitation observations. Quarterly Journal of the Royal Meteorological Society, 126(562):339?359, 2000. ISSN 0035-9009. Publisher: Wiley Online Library. Marc Bocquet, Carlos A. Pires, and Lin Wu. Beyond Gaussian Statistical Modeling in Geophysical Data Assimilation. Monthly Weather Review, 138(8):2997?3023, August 2010. doi: 10.1175/2010MWR3164.1. URL https://journals.ametsoc. org/view/journals/mwre/138/8/2010mwr3164.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. 151 George J. Huffman, David T. Bolvin, Eric J. Nelkin, David B. Wolff, Robert F. Adler, Guojun Gu, Yang Hong, Kenneth P. Bowman, and Erich F. Stocker. The TRMM Multisatellite Precipitation Analysis (TMPA): Quasi-Global, Multiyear, Combined- Sensor Precipitation Estimates at Fine Scales. Journal of Hydrometeorology, 8 (1):38?55, February 2007. doi: 10.1175/JHM560.1. URL https://journals. ametsoc.org/view/journals/hydr/8/1/jhm560_1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. George J Huffman, Robert F Adler, David T Bolvin, and Eric J Nelkin. The TRMMmulti- satellite precipitation analysis (TMPA). In Satellite rainfall applications for surface hydrology, pages 3?22. Springer, 2010. Hirofumi Tomita andMasaki Satoh. A new dynamical framework of nonhydrostatic global model using the icosahedral grid. Fluid Dynamics Research, 34(6):357, 2004. ISSN 1873-7005. Publisher: IOP Publishing. Masaki Satoh, Hirofumi Tomita, Hisashi Yashiro, Hiroaki Miura, Chihiro Kodama, Tatsuya Seiki, Akira T Noda, Yohei Yamada, Daisuke Goto, and Masahiro Sawada. The non-hydrostatic icosahedral atmospheric model: Description and development. Progress in Earth and Planetary Science, 1(1):1?32, 2014. ISSN 2197-4284. Publisher: SpringerOpen. Masaki Satoh, Taro Matsuno, Hirofumi Tomita, Hiroaki Miura, Tomoe Nasuno, and Shin- ichi Iga. Nonhydrostatic icosahedral atmospheric model (NICAM) for global cloud resolving simulations. Journal of Computational Physics, 227(7):3486?3514, 2008. ISSN 0021-9991. Publisher: Elsevier. S. Nishizawa, H. Yashiro, Y. Sato, Y. Miyamoto, and H. Tomita. Influence of grid aspect ratio on planetary boundary layer turbulence in large-eddy simulations. Geoscientific Model Development, 8(10):3393?3419, 2015. doi: 10.5194/gmd-8-3393-2015. URL https://gmd.copernicus.org/articles/8/3393/2015/. Yousuke Sato, Seiya Nishizawa, Hisashi Yashiro, Yoshiaki Miyamoto, Yoshiyuki Kajikawa, and Hirofumi Tomita. Impacts of cloud microphysics on trade wind cumulus: which cloud microphysics processes contribute to the diversity in a large eddy simulation? Progress in Earth and Planetary Science, 2(1):23, August 2015. ISSN 2197-4284. doi: 10.1186/s40645-015-0053-6. URL https://doi.org/10. 1186/s40645-015-0053-6. Takumi Honda, Takemasa Miyoshi, Guo-Yuan Lien, Seiya Nishizawa, Ryuji Yoshida, Sachiho A. Adachi, Koji Terasaki, Kozo Okamoto, Hirofumi Tomita, and Kotaro Bessho. Assimilating All-Sky Himawari-8 Satellite Infrared Radiances: A Case of Typhoon Soudelor (2015). Monthly Weather Review, 146(1):213?229, January 2018. doi: 10.1175/MWR-D-16-0357.1. URL https://journals.ametsoc.org/ view/journals/mwre/146/1/mwr-d-16-0357.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. 152 HirofumiTomita. NewMicrophysical Schemeswith Five and SixCategories byDiagnostic Generation of Cloud Ice. Journal of the Meteorological Society of Japan. Ser. II, 86A: 121?142, 2008. doi: 10.2151/jmsj.86A.121. Miho Sekiguchi and Teruyuki Nakajima. A k-distribution-based radiation code and its computational optimization for an atmospheric general circulation model. Journal of Quantitative Spectroscopy and Radiative Transfer, 109(17):2779?2793, November 2008. ISSN 0022-4073. doi: 10.1016/j.jqsrt.2008.07.013. URL https://www. sciencedirect.com/science/article/pii/S0022407308001635. Mikio Nakanishi and Hiroshi Niino. An Improved Mellor?Yamada Level-3 Model with Condensation Physics: Its Design and Verification. Boundary-Layer Meteorology, 112 (1):1?31, July 2004. ISSN 1573-1472. doi: 10.1023/B:BOUN.0000020164.04146.98. URL https://doi.org/10.1023/B:BOUN.0000020164.04146.98. John S. Kain and J. Michael Fritsch. A One-Dimensional Entraining/Detraining Plume Model and Its Application in Convective Parameterization. Journal of Atmospheric Sciences, 47(23):2784?2802, December 1990. doi: 10.1175/1520-0469(1990) 047<2784:AODEPM>2.0.CO;2. URL https://journals.ametsoc.org/view/ journals/atsc/47/23/1520-0469_1990_047_2784_aodepm_2_0_co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Guo-Yuan Lien, Takemasa Miyoshi, Seiya Nishizawa, Ryuji Yoshida, Hisashi Yashiro, Sachiho A. Adachi, Tsuyoshi Yamaura, and Hirofumi Tomita. The Near-Real-Time SCALE-LETKF System: ACase of the September 2015Kanto-TohokuHeavy Rainfall. SOLA, 13:1?6, 2017. doi: 10.2151/sola.2017-001. FuqingZhang, Chris Snyder, and Juanzhen Sun. Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. Monthly Weather Review, 132(5):1238?1253, 2004. ISSN 1520-0493. Takemasa Miyoshi and Masaru Kunii. Using AIRS retrievals in the WRF-LETKF system to improve regional numerical weather prediction. Tellus A: Dynamic Meteorology and Oceanography, 64(1):18408, December 2012. ISSN null. doi: 10.3402/tellusa.v64i0. 18408. URL https://doi.org/10.3402/tellusa.v64i0.18408. Publisher: Taylor & Francis. Ting-Chi Wu, Hui Liu, Sharanya J. Majumdar, Christopher S. Velden, and Jeffrey L. Anderson. Influence of Assimilating Satellite-Derived Atmospheric Motion Vector Observations on Numerical Analyses and Forecasts of Tropical Cyclone Track and Intensity. Monthly Weather Review, 142(1):49?71, January 2014. doi: 10.1175/ MWR-D-13-00023.1. URL https://journals.ametsoc.org/view/journals/ mwre/142/1/mwr-d-13-00023.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Ting-Chi Wu, Christopher S. Velden, Sharanya J. Majumdar, Hui Liu, and Jeffrey L. Anderson. Understanding the Influence of Assimilating Subsets of 153 Enhanced Atmospheric Motion Vectors on Numerical Analyses and Forecasts of Tropical Cyclone Track and Intensity with an Ensemble Kalman Filter. Monthly Weather Review, 143(7):2506?2531, July 2015. doi: 10.1175/ MWR-D-14-00220.1. URL https://journals.ametsoc.org/view/journals/ mwre/143/7/mwr-d-14-00220.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Ting-ChiWu,Milija Zupanski, LewisD.Grasso, Paula J. Brown, ChristianD.Kummerow, and John A. Knaff. The GSI capability to assimilate TRMM and GPM hydrometeor retrievals in HWRF. Quarterly Journal of the Royal Meteorological Society, 142 (700):2768?2787, October 2016. ISSN 0035-9009. doi: 10.1002/qj.2867. URL https://doi.org/10.1002/qj.2867. Publisher: John Wiley & Sons, Ltd. Ting-Chi Wu and Milija Zupanski. Assimilating GPM hydrometeor retrievals in HWRF: choice of observation operators. Atmospheric Science Letters, 18(6):238?245, June 2017. ISSN 1530-261X. doi: 10.1002/asl.748. URL https://doi.org/10.1002/ asl.748. Publisher: John Wiley & Sons, Ltd. R. A. Anthes. The COSMIC/FORMOSAT-3 mission: Early results. Bull. Amer. Meteor. Soc., 89:313?334, 2008. doi: 10.1175/BAMS-89-3-313. Ying-Hwa Kuo, Sergey V Sokolovskiy, Richard A Anthes, and Francois Vandenberghe. Assimilation of GPS radio occultation data for numerical weather prediction. Terrestrial Atmospheric and Oceanic Sciences, 11(1):157?186, 2000. Publisher: UNKNOWN. Zaizhong Ma, Ying-Hwa Kuo, Bin Wang, Wan-Shu Wu, and Sergey Sokolovskiy. Comparison of Local and Nonlocal Observation Operators for the Assimilation of GPS RO Data with the NCEP GSI System: An OSSE Study. Monthly Weather Review, 137(10):3575?3587, October 2009. doi: 10.1175/2009MWR2809. 1. URL https://journals.ametsoc.org/view/journals/mwre/137/10/ 2009mwr2809.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Kristian Mogensen, Magdalena Balmaseda, and AnthonyWeaver. The NEMOVAR ocean data assimilation system as implemented in the ECMWF ocean analysis for System 4. ECMWF Technical Memoranda, 2012. URL https://www.ecmwf.int/node/ 11174. Publisher: ECMWF. A. J. Geer, F. Baordo, N. Bormann, P. Chambon, S. J. English, M. Kazumori, H. Lawrence, P. Lean, K. Lonitz, and C. Lupu. The growing impact of satellite observations sensitive to humidity, cloud and precipitation. Quarterly Journal of the Royal Meteorological Society, 143(709):3189?3206, October 2017. ISSN 0035-9009. doi: 10.1002/qj.3172. URL https://doi.org/10.1002/qj.3172. Publisher: John Wiley & Sons, Ltd. Rolf H. Langland and Nancy L. Baker. Estimation of observation impact using the NRL atmospheric variational data assimilation adjoint system. Tellus A: Dynamic Meteorology and Oceanography, 56(3):189?201, January 2004. ISSN null. doi: 154 10.3402/tellusa.v56i3.14413. URL https://doi.org/10.3402/tellusa.v56i3. 14413. Publisher: Taylor & Francis. Ronald Gelaro and Yanqiu Zhu. Examination of observation impacts derived from observing system experiments (OSEs) and adjoint models. Tellus A, 61(2):179? 193, March 2009. ISSN 0280-6495. doi: 10.1111/j.1600-0870.2008.00388.x. URL https://doi.org/10.1111/j.1600-0870.2008.00388.x. Publisher: JohnWiley & Sons, Ltd. Eugenia Kalnay, Yoichiro Ota, Takemasa Miyoshi, and Junjie Liu. A simpler formulation of forecast sensitivity to observations: application to ensemble Kalman filters. Tellus A: Dynamic Meteorology and Oceanography, 64(1):18462, December 2012. ISSN null. doi: 10.3402/tellusa.v64i0.18462. URL https://doi.org/10.3402/tellusa. v64i0.18462. Publisher: Taylor & Francis. MarkBuehner, PingDu, and Jo?l B?dard. ANewApproach for Estimating theObservation Impact in Ensemble?Variational Data Assimilation. Monthly Weather Review, 146(2): 447?465, January 2018. ISSN 0027-0644. doi: 10.1175/MWR-D-17-0252.1. URL https://doi.org/10.1175/MWR-D-17-0252.1. Yoichiro Ota, John C. Derber, Eugenia Kalnay, and Takemasa Miyoshi. Ensemble-based observation impact estimates using theNCEPGFS. Tellus A: DynamicMeteorology and Oceanography, 65(1):20038, December 2013. ISSN null. doi: 10.3402/tellusa.v65i0. 20038. URL https://doi.org/10.3402/tellusa.v65i0.20038. Publisher: Taylor & Francis. Alan J. Geer and Peter Bauer. Observation errors in all-sky data assimilation. Quarterly Journal of the Royal Meteorological Society, 137(661):2024?2037, October 2011. ISSN 0035-9009. doi: 10.1002/qj.830. URL https://doi.org/10.1002/qj.830. Publisher: John Wiley & Sons, Ltd. Fuqing Zhang, Masashi Minamide, and Eugene E. Clothiaux. Potential impacts of assimilating all-sky infrared satellite radiances fromGOES-R on convection-permitting analysis and prediction of tropical cyclones. Geophysical Research Letters, 43(6): 2954?2963, March 2016. ISSN 0094-8276. doi: 10.1002/2016GL068468. URL https://doi.org/10.1002/2016GL068468. Publisher: John Wiley & Sons, Ltd. Alan J. Geer, Katrin Lonitz, Peter Weston, Masahiro Kazumori, Kozo Okamoto, Yanqiu Zhu, Emily Huichun Liu, Andrew Collard, William Bell, Stefano Migliorini, Philippe Chambon, Nadia Fourri?, Min-Jeong Kim, Christina K?pken-Watts, and Christoph Schraff. All-sky satellite data assimilation at operational weather forecasting centres. Quarterly Journal of the Royal Meteorological Society, 144(713):1191?1217, April 2018. ISSN 0035-9009. doi: 10.1002/qj.3202. URL https://doi.org/10.1002/ qj.3202. Publisher: John Wiley & Sons, Ltd. Massimo Bonavita, Alan J. Geer, and Mats Hamrud. All-Sky Microwave Radiances Assimilated with an Ensemble Kalman Filter. Monthly Weather Review, 148(7):2737? 2760, July 2020. doi: 10.1175/MWR-D-19-0413.1. URL https://journals. 155 ametsoc.org/view/journals/mwre/148/7/mwrD190413.xml. Place: Boston MA, USA Publisher: American Meteorological Society. P. L. Houtekamer and Herschel L. Mitchell. Data Assimilation Using an Ensemble Kalman Filter Technique. Monthly Weather Review, 126(3):796? 811, March 1998. doi: 10.1175/1520-0493(1998)126<0796:DAUAEK>2.0. CO;2. URL https://journals.ametsoc.org/view/journals/mwre/126/3/ 1520-0493_1998_126_0796_dauaek_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Thomas M. Hamill, Jeffrey S. Whitaker, and Chris Snyder. Distance- Dependent Filtering of Background Error Covariance Estimates in an Ensemble Kalman Filter. Monthly Weather Review, 129(11):2776?2790, November 2001. doi: 10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO; 2. URL https://journals.ametsoc.org/view/journals/mwre/129/11/ 1520-0493_2001_129_2776_ddfobe_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Xuguang Wang, Dale M. Barker, Chris Snyder, and Thomas M. Hamill. A Hybrid ETKF?3DVAR Data Assimilation Scheme for the WRF Model. Part I: Observing System Simulation Experiment. Monthly Weather Review, 136(12):5116?5131, December 2008. doi: 10.1175/2008MWR2444. 1. URL https://journals.ametsoc.org/view/journals/mwre/136/12/ 2008mwr2444.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Craig H. Bishop, Jeffrey S. Whitaker, and Lili Lei. Gain Form of the Ensemble Transform Kalman Filter and Its Relevance to Satellite Data Assimilation with Model Space Ensemble Covariance Localization. Monthly Weather Review, 145(11):4575?4592, November 2017. doi: 10.1175/MWR-D-17-0102.1. URL https://journals. ametsoc.org/view/journals/mwre/145/11/mwr-d-17-0102.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Lili Lei, Jeffrey S. Whitaker, and Craig Bishop. Improving Assimilation of Radiance Observations by Implementing Model Space Localization in an Ensemble Kalman Filter. Journal of Advances in Modeling Earth Systems, 10(12):3221?3232, December 2018. ISSN 1942-2466. doi: 10.1029/2018MS001468. URL https://doi.org/10. 1029/2018MS001468. Publisher: John Wiley & Sons, Ltd. Anna Shlyaeva, Jeffrey S. Whitaker, and Chris Snyder. Model-Space Localization in Serial Ensemble Filters. Journal of Advances in Modeling Earth Systems, 11(6):1627? 1636, June 2019. ISSN 1942-2466. doi: 10.1029/2018MS001514. URL https: //doi.org/10.1029/2018MS001514. Publisher: John Wiley & Sons, Ltd. Lili Lei and Jeffrey S. Whitaker. Model Space Localization Is Not Always Better Than Observation Space Localization for Assimilation of Satellite Radiances. Monthly Weather Review, 143(10):3948?3955, October 2015. doi: 10.1175/ 156 MWR-D-14-00413.1. URL https://journals.ametsoc.org/view/journals/ mwre/143/10/mwr-d-14-00413.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Roland Potthast, Anne Walter, and Andreas Rhodin. A Localized Adaptive Particle Filter within an Operational NWP Framework. Monthly Weather Review, 147(1):345? 362, January 2019. doi: 10.1175/MWR-D-18-0028.1. URL https://journals. ametsoc.org/view/journals/mwre/147/1/mwr-d-18-0028.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Jie Feng, Xuguang Wang, and Jonathan Poterjoy. A Comparison of Two Local Moment- Matching Nonlinear Filters: Local Particle Filter (LPF) and Local Nonlinear Ensemble Transform Filter (LNETF). Monthly Weather Review, 148(11):4377?4395, November 2020. doi: 10.1175/MWR-D-19-0368.1. URL https://journals.ametsoc.org/ view/journals/mwre/148/11/MWR-D-19-0368.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Takemasa Miyoshi and Yoshiaki Sato. Assimilating Satellite Radiances with a Local EnsembleTransformKalmanFilter (LETKF)Applied to the JMAGlobalModel (GSM). SOLA, 3:37?40, 2007. doi: 10.2151/sola.2007-010. ELANA J. Fertig, Brian Hunt, Edward Ott, and Istvan Szunyogh. Assimilating non- local observations with a local ensemble Kalman filter. Tellus A, 59(5):719?730, October 2007. ISSN 0280-6495. doi: 10.1111/j.1600-0870.2007.00260.x. URL https://doi.org/10.1111/j.1600-0870.2007.00260.x. Publisher: JohnWiley & Sons, Ltd. Jeffrey L. Anderson. Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter.DataAssimilation, 230(1):99?111, June 2007. ISSN 0167-2789. doi: 10.1016/j.physd.2006.02.011. URLhttps://www.sciencedirect. com/science/article/pii/S0167278906002168. Jeffrey Anderson and Lili Lei. Empirical Localization of Observation Impact in Ensemble Kalman Filters. Monthly Weather Review, 141(11):4140?4153, November 2013. doi: 10.1175/MWR-D-12-00330.1. URL https://journals.ametsoc.org/ view/journals/mwre/141/11/mwr-d-12-00330.1.xml. Place: BostonMA,USA Publisher: American Meteorological Society. Takuma Yoshida and Eugenia Kalnay. Correlation-Cutoff Method for Covariance Localization in Strongly Coupled Data Assimilation. Monthly Weather Review, 146(9):2881?2889, September 2018. doi: 10.1175/MWR-D-17-0365. 1. URL https://journals.ametsoc.org/view/journals/mwre/146/9/ mwr-d-17-0365.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Takuma Yoshida. Covariance Localization in Strongly Coupled Data Assimilation. PhD thesis, University of Maryland, 2019. URL https://drum.lib.umd.edu/handle/ 1903/25170. 157 Lili Lei, Jeffrey L. Anderson, and Jeffrey S. Whitaker. Localizing the impact of satellite radiance observations using a global group ensemble filter. Journal of Advances in Modeling Earth Systems, 8(2):719?734, June 2016. ISSN 1942-2466. doi: 10.1002/2016MS000627. URL https://doi.org/10.1002/2016MS000627. Publisher: John Wiley & Sons, Ltd. Lili Lei, Jeffrey S. Whitaker, Jeffrey L. Anderson, and Zhemin Tan. Adaptive Localization for Satellite Radiance Observations in an Ensemble Kalman Filter. Journal of Advances in Modeling Earth Systems, 12(8):e2019MS001693, August 2020. ISSN 1942-2466. doi: 10.1029/2019MS001693. URL https://doi.org/10.1029/2019MS001693. Publisher: John Wiley & Sons, Ltd. James A Jung, Andrew Collard, Kristen Bathmann, Dave Groff, Andrew Heidinger, and Mitchell Goldberg. Preparing for CrIS Full Spectral Resolution Radiances in the NCEP Global Forecast System. In Proceedings of the 21st International TOVS Study Conference, Darmstadt, Germany, volume 29, 2018. Takemasa Miyoshi. Ensemble Kalman filter experiments with a primitive-equation global model. PhD Thesis, University of Maryland, College Park, MD, USA, 2005. Mats Hamrud, Massimo Bonavita, and Lars Isaksen. EnKF and hybrid gain ensemble data assimilation. Part I: EnKF implementation. Monthly Weather Review, 143(12): 4847?4864, 2015. ISSN 0027-0644. Yong Han, Fuzhong Weng, Quanhua Liu, and Paul van Delst. A fast radiative transfer model for SSMIS upper atmosphere sounding channels. Journal of Geophysical Research: Atmospheres, 112(D11), June 2007. ISSN 0148-0227. doi: 10.1029/ 2006JD008208. URL https://doi.org/10.1029/2006JD008208. Publisher: John Wiley & Sons, Ltd. John Marshall and Franco Molteni. Toward a Dynamical Understanding of Planetary-Scale Flow Regimes. Journal of Atmospheric Sciences, 50(12): 1792?1818, June 1993. doi: 10.1175/1520-0469(1993)050<1792:TADUOP>2.0. CO;2. URL https://journals.ametsoc.org/view/journals/atsc/50/12/ 1520-0469_1993_050_1792_taduop_2_0_co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Jie Feng, Zoltan Toth, and Malaquias Pe?a. Spatially extended estimates of analysis and short-range forecast error variances. Tellus A: Dynamic Meteorology and Oceanography, 69(1):1325301, January 2017. ISSN null. doi: 10.1080/ 16000870.2017.1325301. URL https://doi.org/10.1080/16000870.2017. 1325301. Publisher: Taylor & Francis. Niels Bormann. Slant path radiative transfer for the assimilation of sounder radiances. Tellus A: Dynamic Meteorology and Oceanography, 69(1):1272779, January 2017. ISSN null. doi: 10.1080/16000870.2016.1272779. URL https://doi.org/10. 1080/16000870.2016.1272779. Publisher: Taylor & Francis. 158 Roger Daley. Atmospheric data analysis. Number 2. Cambridge university press, 1993. S. Zhang, M. J. Harrison, A. Rosati, and A. Wittenberg. System Design and Evaluation of Coupled Ensemble Data Assimilation for Global Oceanic Climate Studies. Monthly Weather Review, 135(10):3541?3564, October 2007. doi: 10.1175/MWR3466.1. URL https://journals.ametsoc.org/view/journals/ mwre/135/10/mwr3466.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Nozomi Sugiura, Toshiyuki Awaji, Shuhei Masuda, Takashi Mochizuki, Takahiro Toyoda, Toru Miyama, Hiromichi Igarashi, and Yoichi Ishikawa. Development of a four- dimensional variational coupled data assimilation system for enhanced analysis and prediction of seasonal to interannual climate variations. Journal of Geophysical Research: Oceans, 113(C10), October 2008. ISSN 0148-0227. doi: 10.1029/ 2008JC004741. URL https://doi.org/10.1029/2008JC004741. Publisher: John Wiley & Sons, Ltd. David P. Mulholland, Patrick Laloyaux, Keith Haines, and Magdalena Alonso Balmaseda. Origin and Impact of Initialization Shocks in Coupled Atmosphere?Ocean Forecasts. Monthly Weather Review, 143(11):4631?4644, November 2015. doi: 10.1175/ MWR-D-15-0076.1. URL https://journals.ametsoc.org/view/journals/ mwre/143/11/mwr-d-15-0076.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. SG Penny, S Akella, O Alves, C Bishop, M Buehner, M Chevallier, F Counillon, C Draper, S Frolov, and Y Fujii. Coupled data assimilation for integrated earth system analysis and prediction: Goals, challenges and recommendations. WWRP 2017-3, 50 pp. 2017. Tamara Singleton.Data assimilation experimentswith a simple coupled ocean-atmosphere model. PhD thesis, University of Maryland, 2011. URL http://hdl.handle.net/ 1903/11912. EUGENIA Kalnay, HONG LI, TAKEMASA MIYOSHI, SHU-CHIH YANG, and JOAQUIMBALLABRERA-POY. 4-D-Var or ensemble Kalman filter? Tellus A, 59(5): 758?773, October 2007b. ISSN 0280-6495. doi: 10.1111/j.1600-0870.2007.00261.x. URL https://doi.org/10.1111/j.1600-0870.2007.00261.x. Publisher: John Wiley & Sons, Ltd. Zhengyu Liu, Shu Wu, Shaoqing Zhang, Yun Liu, and Xinyao Rong. Ensemble data assimilation in a simple coupled climate model: The role of ocean-atmosphere interaction. Advances in Atmospheric Sciences, 30(5):1235?1248, September 2013. ISSN 1861-9533. doi: 10.1007/s00376-013-2268-z. URL https://doi.org/10. 1007/s00376-013-2268-z. EdwardN. Lorenz. Deterministic Nonperiodic Flow. Journal of the Atmospheric Sciences, 20(2):130?141, March 1963. ISSN 0022-4928. doi: 10.1175/1520-0469(1963) 020<0130:DNF>2.0.CO;2. URL http://journals.ametsoc.org/doi/abs/10. 1175/1520-0469(1963)020%3C0130:DNF%3E2.0.CO;2. 159 Fei-Fei Jin. An Equatorial Ocean Recharge Paradigm for ENSO. Part I: Conceptual Model. Journal of the Atmospheric Sciences, 54(7):811? 829, April 1997. doi: 10.1175/1520-0469(1997)054<0811:AEORPF>2.0. CO;2. URL https://journals.ametsoc.org/view/journals/atsc/54/7/ 1520-0469_1997_054_0811_aeorpf_2.0.co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Guijun Han, Xinrong Wu, Shaoqing Zhang, Zhengyu Liu, and Wei Li. Error Covariance Estimation for Coupled Data Assimilation Using a Lorenz Atmosphere and a Simple Pycnocline Ocean Model. Journal of Climate, 26(24):10218?10231, December 2013. doi: 10.1175/JCLI-D-13-00236.1. URL https://journals.ametsoc.org/ view/journals/clim/26/24/jcli-d-13-00236.1.xml. Place: BostonMA,USA Publisher: American Meteorological Society. Travis C. Sluka, Stephen G. Penny, Eugenia Kalnay, and Takemasa Miyoshi. Assimilating atmospheric observations into the ocean using strongly coupled ensemble data assimilation. Geophysical Research Letters, 43(2):752?759, January 2016. ISSN 0094-8276. doi: 10.1002/2015GL067238. URL https://doi.org/10.1002/ 2015GL067238. Publisher: John Wiley & Sons, Ltd. Travis Sluka. Strongly Coupled Ocean-Atmosphere Data Assimilation with the Local Ensemble Transform Kalman Filter. PhD thesis, University of Maryland, 2018. URL http://hdl.handle.net/1903/22167. Malaquias Pe?a and Eugenia Kalnay. Separating fast and slow modes in coupled chaotic systems. Nonlinear Processes in Geophysics, 11, July 2004. doi: 10.5194/ npg-11-319-2004. Feiyu Lu, Zhengyu Liu, Shaoqing Zhang, and Yun Liu. Strongly Coupled Data Assimilation Using Leading Averaged Coupled Covariance (LACC). Part I: Simple Model Study. Monthly Weather Review, 143(9):3823?3837, September 2015a. doi: 10.1175/MWR-D-14-00322.1. URL https://journals.ametsoc.org/view/ journals/mwre/143/9/mwr-d-14-00322.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Feiyu Lu, Zhengyu Liu, Shaoqing Zhang, Yun Liu, and Robert Jacob. Strongly Coupled Data Assimilation Using Leading Averaged Coupled Covariance (LACC). Part II: CGCMExperiments. Monthly Weather Review, 143(11):4645?4659, November 2015b. doi: 10.1175/MWR-D-15-0088.1. URL https://journals.ametsoc.org/ view/journals/mwre/143/11/mwr-d-15-0088.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Robert Tardif, Gregory J. Hakim, and Chris Snyder. Coupled atmosphere?ocean data assimilation experiments with a low-order model and CMIP5 model data. Climate Dynamics, 45:1415?1427, 2014. Robert Tardif, Gregory J. Hakim, and Chris Snyder. Coupled atmosphere?ocean data assimilation experiments with a low-order model and CMIP5 model data. Climate 160 Dynamics, 45(5):1415?1427, September 2015. ISSN 1432-0894. doi: 10.1007/ s00382-014-2390-3. URL https://doi.org/10.1007/s00382-014-2390-3. Suranjana Saha, Shrinivas Moorthi, Hua-Lu Pan, Xingren Wu, Jiande Wang, Sudhir Nadiga, Patrick Tripp, Robert Kistler, John Woollen, David Behringer, Haixia Liu, Diane Stokes, Robert Grumbine, George Gayno, Jun Wang, Yu-Tai Hou, Hui-ya Chuang, Hann-Ming H. Juang, Joe Sela, Mark Iredell, Russ Treadon, Daryl Kleist, Paul Van Delst, Dennis Keyser, John Derber, Michael Ek, Jesse Meng, Helin Wei, Rongqian Yang, Stephen Lord, Huug van den Dool, Arun Kumar, Wanqiu Wang, Craig Long, Muthuvel Chelliah, Yan Xue, Boyin Huang, Jae-Kyung Schemm, Wesley Ebisuzaki, Roger Lin, Pingping Xie, Mingyue Chen, Shuntai Zhou, Wayne Higgins, Cheng-Zhi Zou, Quanhua Liu, Yong Chen, Yong Han, Lidia Cucurull, Richard W. Reynolds, Glenn Rutledge, and Mitch Goldberg. The NCEP Climate Forecast System Reanalysis. Bulletin of the American Meteorological Society, 91(8):1015?1058, August 2010. doi: 10.1175/2010BAMS3001.1. URL https://journals.ametsoc.org/view/ journals/bams/91/8/2010bams3001_1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Suranjana Saha, Shrinivas Moorthi, Xingren Wu, Jiande Wang, Sudhir Nadiga, Patrick Tripp, David Behringer, Yu-Tai Hou, Hui-ya Chuang, Mark Iredell, Michael Ek, Jesse Meng, Rongqian Yang, Malaqu?as Pe?a Mendez, Huug van den Dool, Qin Zhang, Wanqiu Wang, Mingyue Chen, and Emily Becker. The NCEP Climate Forecast System Version 2. Journal of Climate, 27(6):2185?2208, March 2014. doi: 10.1175/JCLI-D-12-00823.1. URL https://journals.ametsoc.org/view/ journals/clim/27/6/jcli-d-12-00823.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. D. J. Lea, I. Mirouze, M. J. Martin, R. R. King, A. Hines, D. Walters, and M. Thurlow. Assessing a New Coupled Data Assimilation System Based on the Met Office Coupled Atmosphere?Land?Ocean?Sea Ice Model. Monthly Weather Review, 143(11):4678? 4694, November 2015. doi: 10.1175/MWR-D-15-0174.1. URL https://journals. ametsoc.org/view/journals/mwre/143/11/mwr-d-15-0174.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Patrick Laloyaux, Magdalena Balmaseda, Dick Dee, Kristian Mogensen, and Peter Janssen. A coupled data assimilation system for climate reanalysis. Quarterly Journal of the Royal Meteorological Society, 142(694):65?78, January 2016. ISSN 0035-9009. doi: 10.1002/qj.2629. URL https://doi.org/10.1002/qj.2629. Publisher: John Wiley & Sons, Ltd. Patrick Laloyaux, Eric deBoisseson,MagdalenaBalmaseda, Jean-RaymondBidlot, Stefan Broennimann, Roberto Buizza, Per Dalhgren, Dick Dee, Leopold Haimberger, Hans Hersbach, Yuki Kosaka, MatthewMartin, Paul Poli, Nick Rayner, Elke Rustemeier, and Dinand Schepers. CERA-20C:ACoupled Reanalysis of the Twentieth Century. Journal of Advances inModeling Earth Systems, 10(5):1172?1195,May 2018. ISSN1942-2466. doi: 10.1029/2018MS001273. URL https://doi.org/10.1029/2018MS001273. Publisher: John Wiley & Sons, Ltd. 161 Polly J. Smith, Alison M. Fowler, and Amos S. Lawless. Exploring strategies for coupled 4D-Var data assimilation using an idealised atmosphere?ocean model. Tellus A: DynamicMeteorology and Oceanography, 67(1):27025, December 2015. ISSN null. doi: 10.3402/tellusa.v67.27025. URL https://doi.org/10.3402/tellusa.v67. 27025. Publisher: Taylor & Francis. Polly J. Smith, Amos S. Lawless, and Nancy K. Nichols. Estimating Forecast Error Covariances for Strongly Coupled Atmosphere?Ocean 4D-Var Data Assimilation. Monthly Weather Review, 145(10):4011?4035, October 2017. doi: 10.1175/ MWR-D-16-0284.1. URL https://journals.ametsoc.org/view/journals/ mwre/145/10/mwr-d-16-0284.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Polly J. Smith, Amos S. Lawless, and Nancy K. Nichols. Treating Sample Covariances for Use in Strongly Coupled Atmosphere-Ocean Data Assimilation. Geophysical Research Letters, 45(1):445?454, January 2018. ISSN 0094-8276. doi: 10.1002/2017GL075534. URL https://doi.org/10.1002/2017GL075534. Publisher: John Wiley & Sons, Ltd. Polly J. Smith, Amos S. Lawless, and Nancy K. Nichols. The role of cross-domain error correlations in strongly coupled 4D-Var atmosphere?ocean data assimilation. Quarterly Journal of the Royal Meteorological Society, 146(730):2450?2465, July 2020. ISSN 0035-9009. doi: 10.1002/qj.3802. URL https://doi.org/10.1002/ qj.3802. Publisher: John Wiley & Sons, Ltd. Shu-Chih Yang, Matteo Corazza, Alberto Carrassi, Eugenia Kalnay, and Takemasa Miyoshi. Comparison of Local Ensemble Transform Kalman Filter, 3DVAR, and 4DVAR in a Quasigeostrophic Model. Monthly Weather Review, 137(2):693? 709, February 2009. doi: 10.1175/2008MWR2396.1. URL https://journals. ametsoc.org/view/journals/mwre/137/2/2008mwr2396.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. Andrew C. Lorenc. The potential of the ensemble Kalman filter for NWP?a comparison with 4D-Var. Quarterly Journal of the Royal Meteorological Society, 129(595):3183? 3203, October 2003. ISSN 0035-9009. doi: 10.1256/qj.02.132. URL https://doi. org/10.1256/qj.02.132. Publisher: John Wiley & Sons, Ltd. Andrew C. Lorenc, Neill E. Bowler, Adam M. Clayton, Stephen R. Pring, and David Fairbairn. Comparison of Hybrid-4DEnVar and Hybrid-4DVar Data Assimilation Methods for Global NWP. Monthly Weather Review, 143(1):212?229, January 2015. doi: 10.1175/MWR-D-14-00195.1. URL https://journals.ametsoc.org/ view/journals/mwre/143/1/mwr-d-14-00195.1.xml. Place: Boston MA, USA Publisher: American Meteorological Society. R. N. Bannister. A review of forecast error covariance statistics in atmospheric variational data assimilation. II: Modelling the forecast error covariance statistics. Quarterly Journal of the Royal Meteorological Society, 134(637):1971?1996, October 2008a. 162 ISSN 0035-9009. doi: 10.1002/qj.340. URL https://doi.org/10.1002/qj.340. Publisher: John Wiley & Sons, Ltd. R. N. Bannister. A review of forecast error covariance statistics in atmospheric variational data assimilation. I: Characteristics and measurements of forecast error covariances. Quarterly Journal of the Royal Meteorological Society, 134(637):1951?1970, October 2008b. ISSN 0035-9009. doi: 10.1002/qj.339. URL https://doi.org/10.1002/ qj.339. Publisher: John Wiley & Sons, Ltd. Stephen Haben. Conditioning and preconditioning of the minimisation problem in variational data assimilation. PhD thesis, University of Reading, 2011. L. De Cruz, J. Demaeyer, and S. Vannitsem. The Modular Arbitrary-Order Ocean- Atmosphere Model: MAOOAM v1.0. Geosci. Model Dev., 9(8):2793?2808, August 2016. ISSN 1991-9603. doi: 10.5194/gmd-9-2793-2016. URL https://gmd. copernicus.org/articles/9/2793/2016/. Publisher: Copernicus Publications. St?phaneVannitsemandValerioLucarini. Statistical and dynamical properties of covariant lyapunov vectors in a coupled atmosphere-ocean model?multiscale effects, geometric degeneracy, and error dynamics. Journal of Physics A: Mathematical and Theoretical, 49(22):224001, May 2016. ISSN 1751-8113. doi: 10.1088/1751-8113/49/22/224001. URL http://dx.doi.org/10.1088/1751-8113/49/22/224001. Publisher: IOP Publishing. E. Kalnay, Kingtse C. Mo, and J. Paegle. Large-Amplitude, Short-Scale Stationary Rossby Waves in the Southern Hemisphere: Observations and Mechanistic Experiments to Determine their Origin. Journal of Atmospheric Sciences, 43(3): 252?275, February 1986. doi: 10.1175/1520-0469(1986)043<0252:LASSSR>2. 0.CO;2. URL https://journals.ametsoc.org/view/journals/atsc/43/3/ 1520-0469_1986_043_0252_lasssr_2_0_co_2.xml. Place: Boston MA, USA Publisher: American Meteorological Society. 163