ABSTRACT

Title of Dissertation: DEVELOPMENT OF APPROACHES TO COMMON CAUSE DEPENDENCIES WITH APPLICATIONS TO MULTI-UNIT NUCLEAR POWER PLANT

Taotao Zhou, Doctor of Philosophy, 2018

Dissertation directed by: Professor Mohammad Modarres and Professor Enrique López Droguett, Reliability Engineering Program, Department of Mechanical Engineering

The term “common cause dependencies” encompasses the mechanisms that directly compromise component performance and ultimately cause degradation or failure of multiple components, referred to as common cause failure (CCF) events. CCF events have been a major contributor to the risk posed by nuclear power plants, and considerable research effort has been devoted to modeling the impacts of CCF based on historical observations and engineering judgment, through what are referred to as CCF models. However, most current probabilistic risk assessment (PRA) studies are restricted to single reactor units and cannot appropriately consider the common cause dependencies across reactor units. Recently, the common cause dependencies across reactor units have attracted considerable attention, especially following the 2011 Fukushima accident in Japan, which involved damage to multiple reactor units and releases of radioactive source terms. To gain an accurate view of a site's risk profile, a site-based risk metric representing the entire site rather than a single reactor unit should be considered and evaluated through a multi-unit PRA (MUPRA). However, multi-unit risk is neither formally nor adequately addressed in either the regulatory or the commercial nuclear environments, and there are still gaps in the PRA methods for modeling such multi-unit events. In particular, external events, especially seismic events, are expected to be very important in the assessment of risks related to multi-unit nuclear plant sites.
The objective of this dissertation is to develop three inter-related approaches to address important issues in both external events and internal events in the MUPRA.

1) Develop a general MUPRA framework to identify and characterize the multi-unit events, and ultimately to assess the risk profile of multi-unit sites.
2) Develop an improved approach to seismic MUPRA through identifying and addressing the issues in the current methods for seismic dependency modeling. The proposed approach can also be extended to address other external events involved in the MUPRA.
3) Develop a novel CCF model for components undergoing age-related degradation by superimposing the maintenance impacts on the component degradation evolutions inferred from condition monitoring data. This approach advances the state-of-the-art CCF analysis in general and assists in the studies of internal events of the MUPRA.

DEVELOPMENT OF APPROACHES TO COMMON CAUSE DEPENDENCIES WITH APPLICATIONS TO MULTI-UNIT NUCLEAR POWER PLANT

by

Taotao Zhou

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2018

Advisory Committee:
Professor Mohammad Modarres, Advisor and Chair
Adjunct Associate Professor Enrique López Droguett, Co-Advisor
Professor Gregory B. Baecher (Dean's Representative)
Associate Professor Gary A. Pertmer
Assistant Professor Monifa Vaughn-Cooke
Assistant Professor Katrina Groth

© Copyright by Taotao Zhou 2018

Dedication

To my parents and my fiancée, who always believe in me.

Acknowledgements

It has been an enjoyable and challenging journey at the University of Maryland. First and foremost, I wish to thank my advisors, Dr. Mohammad Modarres and Dr. Enrique López Droguett. This work would not have been possible without their support, patience and guidance. I appreciate all their contributions of time, ideas, and funding to make my Ph.D.
experience productive and stimulating. I am also grateful to all my dissertation committee members, Dr. Gregory B. Baecher, Dr. Gary A. Pertmer, Dr. Monifa Vaughn-Cooke and Dr. Katrina Groth, for reviewing the dissertation and providing valuable advice to improve this research.

I would like to thank all of those with whom I have had the pleasure to work during my time at the University of Maryland. My special appreciation goes to Mr. Jan Muehlbauer, my mentor and friend, for guiding me through the experimental work.

Nobody has been more important to me in the pursuit of my Ph.D. than the members of my family: parents, sisters, nieces and nephew. I am especially indebted to my parents, Jiasheng Zhou and Chaoying Li, whose love and support are with me in whatever I pursue. They are always the ultimate role models and have sacrificed so much to make this accomplishment possible. Most importantly, I wish to thank my loving and supportive fiancée, Baochan Wang, for her unconditional love and support. I would also like to thank my future in-laws for their support and encouragement. This dissertation is dedicated to them.

Table of Contents

Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
1.1. Background and Motivation
1.2. Research Objectives
1.3. Methodologies
1.3.1. Review the State-of-the-Art MUPRA
1.3.2. Develop a General MUPRA Framework
1.3.3. Develop an Approach to External-Event MUPRA
1.3.4. Advance the Studies of Internal Events in the MUPRA
1.4. Organization of the Dissertation
Chapter 2: A Review of Multi-Unit Nuclear Power Plant Probabilistic Risk Assessment Research
2.1. Abstract
2.2. Introduction
2.3. MUPRA Research Summary
2.3.1. Multi-Unit Event
2.3.2. MUPRA Modeling
2.3.3. Site-Based Risk Metric
2.4. Conclusions
Chapter 3: Advances in Multi-Unit Nuclear Power Plant Probabilistic Risk Assessment
3.1. Abstract
3.2. Introduction
3.3. Background
3.4. Multi-Unit Quantitative Health Objectives and Their Surrogate Metrics
3.4.1. Single-Unit CDF Metrics
3.4.2. Multi-Unit CDF Metrics
3.4.3. Multi-Unit LRF and LERF Metrics
3.5. Illustration of the MUPRA Approach
3.5.1. Multi-Unit Dependencies
3.5.2. Parametric Estimation of Common and Causal Dependencies in Multi-Units
3.6. Quantification of a Conceptual Example to Illustrate the MUPRA Approach
3.6.1. Single-Unit Marginal Cut Sets and Frequency Assessment
3.6.2. Double-Unit Cut Sets and Frequency Assessment
3.6.3. Marginal Single-Unit Important Events
3.7. Conclusions
Chapter 4: Issues in Dependency Modeling in Multi-Unit Seismic PRA
4.1. Abstract
4.2. Introduction
4.3. Seismic Fragility Evaluation
4.4. Issues with the Equivalence Hypothesis Between β-Factor and Correlation Coefficient
4.5. Issues with the Reed-McCann Method
4.5.1. Reed-McCann Method
4.5.2. Examination and Observations
4.6. Conclusions
Chapter 5: An Improved Multi-Unit Nuclear Plant Seismic Probabilistic Risk Assessment Approach
5.1. Abstract
5.2. Introduction
5.3. An Overview of Seismic Risk Quantification Methods
5.3.1. Seismic Hazard Analysis
5.3.2. Seismic Fragility Evaluation
5.3.3. Numerical Schemes
5.4. Proposed Hybrid Methodology with Demonstration
5.4.1. Seismic Risk Quantification at the Group Level
5.4.2. Multi-Unit Site Seismic Risk Quantification
5.5. Example Application to the Seismic MUPRA
5.5.1. A Two Reactor Unit Site Seismic PRA
5.5.2. The Two-Unit Site Seismic PRA Results
5.6. Conclusions and Recommendations
Chapter 6: A Common Cause Failure Model for Components under Age-Related Degradation
6.1. Abstract
6.2. Introduction
6.3. Proposed Approach
6.3.1. Modeling Scope and Assumption
6.3.2. Degradation Assessment
6.3.3. Estimation of the β-Factor for CCF Probability
6.4. Experimental Study
6.4.1. Experimental Design and Instrumentation
6.4.2. Pump Failure Analysis
6.4.3. Pump Degradation Assessment
6.4.4. Pump Degradation Model Development
6.4.5. Experimental Results for CCF Estimation
6.5. Conclusions
Chapter 7: Conclusions, Contributions and Recommendations
7.1. Conclusions
7.2. Contributions
7.3. Recommendations for Future Research
Bibliography

List of Tables

Table 3-1: Total number of LER end effects affecting multi-units
Table 3-2: LER events involving 2 or 3 units and estimation of probabilities of multiple events
Table 3-3: Unit-1 specific cut sets
Table 3-4: Unit-1 cut sets conditioned (causally) on Unit-2 events
Table 3-5: Unit-1 cut sets with initiating events are dependent (causally) on Unit-2 events
Table 3-6: Causal double unit sequences
Table 3-7: Common cause double unit sequences
Table 3-8: Significant cut sets for the single-unit CD
Table 3-9: Significant cut sets for the double-unit CD
Table 4-1: Four cases for examination purpose
Table 5-1: Composite correlation between the components
Table 5-2: Four cases for examination purpose
Table 5-3: PGA intervals and frequency
Table 5-4: Four cases to be analyzed for the selection of reference PGA level
Table 5-5: Results for the system of 2/3 failures
Table 5-6: Results for the system of 3/3 failure
Table 5-7: SSCs properties [131]
Table 5-8: SAPHIRE results and simulation results
Table 5-9: PGA intervals and frequency
Table 5-10: Multi-unit CDF results
Table 6-1: Failure analysis for each of the three pumps
Table 6-2: Results for regression and goodness-of-fit statistics

List of Figures

Figure 3-1: Classes of intra-unit dependencies [20]
Figure 3-2: Multi-unit CD as single event
Figure 3-3: Conceptual examples of unit-to-unit dependencies
Figure 3-4: An example of causal mapping an LER event of a multi-unit site
Figure 3-5: A conceptual two-unit logic for demonstration of classes of dependencies and their probabilistic treatment in the PRA
Figure 4-1: Example fragility curves [114]
Figure 4-2: Flowchart of the Reed-McCann method
Figure 4-3: Comparison between independent cases with fifty iterations (intersection & union)
Figure 4-4: Performance index with ten different numbers of iterations N
Figure 4-5: Comparison between dependent and independent cases (intersection & union)
Figure 5-1: Example seismic hazard curves [105]
Figure 5-2: Example fragility curves [114]
Figure 5-3: Flowchart of the simulation-based scheme at the group level
Figure 5-4: Results for the dependent and independent cases in Table 5-2 using the simulation-based scheme displayed in Figure 5-3
Figure 5-5: Example seismic hazard curve
Figure 5-6: Flowchart of the discretization-based scheme at the site level
Figure 5-7: β-factor with equivalent correlation coefficient of 0.8
Figure 5-8: β-factor for the three-component system
Figure 5-9: Results for the total site CDF metric
Figure 5-10: Results for the concurrent CDF metric
Figure 5-11: Results for the marginal CDF metric
Figure 5-12: Contribution of concurrent CDF to total site CDF
Figure 6-1: CCF for components under age-related degradation
Figure 6-2: Flowchart of the proposed approach
Figure 6-3: Proposed degradation assessment method
Figure 6-4: Characterization of CCF with the components’ degradation states and endurance to degradation
Figure 6-5: Test rig and instrumentation
Figure 6-6: Degradation index constructed based on process monitoring data
Figure 6-7: Degradation index constructed based on vibration monitoring data
Figure 6-8: Degradation index constructed based on AE monitoring data
Figure 6-9: Degradation index regarding three types of condition monitoring data
Figure 6-10: Testing profile with three phases
Figure 6-11: (a) Illustration of the degradation states of all three pumps at t = 1500 hours; (b) estimate of β-factor over the entire test
Figure 6-12: Estimate of β-factor for Phase 1 and Phase 2
Figure 6-13: (a) Condition-based maintenance policy; (b) imperfect maintenance characterized by the beta distribution with α=5 and γ=1.5
Figure 6-14: (a) Hourly evolution of β-factor; (b) distribution of β-factor in relation to component degradation and maintenance actions
Figure 6-15: (a) Three options for rejuvenation factor considered in sensitivity analysis; (b) mean β-factor changes according to twenty-seven imperfect maintenance policies
Figure 6-16: Mean β-factor changes according to nine perfect maintenance policies
Figure 6-17: CCF adjustment curve according to twenty-seven maintenance policies

Chapter 1: Introduction

1.1. Background and Motivation

Defense-in-Depth (DiD) involves introducing isolation, redundancy and diversity to address complexity and uncertainty in plant systems, structures and components (SSCs) and to enhance the reliability and safety of the nuclear power plant. The ensuing assumption is that redundant and diverse SSCs reduce the likelihood of failures and that isolated SSCs are fully independent. However, perfect isolation is not possible, and various types of dependencies do exist between components because of common design features, operational practices, safety culture, economic considerations, and construction layout [1]. These dependencies may defeat the isolation, redundancy and diversity principles, and ultimately lead to a class of component failures called dependent failures. The influence of these dependencies can be either explicitly modeled in the fault tree logic or implicitly treated as common cause dependencies leading to common cause failure (CCF) events.
The common cause dependencies are usually characterized [2] by root causes (i.e., pre-operational-related, operational-maintenance-related and operational-environment-related) and coupling factors (i.e., hardware-based, operation-based and environment-based), which impair the capacity of components to perform their design function and ultimately lead to CCF events.

CCF events have been recognized as significant contributors to the risk posed by nuclear power plants. Typically, the impacts of CCF are parametrically modeled. The relevant CCF models [3] may be grouped into two major categories: shock models (e.g., the binomial failure rate model) and non-shock models. The non-shock models are mainly adopted in PRA practice, and include the β-Factor Model, the α-Factor Model and the Multiple Greek Letter Model [4]. In these models, the CCF events are characterized by static CCF parameters that need to be quantified through statistical analysis based on historical observations and engineering judgment [5, 6]. However, these CCF models suffer from several major limitations, summarized as follows:

• The models are mainly built from generic operational experience and are usually not specific to the operating conditions of individual components.
• The number of observed failure events, particularly in nuclear power plants, is limited, especially for events involving failures of more than one identical or similar component.
• It is difficult to model asymmetrical components and to account for the dependencies among the components within multiple common cause component groups.
• The present parametric CCF models implicitly assume a constant failure rate, treating failures as fully random without considering the effects of degradation.
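To make the parametric treatment concrete, the following is a minimal sketch of the β-factor idea in its common textbook form, where a static fraction β of a component's total failure probability is attributed to common cause. The function names, the two-component redundant system, the rare-event approximation and the numerical inputs are illustrative assumptions and are not taken from this dissertation.

```python
from typing import Tuple


def beta_factor_split(q_total: float, beta: float) -> Tuple[float, float]:
    """Split a component's total failure probability into an independent
    part and a common cause part (textbook beta-factor form)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    q_ccf = beta * q_total           # failures shared by the whole group
    q_ind = (1.0 - beta) * q_total   # failures affecting one component only
    return q_ind, q_ccf


def redundant_pair_failure(q_total: float, beta: float) -> float:
    """Rare-event approximation of the failure probability of two redundant
    components, where the system fails only if both components fail."""
    q_ind, q_ccf = beta_factor_split(q_total, beta)
    return q_ind ** 2 + q_ccf


# Hypothetical inputs chosen only for illustration.
Q_TOTAL = 1.0e-3
BETA = 0.1

independent_only = redundant_pair_failure(Q_TOTAL, 0.0)
with_ccf = redundant_pair_failure(Q_TOTAL, BETA)
print(independent_only, with_ccf)
```

With these assumed inputs, the common cause term dominates and the system failure probability is about a hundred times larger than under full independence. Because β is a single fixed number here, the sketch also shows why such a model cannot reflect component-specific operating conditions or degradation over time.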
To address these limitations, there are three main approaches in the present literature to enhance these CCF models:

• Improve the quality and quantity of the CCF database by compiling the CCF event data in a more consistent manner. For instance, the International Common-Cause Failure Data Exchange (ICDE) Project [7] has been established to obtain both qualitative and quantitative insights into CCF by properly integrating many national experiences.
• Formulate a causal CCF model to account for the effect of specific root causes and coupling factors on the CCF. Bayesian networks are adopted as the main technique to establish the causal framework that probabilistically links all relevant sources. Examples include the Unified Partial Method and its extension referred to as Zitrou’s Model [8], the Kelly-CCF Method [9], the Alpha-Decomposition Method [10], and the General Dependency Model [11].
• Address other limitations of the current CCF models, for instance, by treating the dependencies among the components across multiple common cause component groups [12, 13], improving the uncertainty treatment [14, 15], and developing extensions of the current CCF models [16, 17].

It should be noted that most current PRA studies are restricted to single reactor units [18] and hence are referred to as single-unit PRA (SUPRA). The SUPRAs include scenarios exclusive to one reactor unit, implicitly assuming other units will be unaffected, and hence only consider the dependencies within the boundary of one reactor unit [19]. However, the dependencies across multiple reactor units could play critical roles in potential nuclear accidents, with the possibility of core damage in multiple reactor cores and spent fuel. The risk significance of multi-unit events was especially highlighted by the 2011 Fukushima accident in Japan. Furthermore, Schroer and Modarres [20] reviewed all the U.S. Licensee Event Reports (LERs) submitted to the U.S. NRC between 2000 and 2011, and confirmed the significance of multi-unit events because over 9% of the total LERs affected multiple units on a site [21]. Recent research and operational experience [22, 23] have recognized loss of offsite power (LOOP) and external events as the dominant multi-unit initiating events. Among these, seismic events are the most likely event sequences that challenge multiple radiological sources.

To gain an accurate view of a multi-unit site's risk profile, a site-based risk metric representing the entire site rather than a single reactor unit should be considered and evaluated through a multi-unit PRA (MUPRA). However, multi-unit risk is neither formally nor adequately addressed in either the regulatory or the commercial nuclear environments, and there are still gaps in the PRA methods for modeling such multi-unit events. The primary objective of this dissertation is to address the important issues faced in the development of MUPRA.

1.2. Research Objectives

Four research objectives are established to investigate the important issues in both external event and internal event analyses in MUPRA.

1) Conduct a holistic review of the state-of-the-art MUPRA research that facilitates the understanding of the status of MUPRA development, existing gaps and the need for future research.
2) Develop a general MUPRA framework that addresses three main questions: (a) how to identify and understand all the possible dependencies across reactor units? (b) are the current CCF parametric methods sufficient to model the impacts of multi-unit events? (c) how to develop appropriate site-based risk metrics for multi-unit scenarios?
3) As the most likely dominant multi-unit events with extremely limited observations, external events, especially seismic events, should be addressed specifically in the MUPRA development. The adequacy of the current methods for seismic dependency modeling should be reviewed, and a new approach should be developed if needed.
4) The number of CCF observations regarding internal events is not large enough, especially for events occurring in multiple units. Other data sources should be solicited, for instance, sensor monitoring data that could be used to infer the component states.

1.3. Methodologies

The methodologies of this dissertation are documented in the form of five articles: two have been published in leading journals, one has been presented and published in a peer-reviewed international conference, one has been accepted for publication in a peer-reviewed international conference, and one journal article has been prepared and is in review. Within these five articles, the four research objectives stated in Section 1.2 have been addressed. The methodologies are summarized in the following sections.

1.3.1. Review the State-of-the-Art MUPRA

The first objective was accomplished by a holistic review of state-of-the-art MUPRA of nuclear power plants. The detailed review is documented in Chapter 2, “A Review of Multi-Unit Nuclear Power Plant Probabilistic Risk Assessment Research”. The full text has been accepted for publication in the Proceedings of the 2018 International Conference on Nuclear Engineering (ICONE26). The research contents are highlighted as follows:

• Summarize the relevant activities to address and develop methodologies, including workshops, proceedings, projects and case studies.
• Review the different facets of MUPRA research, including multi-unit events, MUPRA modeling and site-based risk metrics.
• Identify existing gaps and the need for future research.

1.3.2. Develop a General MUPRA Framework

The second objective was accomplished by examining the advances in the MUPRA and proposing a general MUPRA framework.
The detailed methodology and results are documented in Chapter 3 “Advances in Multi-Unit Nuclear Power Plant Probabilistic Risk Assessment”. The full-text has been published in the Journal of Reliability Engineering & System Safety. The research contents are highlighted as follows: 6  Propose a general MUPRA framework that relies on expanding multiple single- unit PRAs by superimposing the impacts of multi-unit dependencies.  Build a systematic way to identify and understand the multi-unit events.  Develop quantitative approaches to characterize the multi-unit events and incorporate their impacts to the MUPRA model.  Offer the formal definitions of multi-unit site risk metrics based on the conventional single-unit risk metrics (e.g., CDF, LRF, LERF).  Demonstrate the proposed framework with a conceptual two-unit example. 1.3.3. Develop an Approach to External-Event MUPRA The third objective was accomplished by developing an improved approach to seismic MUPRA, which addresses the limitation of the current seismic dependency modeling methods and achieves a balance between risk estimation accuracy and computational simplicity. The approach can also be extended to address other external events involved in the MUPRA. The proposed research consists of two parts. The first part was to review the adequacy and identify the issues in the current methods for seismic dependency modeling in the MUPRA. The detailed review and discussions are documented in Chapter 4 “Issues in Dependency Modeling in Multi- Unit Seismic PRA”. The full-text has been presented and published in the Proceedings of the 2017 International Topical Meeting on Probabilistic Safety Assessment and Analysis (PSA 2017). Several issues were identified in the present methodologies for consideration of dependencies in seismic MUPRA: 7  Identify and demonstrate the inappropriate equivalence hypothesis between the β- factor and correlation coefficient. 
• Examine the Reed-McCann method through a comparison study, which showed that the Reed-McCann method cannot properly characterize the contribution of dependencies.

To address the issues identified in the first part, the second part developed an improved approach to external event probabilistic risk assessment for multi-unit sites. The detailed methodology and results are documented in Chapter 5 “An Improved Multi-Unit Nuclear Plant Seismic Probabilistic Risk Assessment Approach”. The full text has been published in the journal Reliability Engineering & System Safety. The research contents are highlighted as follows:
• Develop a seismic MUPRA methodology that properly considers the seismic-induced dependencies and can be implemented with standard PRA software tools. The proposed approach is based on a hybrid scheme that achieves a balance between risk estimation accuracy and computational simplicity.
• Apply the proposed approach to a three-component example to demonstrate the issues that arise from inappropriate use of the geometric mean as the reference level for the ground motion in the current discretization-based scheme for seismic risk quantification.
• Develop a case study of seismic-induced multi-unit scenarios for a hypothetical two-unit nuclear plant site.
• Conduct a feasibility analysis of three different multi-unit risk metrics.

1.3.4. Advance the Studies of Internal Events in the MUPRA

Because historical CCF data between plant units are very limited, the fourth objective was accomplished through development of a novel CCF model for components undergoing age-related degradation by superimposing the impacts of maintenance on the component degradation evolutions inferred from condition monitoring data. This approach advances the state-of-the-art CCF analysis in general and relies on physics-based models to develop internal CCF event likelihoods for the MUPRA.
The detailed methodology, experimental study and results are documented in Chapter 6 “A Common Cause Failure Model for Components under Age-Related Degradation”. The full text has been submitted to the journal Reliability Engineering & System Safety. The research contents are highlighted as follows:
• Propose a novel CCF model for components under age-related degradation by exploiting recent advances in sensor-based data analytic algorithms.
• Demonstrate the efficacy of the proposed approach using the diverse sensory data collected from a special-purpose experiment.
• Develop simulation and sensitivity studies to evaluate the effects of maintenance on CCF over the service lifetime.
• Demonstrate the applicability of the proposed approach to estimating the β-factor for the CCF probability of specific components.

1.4. Organization of the Dissertation

The dissertation is arranged into the following chapters.
• Chapter 2 provides a literature review of the state-of-the-art MUPRA and a summary of the relevant activities to address and develop methodologies for MUPRA.
• Chapter 3 examines the advances in the MUPRA and develops a general MUPRA framework to identify and characterize the multi-unit events, and ultimately to assess the risk profile of multi-unit sites.
• Chapter 4 reviews the adequacy of, and identifies the issues in, the current methods for seismic dependency modeling in the MUPRA.
• Chapter 5 develops an improved approach to seismic MUPRA, which addresses the issues identified in Chapter 4 and achieves a balance between risk estimation accuracy and computational simplicity.
• Chapter 6 proposes a CCF model for components under age-related degradation by superimposing the impacts of maintenance on the component degradation evolutions inferred from condition monitoring data. This approach advances the state-of-the-art CCF analysis in general and assists in the studies of internal events of MUPRA.
• Chapter 7 presents a summary of conclusions, contributions and recommendations for future research.

Chapter 2: A Review of Multi-Unit Nuclear Power Plant Probabilistic Risk Assessment Research¹

¹ The full text of this chapter is to appear in the Proceedings of the 26th International Conference on Nuclear Engineering (ICONE), July 22-26, 2018, London, England.

2.1. Abstract

The events at the Fukushima nuclear power station drew attention to the need for consideration of risks from multiple nuclear reactor units co-located at a site. As a result, considerable international interest and research effort have been dedicated to addressing multi-unit risks over the past few years. This paper presents a review of the state-of-the-art multi-unit probabilistic risk assessment (MUPRA) of nuclear power plants. The concept of MUPRA is briefly presented, and the relevant activities to address and develop methodologies are summarized, including workshops, proceedings, projects and case studies. The paper presents different facets of MUPRA research, including multi-unit events, MUPRA modeling and site-based risk metrics. The paper also identifies existing gaps and the need for future research.

2.2. Introduction

Defense-in-Depth involves introducing isolation, redundancy and diversity to address complexity and uncertainty in plant systems, structures and components (SSCs) and to enhance the reliability and safety of the nuclear power plant. The ensuing assumption is that the redundant and diverse SSCs reduce the likelihood of failures and that the isolated SSCs are fully independent. However, perfect isolation is not possible and various types of dependencies do exist between the components, because of common design features, operational practices, safety culture, economic considerations, and construction layout.
The possible dependencies may defeat the principles of perfect isolation, redundancy and diversity, and ultimately lead to a class of component failures called dependent failures. The significance of dependent failures has been well recognized. The influence of these dependencies could be either explicitly modeled in the fault tree logic or implicitly treated as common cause dependencies leading to common cause failure (CCF) events [3]. Considerable research effort has been dedicated to accounting for the contributions of dependent failures to the risk posed by nuclear power plants [24]. However, most probabilistic risk assessment (PRA) studies are restricted to single reactor units [18] and hence are referred to as single-unit PRAs (SUPRAs). The SUPRAs include scenarios exclusive to one reactor unit, assuming other units will be unaffected, and hence only consider the dependencies within the boundary of one reactor unit [19]. Note that the effects of dependencies across multiple reactor units could play critical roles in potential nuclear accidents, with the possibility of damage to multiple reactor cores and spent fuel [25]. As such, the current single-unit risk metrics [26] cannot capture contributions from multi-unit accidents and are inadequate for providing accurate insights into the multi-unit site’s risk profile.

These findings underline the urgent need for a multi-unit PRA (MUPRA) to assess the risk profile of multi-unit sites and identify the critical contributors to the entire site risk. However, there has been limited experience in performing MUPRAs in either the regulatory or the commercial nuclear environment. The first MUPRA study in the U.S. dates back to the Indian Point Station PRA performed in the early 1980s [27], which addressed dual-unit releases due to seismic and high wind hazards.
Another example of a MUPRA was the Level 3 PRA for the Seabrook Station in New Hampshire, U.S., performed in the mid-1980s [28]. Methods have also been recommended to address facets of a MUPRA analysis [29-34], yet no well-established integrated approach and understanding of the implications of MUPRA exist.

More recently, research groups in the U.S., Canada and other countries [35] have devoted considerable interest and research effort to addressing multi-unit risks, especially following the Fukushima accident of March 2011 in Japan, which involved multiple reactor units and radioactive source terms [36-38]. In the international arena, a series of activities (i.e., workshops, proceedings, projects and case studies) have been conducted or planned:
• The IAEA sponsored a workshop [39] that discussed the issue of MUPRA in 2012.
• NEA/CNSC organized a workshop [40] on this subject that took place in Ottawa, Canada, in November 2014.
• The International Seismic Safety Centre of the IAEA has been working on developing a series of Safety Reports for MUPRA [41, 42].
• Plenary lectures, special sessions and technical presentations on MUPRA issues have been conducted or planned in a series of proceedings, including the workshops organized by OECD [43-45], PSA2013 [46], the 12th PSAM [47], PSA2015 [48], the 13th PSAM [49], PSA2017 [50] and the 14th PSAM [51].
• The European countries have sponsored the project ASAMPSA_E [52] to investigate the challenges in light of the Fukushima accident.
• Most importantly, the IAEA has developed MUPRA methodology documents and continues to refine them, including through an ongoing case study involving a 4-unit nuclear plant site [53].

This paper aims to provide a review of the state-of-the-art MUPRA research. The MUPRA research is summarized from three aspects: multi-unit events, MUPRA modeling and site-based risk metrics.
This study would benefit researchers interested in MUPRA and contribute to identifying existing gaps and opportunities for future work.

2.3. MUPRA Research Summary

The MUPRA research could be summarized in terms of three categories: (1) identify and understand all the possible dependencies across reactor units to explicitly recognize the role of multi-unit events in enhancing the site safety; (2) develop a MUPRA methodology to characterize the multi-unit events and incorporate their impacts into the accident sequences involving multiple reactor units; (3) develop appropriate site-based risk metrics for multi-unit scenarios.

2.3.1. Multi-Unit Event

A variety of events, such as shared electric buses, failure of the grid, flooding and earthquakes, could result in dependencies across multiple reactor units and potentially lead to failure events of a similar or dissimilar nature during the same mission time. Unfortunately, much of what is known today about multi-unit events is based on what has been learned through operational experience and the Fukushima accident. A systematic approach should be built to fully identify and understand the multi-unit events [54].

Schroer and Modarres [20] reviewed all the U.S. Licensee Event Reports (LERs) submitted to the U.S. NRC between 2000 and 2011 and confirmed the significance of multi-unit events: over 9% of the total LERs affected multiple units on a site [21]. A classification scheme was also proposed, and the multi-unit events were sorted into six categories: initiating events, shared connections, identical components, proximity dependencies, human dependencies, and organizational dependencies. This classification scheme has been adopted in most current research, and a classification study was presented by Dennis et al. [55] for an integral pressurized water reactor (iPWR).
Knowledge of multi-unit events allows further development of screening criteria to identify the multi-unit events that need to be modeled. Three forms of root causes were identified [25]: (1) common site conditions (e.g., organizational dependency), (2) external events, of both an explicit type (e.g., failure of the grid) and an implicit type (e.g., similar design errors), and (3) common events or components (e.g., shared electric buses).

Recent research and operational experience [22, 23] have recognized loss of offsite power (LOOP) and external hazards as the dominant multi-unit initiating events. Among these, external events are the most likely event sequences that challenge multiple radiological sources. Furthermore, some hazards may occur in combination (correlated hazards [56]); such combinations may cause significant consequences and may even have occurrence frequencies comparable to those of individual hazards. Screening criteria need to be developed to identify the hazards that deserve additional consideration, and methodologies also need to be developed to assess the frequencies of individual and correlated hazards [36, 57].

2.3.2. MUPRA Modeling

MUPRA modeling involves two challenging tasks: characterizing multi-unit dependencies and incorporating their impacts into the accident sequences involving multiple reactor units. In general, MUPRA modeling could be categorized into either a dynamic approach or a static approach. A simple but practical approach to static MUPRA is proposed by the IAEA [53], in which the SUPRA logic is simplified and multiple SUPRAs are then integrated, using the traditional single-unit treatment of common cause dependencies to handle multi-unit dependencies. However, more complex and realistic methodologies have also been proposed, as discussed in this paper.

There have been a few dynamic MUPRA studies, which rely on dynamic simulators. For instance, Dennis et al.
[55] developed a framework for assessing the integrated site risk of small modular reactors using an upgraded version of ADS-IDAC; Mandelli et al. [58] presented a PRA analysis of a multi-unit plant using RAVEN as the stochastic method coupled with RELAP5-3D. Although the dynamic studies could provide more realistic multi-unit insights by considering the timing of the chain of events and more complex dependencies, they offer limited practicality because they are subject to several common limitations: (a) increased model complexity; (b) lack of information to support characterizing the very detailed time-based multi-unit scenarios; (c) a major constraint from the computational demand.

Most of the practical methods, including the IAEA methodology, rely on expanding the existing SUPRA model, because a sufficient basis for estimating the multi-unit dependencies is available from the traditional methods of common cause parametric estimation [25]. There has been considerable interest in developing such an integrated approach. The most important works in this direction are Yang [59], Vecchiarelli et al. [60], Hassija et al. [61], Le Duy et al. [62], Zhang et al. [63] and Modarres et al. [25]. Note that most of this research gives limited consideration to the data needed to characterize the impacts of multi-unit events. As such, Modarres et al. [25] proposed to extend the traditional parametric method for common cause failure to multi-unit dependent event situations and reported a four-stage approach [64] to assess the observed multi-unit incidents and failure events based on the LERs reported to the U.S. NRC. This approach provides a defensible technical basis to characterize the multi-unit events and was illustrated by analyzing the U.S. LERs from 2000 to 2011. The results showed that inter-unit dependencies among human errors, hardware failures and initiating events are generally large, but smaller than intra-unit common cause dependencies.
However, the study recommends that, to enhance the treatment of the uncertainties and the risk significance of different types of multi-unit events, importance measures [23] and impact vectors [65] be introduced in future research. Furthermore, more reported U.S. LERs should be analyzed, and advanced knowledge engineering tools and techniques could be applied to assist the identification and characterization of multi-unit events [66].

One must also note that simply lumping together multi-unit events with different features could lead to inaccuracies and even errors in the characterization of multi-unit events. The features of MUPRA modeling should be addressed specifically for the different types of multi-unit events, for instance, the LOOP [29-30], shared SSCs [32, 67], human errors [68], and external hazards [69, 70]. Among these, seismic events have received a lot of attention because of their likelihood of inducing multi-unit dependencies with significant consequences [33, 35, 57, 71-73]. The adequacy of the current methods for seismic dependency modeling in MUPRA was discussed by Zhou et al. [73]. Although most current research focuses on seismic hazards, the external event MUPRA methodologies can be extended to other external hazards such as high wind hazards.

2.3.3. Site-Based Risk Metric

Site-based risk metrics need to be properly developed to evaluate the multi-unit site risk profile and assure public health and safety [40]. The current practice is to define site-based risk metrics based on the conventional single-unit risk metrics: core damage frequency (CDF), large release frequency (LRF) and large early release frequency (LERF). Different types of site-based risk metrics have been proposed [25, 31, 20, 60], for instance, the conditional probability of multi-unit accident (CPMA), the site CDF describing the frequency of one or more core damage events, the concurrent multi-unit CDF, and the site LRF.
Yet no consensus has emerged on the most relevant choice of site-based risk metrics for risk management applications. Readers interested in the applicability of possible site-based risk metrics are referred to Samaddara et al. [69]. It appears, however, that the most commonly adopted metric in the present literature [20, 61, 62] is the site CDF, which is the frequency of at least one core damage per site per year. Most recently, Zhou et al. [35] used three multi-unit CDF metrics (site, concurrent and marginal) in a case study of a seismic-induced Small Loss of Coolant Accident (SLOCA) for a hypothetical two-unit site and concluded that the site CDF is the most appropriate multi-unit CDF metric for seismic risk when no correlation data is available.

There have also been a few Level 3 PRA studies [26, 34] that evaluate the multi-unit offsite consequences considering multi-unit releases. The applicability of the U.S. NRC qualitative safety goals and quantitative health objectives (QHOs) to multi-unit sites was validated by Hudson and Modarres [26], who assessed the surrogates for the QHOs and compared them to the safety goals. Currently, the U.S. NRC is conducting a Level 3 MUPRA analysis [74, 75].

2.4. Conclusions

This paper presented a review of the state-of-the-art MUPRA research and summarized the relevant literature in terms of three categories: multi-unit events, MUPRA modeling and site-based risk metrics. The relevant activities were also briefly summarized, including workshops, proceedings, projects and case studies. The paper identified some of the existing gaps and the need for future research.

Chapter 3: Advances in Multi-Unit Nuclear Power Plant Probabilistic Risk Assessment²

3.1. Abstract

The Fukushima Dai-ichi accident highlighted the importance of risks from multiple nuclear reactor unit accidents at a site. As a result, there has been considerable interest in Multi-Unit Probabilistic Risk Assessment (MUPRA) in the past few years.
For considerations in nuclear safety, the MUPRA estimates measures of risk and identifies contributors to risk representing the entire site rather than the individual units in the site. In doing so, possible unit-to-unit interactions and dependencies should be modeled and accounted for in the MUPRA. In order to effectively account for these risks, six main commonality classifications (initiating events, shared connections, identical components, proximity dependencies, human dependencies, and organizational dependencies) may be used. This paper examines advances in MUPRA, offers formal definitions of multi-unit site risk measures and proposes quantitative approaches and data to account for unit-to-unit dependencies. Finally, a parametric approach for the multi-unit dependencies is discussed and a simple example illustrates the application of the proposed methodology.

² The full text of this chapter has been published in the journal Reliability Engineering & System Safety, Volume 157, Pages 87-100, January 2017. http://dx.doi.org/10.1016/j.ress.2016.08.005

3.2. Introduction

Nuclear power plants consisting of more than one unit and other radioactive inventories are not formally evaluated in an integrated manner in the traditional Probabilistic Risk Assessment (PRA). Most PRAs are based on single-unit evaluations that do not provide a complete picture of all possible accident sequences and radioactive sources for assessing the radiological risk arising from severe events on the site. Models for accident sequences involving concurrent releases from multiple radiological sources on the site are still in their infancy.
The risk triplet [76], defined from a general perspective, also applies to multi-unit risk concerns, and sporadic ad hoc approaches exist that have considered seismically induced multi-unit accidents involving loss of coolant accidents, station blackout, and multi-unit common cause failures. Nevertheless, there is still a need for an integrated approach to Multi-Unit PRA (MUPRA). A formal approach to MUPRA would not only improve our understanding of the complete risk profile, but the results would also improve regulatory decision-making and accident management.

The accident in March 2011 involving the six-reactor Fukushima Dai-ichi nuclear power plant site clearly underlined that scenarios involving nearly concurrent release of multiple sources of radioactivity on a site and multiple core damage events should be carefully evaluated. The accident started with a seismic external event that led to a devastating tsunami. The tsunami, coupled with an emergency response that could not adequately cope with the complex, intertwined severe accident challenges to all six reactor units and their spent fuel storage facilities, resulted in serious radiological releases, despite some initially successful measures that delayed the releases and permitted public evacuations. The end result was severe core damage in three operating reactor units, along with a containment breach in one of the reactors and releases of radioactive material exceeded only by those of the Chernobyl accident. The major weaknesses in emergency response and the incompetence in accident management in handling multi-unit accidents with extended station blackout conditions were clearly alarming. The fact that two units were down for maintenance and refueling, together with the operation of a single emergency diesel generator, prevented core damage from occurring in all six units.

Multi-unit plants, although physically independent to a large extent, have many direct and indirect inter-connections that make them practically dependent.
Examples of these dependencies include certain initiating events simultaneously occurring in multiple units, a transient in one unit affecting some or all of the other units, proximity of the units to each other, shared structures or components (e.g., shared batteries and diesel generators), common operation practices and human actions, and substantial procedural and other organizational similarities. Besides considering all sources of radioactivity and dependencies among the facilities on a site, to gain an accurate view of a site's risk profile, a measure of Core Damage Frequency (CDF) and radiation release metrics such as the Large Early Release Frequency (LERF) representing the site rather than the unit should be considered and estimated in a fully integrated MUPRA.

MUPRA refers broadly to an extension of the traditional PRA techniques to assess the risks of multi-unit sites. This includes single-unit PRAs that consider the accident sequences that may propagate from one unit to another, fully integrated PRA models that address accident sequences that may involve any combination of reactor units and radiological sources, and hybrids of these models.

This paper discusses the technical aspects of an integrated MUPRA, including consideration of dependencies and assessment of the multi-unit dependency data and models for quantifying such dependencies. The paper also provides discussions on formal definitions and metrics for multi-unit site risks. Finally, parametric methods are used to address multi-unit dependency situations. A conceptual two-unit logic example is used to demonstrate the application of the proposed methodology.

3.3. Background

In a MUPRA it is necessary to account for possible interactions between the units collocated at a site when a single reactor accident may propagate to affect other units (causal interaction), or when a common cause event impacts multiple units and radiological sources concurrently.
Consideration of these interactions in MUPRA leads to some technical issues and challenges that this paper attempts to characterize and offer possible solutions for. It is clear that MUPRA requires development and modeling of initiating events, accident sequences, end states and risk metrics that are relevant to multi-unit sites.

There has been limited experience in performing MUPRAs in the past in the U.S., Canada, and other countries; however, multi-unit risk has been neither formally nor adequately considered for operating plant sites in either the regulatory or the commercial nuclear environment [20, 33, 37]. Fleming, Arndt, Omoto, Jung, et al. have recommended methods to deal with facets of a MUPRA analysis [30, 31, 34, 77], yet no well-established integrated approach and understanding of the implications of MUPRA exists. In the early 1980s, the PRAs for the Indian Point Station [27] addressed dual-unit releases as a result of seismic and high wind events. The other example of a MUPRA was the Level 3 PRA for the Seabrook Station in New Hampshire, U.S.A., performed in the mid-1980s [28]. More recently, MUPRAs have been performed for some CANDU plants in Canada [78]. There have also been some Level 1 PRAs of multi-unit sites that provide very limited consideration of the concurrent states of the other units. Unfortunately, much of what is known today about the risks of multi-unit sites is based on what has been learned through operating experience [20] and the multi-unit accident at the Fukushima Dai-ichi plant.

The U.S. Nuclear Regulatory Commission (NRC) has dealt with multi-unit risk in a limited manner. For example, after the Chernobyl accident the NRC issued recommendations on limiting the noble gases and airborne volatiles transported to the other units during an accident through a shared ventilation system³.
This included addressing issues such as control room habitability, contamination outside of the control room, smoke control, and shared shutdown systems [79]. Also, Criterion 5 of the General Design Criteria (GDC) [80] in the U.S. for nuclear power plants recommends no sharing of structures, systems and components (SSCs) among units at a nuclear plant site, “unless it can be shown that such sharing will not significantly impair their ability to perform their safety functions”. More recently, the U.S. NRC has been conducting an effort to create an integrated Level 3 PRA that includes the effects of multiple units, as well as the risk from all radiation sources onsite, such as the spent fuel pool [81].

³ A similar transport mechanism also occurred during the Fukushima Dai-ichi event, during which the fire/explosion at Unit 4 was caused by leakage of hydrogen released from Unit 3 through shared ductwork with Unit 4.

The U.S. nuclear industry’s integrated site risk solutions generally focus on only one facet of the MUPRA at a time without considering other concurrent events. For example, the station blackout event has been investigated because of its site impact and the interdependencies in its shared electrical systems. Similarly, the seismic-induced dependencies between units and component fragilities across a site have been of major interest. But although specific aspects of MUPRA have been examined in greater detail in the U.S. in an ad hoc fashion, no integrated approach exists.

In the international arena, the International Atomic Energy Agency (IAEA) has been working in this area, and its International Seismic Safety Centre has been developing a series of Safety Reports for MUPRA [36, 41, 42].
Also, the IAEA General Safety Requirements Part 4, in its requirement 4.31 for evaluation of external events, states: “… Where there is more than one facility or activity at the same location, account has to be taken in the safety assessment of the effect of a single external event, such as an earthquake or a flood, on all of the facilities and activities, and of the potential hazards presented by each facility or activity to the others.”

In 2012, the IAEA sponsored a workshop that discussed the issue of multi-unit PRA but did not offer any methodological solutions for performing practical PRA analysis, including: (1) the possible combinations of hazards induced by external events, (2) the ways they may affect a multi-unit site, and (3) how to address the dependencies among multiple units under the impact of external events [39]. NEA/CNSC organized a workshop on this subject that took place in Ottawa, Canada, in November 2014 [40]. Finally, the ASME/ANS Joint Committee on Nuclear Risk Management (JCNRM) is discussing the development of a standard for PRA applications to small modular reactors (SMRs), but this effort is still in its infancy with no standards available.

In moving forward, the NEA/CNSC Workshop in Ottawa listed many recommendations, including the following critical needs for further advances in the future:
1. Designation of additional risk metrics beyond Core Damage Frequency (CDF) and Large Early Release Frequency (LERF) to better capture the risk profile of multi-unit sites
2. Delineation of single and multi-unit accident sequences, including effects of single reactor/facility events on other units in the form of additional initiating events and accident scenarios
3. Accounting for multi-unit common cause and causal dependencies involving functional, human, and spatial dependencies, and development of supporting data to address inter-unit and intra-unit common cause failures
4. Evaluation of interactions between operator actions that would adversely affect multiple units
5. Proper accounting of the timing and amount of source-term releases from different units
6. Consideration of site conditions in restricting operator access, recovery actions and implementation of planned accident management measures
7. Definition of site-level plant damage end states, including the effects of cumulative radiological releases and other correlated hazards, as well as release categories reflecting multi-unit accidents, spent fuel storage, and other radiological sources
8. Proper accounting of risk in terms of frequency of events per site-year, including consideration of risk metrics for spent fuel pool accidents involving temporal variations in the amount of radiological contents of the pool
9. Improvements in human reliability models and analyses to address performance shaping factors unique to multi-unit accidents
10. Consideration of mission times for emergency equipment longer than the currently practiced 24 hours of operation
11. Site response to the same earthquake and correlation among the component fragilities in the MUPRA context
12. Modeling of multiple points of release from the plant site, including possible time lags of releases, and release energies for plume rise considerations

The recommendations listed above require significant development that is beyond the scope of this paper. However, this paper attempts to address the first three of the needs listed above, along with related discussions about some of the other needs.

In summary, considering only one reactor unit at a time, under the implicit assumption that other reactor units are appropriately protected, has been the practice so far in both deterministic and probabilistic safety assessments of multi-unit sites and facilities.
Indeed, this problem is recognized as an important issue by the NRC, CNSC and IAEA, but very little progress has been made in understanding and measuring the safety significance of multi-unit risks and implications and uses of the safety goals in the context of multi-unit sites. Despite the fact that single-unit PRAs provide very useful results, performing PRAs, one reactor at a time, could potentially yield misleading and optimistic risk insights [37] in situations involving multi-unit events, and “site-based risk metrics and methods should be defined and used in risk-informed decision making.”

3.4. Multi-Unit Quantitative Health Objectives and Their Surrogate Metrics

The US NRC policy statement on safety goals proposed two safety goals and associated Quantitative Health Objectives (QHOs) to articulate levels of acceptable risk, which later served as the de facto guidelines for using PRA results in regulation. The goals provided indices for the level of “public protection which nuclear plant designers and operators should strive to achieve.” Two safety goals were introduced in terms of public health risk, one addressing individual risk and the other addressing societal risk. The risk to an individual is based on the potential for death resulting directly from a reactor accident – that is, a prompt fatality. The societal risk is stated in terms of nuclear power plant operations as opposed to accidents alone, and addresses the long-term impact on those living near the plant site. The sources of societal risk include all the radiological sources onsite (i.e., reactor cores, spent fuel pools and radioactive waste facility). The safety goals were expressed in qualitative terms for a nuclear site, perhaps so that the philosophy could be understood. The NRC also expressed the qualitative goals for the safety of nuclear power plants in terms of individual and societal QHOs.
While the QHOs provided metrics to address the question of “how safe is safe enough?” around a nuclear plant site, practical implementation of the QHOs proved to be difficult because of the large uncertainties involved in the calculation of risk [21, 82]. To address the difficulty of uncertainties related to QHOs, the U.S. NRC observed that implementation of the safety goals using surrogate or subsidiary metrics that achieve the same intent as the QHOs but do not involve as much complexity can be useful in making regulatory decisions. These surrogate metrics anchor, or provide guidance on, an appropriate defense-in-depth philosophy that balances accident prevention and mitigation. In this light, it was indicated that a CDF of less than one in 10,000 years of reactor operation is a useful benchmark in making judgments about that portion of regulations that are directed to accident prevention. Similarly, a LERF of less than 1 in 100,000 years is a useful benchmark to help ensure a proper balance between prevention and mitigation. These considerations later evolved into the “benchmark” values of $10^{-4}$/year for CDF and $10^{-6}$/year for Large Release Frequency (LRF). In addition, the plant design is required to meet a containment performance goal, which includes (1) a deterministic goal that containment integrity be maintained for approximately 24 hours following the onset of core damage for the more likely severe accident challenges, and (2) a probabilistic goal that the Conditional Containment Failure Probability (CCFP) be less than approximately 0.1 for the composite of all core damage sequences assessed in the PRA.

It is the definition of these surrogate metrics that leads to difficulty when they are applied to multi-unit sites. The implicit assumption in using these metrics is that they are measured on the basis of a single unit, whereas the safety goals and QHOs are applicable to the entire site.
Since it is the entire site that imposes the public risks, we conclude that the measures of QHOs should apply and remain unchanged for multi-unit sites. Of course, the prompt fatality goal remains more restrictive than the latent cancer fatality goal in multi-unit releases. Accordingly, multi-unit risk should be below the QHOs for both prompt and latent fatalities. For multi-unit releases, surrogates for QHOs (such as CDF, LRF and LERF) for site risk should be defined, assessed and compared to the goals. An important consideration is whether the limits of $10^{-4}$, $10^{-6}$, and $10^{-5}$ per year for these surrogates in the context of site risk should change. The site-based acceptance limits for the QHO surrogates are outside the scope of this paper and need further analysis to justify proper limits. In the remainder of this section we will discuss possible definitions of these surrogates in the context of the site risk and MUPRA.

In MUPRA the definitions of the site CDF, LRF and LERF are more complex than the simple sum of the individual frequencies of a single unit because there are complex unit-to-unit dependencies. Dependencies between the units (albeit small) and other structures holding radiological materials do exist because of specific design features, operating practices, spatial layout of the site, safety features, and culture and organizational behavior and practices. Such dependencies, however, could contribute significantly to the likelihood of multi-unit core damage. In order to quantify a MUPRA, the CDF, LRF and LERF metrics must be clearly defined; dependencies between the units identified, accounted for and modeled; the PRA model of the site developed and quantified; and the health effects estimated.

3.4.1. Single-Unit CDF Metrics

Site risk can be viewed as an event in which one or more of the units experience the core damage (CD) event. Assume events $CD^{(1)}, \ldots, CD^{(n)}$ represent random variables describing the “events of a core damage” in reactor units 1 to n.
Considering the site risk in terms of a multivariate distribution (describing the joint random variables $CD^{(1)}, \ldots, CD^{(n)}$ and the risk of each unit), a single-unit CD risk can be expressed in two ways: marginal CD risk of each unit, or conditional CD risk of each unit. In each of these definitions, there are dependences within each unit as well as between the units. The conditional CD risk may be expressed as the CDF of one unit given one or more known states of the other units on the site. The marginal CD risk of one unit is then defined as the CDF of one unit regardless of the states of other units. The inter-unit dependencies may exist among all units or among a subset of them. Inter-unit dependencies must be defined and probabilistically expressed so as to make it possible to estimate the marginal CDF or conditional CDF of a unit. Figure 3-1 depicts Schroer’s unit-to-unit dependencies discussed earlier [21]. In a causal-dependency situation, the root causes of these dependencies can be viewed as the condition that couples the units together.

Figure 3-1: Classes of intra-unit dependencies [20]

3.4.2. Multi-Unit CDF Metrics

Consider Figure 3-2 depicting multi-unit risk in terms of site core damage composed of a set of mutually exclusive CD states for a hypothetical 3-unit site, where each unit can be expressed by its marginal or conditional definitions of CD. According to this figure, the multi-unit risk may be defined as the frequency of one or more CD events. For example, this definition corresponds to the union of the CD events of Units 1 through 3 represented by the Venn diagram of Figure 3-2. Alternatively, the multi-unit risk may be expressed as the frequency of multiple CD events, for example, two CD events of Units 1 and 2, but not Unit 3, as shown in the Venn diagram of Figure 3-2. These events, in turn, could be separated by significant time, or could exist nearly concurrently.
For simplicity, these events are assumed to occur simultaneously in the following discussion.

Figure 3-2: Multi-unit CD as single event

The union of the minimal cut sets of the individual units will represent the multi-unit CD. Two closely related definitions of the multi-unit CDF (or site CDF) may be the formal summation of individual unit CDFs, expressed either as the marginal probability (per year) for all conditions imposed by inter-unit dependencies, $\Pr(CD^{(i)})$, or as the conditional CDF of a unit $i$ given condition $C$, $\Pr(CD^{(i)} \mid C)$. Accordingly, the definition of the multi-unit CDF expressed as the annual probability of one or more core damage events based on the marginal probability of a single unit CD using the total probability theorem would be as below. The implicit assumption is that all reactor units onsite are subject to the same operating profile through the time period of interest, and that the Poisson model underlies the notion of frequency. It is understood, however, that site operations could change over time because of continual improvements such as design changes, improved organization, and operator training.

$$\Pr\Big(\bigcup_{i=1}^{n} CD^{(i)}\Big) = \sum_{i} \Pr\big(CD^{(i)}\big) - \sum_{i<j} \Pr\big(CD^{(i)} \cap CD^{(j)}\big) + \cdots + (-1)^{n-1} \Pr\big(CD^{(1)} \cap CD^{(2)} \cap \cdots \cap CD^{(n)}\big) \tag{3-1}$$

Using Boole’s inequality, a simpler and more conservative estimate of the multi-unit CD can be obtained from

$$\Pr\Big(\bigcup_{i=1}^{n} CD^{(i)}\Big) \le \sum_{i=1}^{n} \Pr\big(CD^{(i)}\big) \tag{3-2}$$

If a condition $C$ exists that couples a subset of reactor units, say 1 through k, each term of Equation (3-2) may be written as:

$$\Pr\big(CD^{(i)}\big) = \sum_{j} \Pr\big(CD^{(i)} \mid C_j\big) \Pr(C_j) \tag{3-3}$$

For multiple reactor core damage events under condition $C$,

$$\Pr\big(CD^{(1)} \cap \cdots \cap CD^{(k)} \mid C\big) = \Pr\big(CD^{(1)} \mid C\big) \Pr\big(CD^{(2)} \mid CD^{(1)}, C\big) \cdots \Pr\big(CD^{(k)} \mid CD^{(1)} \cap \cdots \cap CD^{(k-1)}, C\big) \tag{3-4}$$

From Equation (3-4) the total annual probability for the k units under all conditions would be:

$$\Pr\big(CD^{(1)} \cap \cdots \cap CD^{(k)}\big) = \sum_{j} \Pr\big(CD^{(1)} \cap \cdots \cap CD^{(k)} \mid C_j\big) \Pr(C_j) \tag{3-5}$$

Note that there may be causal dependencies among conditions such that one condition leads to others. Accordingly, the hierarchy of such multiple causal conditions is given by Equation (3-6), which expresses the probability of a condition as a sum over terms involving the joint probability of the conditions.
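As a numerical illustration of the union, Boole-bound and conditioning relations in Equations (3-1) through (3-5), the following minimal sketch assumes a hypothetical 3-unit site coupled by a single condition C. Every probability and name in it (`P_C`, `P_CD_GIVEN_C`, `site_cd_exact`, and so on) is an invented assumption for illustration, and the sketch further assumes that the CD events of the units are independent once the state of C is known. It is not taken from any plant model.

```python
from itertools import combinations, product

# Hypothetical annual probabilities (illustrative assumptions only).
P_C = 1e-3                               # Pr(C): coupling condition occurs
P_CD_GIVEN_C = [0.05, 0.05, 0.08]        # Pr(CD(i) | C) for units 1..3
P_CD_GIVEN_NOT_C = [2e-5, 2e-5, 3e-5]    # Pr(CD(i) | not C)

# The two mutually exclusive condition states and their probabilities.
CONDITIONS = [(P_C, P_CD_GIVEN_C), (1.0 - P_C, P_CD_GIVEN_NOT_C)]


def marginal(i):
    """Marginal CD probability of unit i, summed over condition states."""
    return sum(pc * cond[i] for pc, cond in CONDITIONS)


def joint(units):
    """Probability that all listed units are damaged, summed over condition
    states. Assumes the units are independent given each condition state."""
    total = 0.0
    for pc, cond in CONDITIONS:
        term = pc
        for i in units:
            term *= cond[i]
        total += term
    return total


def site_cd_exact(n):
    """Probability of one or more CD events by inclusion-exclusion."""
    total = 0.0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        total += sign * sum(joint(s) for s in combinations(range(n), k))
    return total


def site_cd_boole(n):
    """Conservative bound: simple sum of the marginal CD probabilities."""
    return sum(marginal(i) for i in range(n))


def exclusive_states(n):
    """Probabilities of the 2**n mutually exclusive damage patterns."""
    states = {}
    for pattern in product([False, True], repeat=n):
        p = 0.0
        for pc, cond in CONDITIONS:
            term = pc
            for i, damaged in enumerate(pattern):
                term *= cond[i] if damaged else 1.0 - cond[i]
            p += term
        states[pattern] = p
    return states


if __name__ == "__main__":
    print("exact site CD probability:", site_cd_exact(3))
    print("Boole upper bound:        ", site_cd_boole(3))
    print("units 1 and 2 both damaged:", joint((0, 1)))
    print("product of their marginals:", marginal(0) * marginal(1))
```

With these assumed inputs the joint probability of damage in two units is far larger than the product of their marginal probabilities, which is the effect a coupling condition is expected to have, while the Boole bound stays only slightly above the exact union probability.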
The problem now reduces to how one can determine the marginal annual probability of a CD event for one unit, or the conditional probability of a CD for a unit for a specific set of conditions. Both marginal and conditional measures are addressed in this paper. It should be recognized that the marginal and conditional measures are very different. Note that events and conditions are treated differently in this paper. In particular, the occurrence of an event may lead to a site transient that changes the site operating condition. For example, a site can be led to an abnormal “condition” by a seismic-induced loss of offsite power (LOOP) accident “event”, following which a series of mitigating systems will be called upon to bring the condition back to normal.

3.4.3. Multi-Unit LRF and LERF Metrics

To estimate the consequences of a multi-unit accident, it is necessary to estimate large releases of radioactivity that lead to prompt fatalities. Modarres et al. [83] summarized three options for estimation of large release frequency. In the first option the magnitude of release may be measured on the basis of associating a “large” release with an expectation that it would result in at least one early fatality. For example, the ASME/ANS Standard for PRA (RA-S-1.2-2014) [84] defines a “large early release” as a “rapid, unmitigated release of airborne fission products … such that there is a potential for early health effects.” Incorporating the effectiveness of temporal consequences, such as public evacuation and other protective actions, however, complicates the definition of a large release in this context. SECY-13-0029 [85] removes this complication by defining a release as large when it leads to an early fatality (with high probability) for a stationary individual standing one mile from the site. This is a simple and convincing measure. However, it nevertheless requires some assumptions when applied to a particular site.
To determine this measure of LRF, a hypothetical site should be assumed along with assumed meteorological data and an assumption of what constitutes a “high probability.” While identifying a representative site is possible, major conservatisms may be necessary to make it justifiable. For example, a site with medium to high population density, a weather stability condition resulting in high exposure, and limited evacuation routes may be a reasonable representative site.

The second option measures the large release (i.e., on the basis of magnitude of the source term associated with each multi-unit core damage scenario) in the form of either absolute or relative quantities of radionuclides released. The absolute measure is often expressed in terms of activity released to the environment as a surrogate for a quantitative calculation of dose. This is typically done for a few isotopes that tend to dominate estimates of offsite health effects, such as I-131 or Cs-137. For relative release, the traditional form expressed is fractional release of core inventory of various radionuclide groups to the environment, and the timing of the release may be specified. NUREG/CR-6595 [86] (Appendix A) suggests specific release fractions that may be considered as large (e.g., 2-3% of the iodine inventory). This option is simple to describe, but selecting the total amount of release or release fractions considered large is subjective and contentious [87-88].

The third option for large release de-emphasizes the amount of radioactivity released, by defining it in terms of the physical condition of systems, pressure boundaries and radionuclide barriers at the time release begins.
For example, a large release might be considered as one involving failure of multiple reactor pressure vessels and containment pressure boundaries due to isolation failure(s), bypass, or structural damage within a few hours of core melting and fission product release from fuel, during which opportunities for attenuation of the airborne concentration are minimal. Conditions associated with multiple units may also be defined, if necessary. Note that this option is typically used for defining LERF.

In summary, the three options for measuring LRF (surrogate for prompt fatality QHO) are: (1) Frequency of rapid, unmitigated release of airborne fission products that would result in at least one early fatality from the sites (NUREG/CR-6094 [89] suggests a stationary individual one mile from plant); (2) Frequency of site-level absolute or relative quantities of radionuclides released (absolute expressed in terms of activity released, relative in terms of the percent of available inventory, usually of I-131 or Cs-137); (3) Frequency of pre-set site-level plant states: physical operating states of systems, states of pressure boundaries, and/or states of the radionuclide barriers at the time release begins. Note that the prompt fatality in the safety goals applies to an average individual living in the region between the site boundary and one mile beyond. The latent cancer fatality in the safety goals, however, applies to an average individual living in the region between the site boundary and ten miles beyond.
Large early release frequency (LERF), proposed by EPRI and adopted in RG 1.174 [90] as the surrogate for the prompt fatality goal, is defined by the NRC as “the frequency of those accidents leading to significant, unmitigated releases from containment in a time frame prior to effective evacuation of the close-in population such that there is a potential for early health effects.” The use of system states to define large release for calculating LERF has been discussed in NUREG/CR-6595. Note that the NRC rejected the recommendation to use LERF ($10^{-5}$/year) in place of LRF ($10^{-6}$/year) in the Safety Goal Policy statement [88].

It appears that the LRF option for multi-unit sites would be preferable as it also is formally part of the NRC safety goals. For example, in option 3 discussed above, multi-unit reactor-system states and other conditions can be selected by performing a level-3 PRA based on a surrogate site, from which one can roll back to the level-2 release categories to see which ones contribute to one or more deaths. Having identified those release categories, the contributing unit (system states) with characteristics that may be designated as large releases can be defined. Because this method uses conservative site meteorological conditions, population density and evacuation, the resulting system states would certainly yield conservative risks for a single unit. Equally, its extension can justify events considered as conservative release frequency measures for simultaneous release events from multiple units.

Important factors that influence the prompt fatality risk relate to source term parameters: radionuclide activity, rate and timing of release, chemical and physical form of radionuclides, thermal energy, release fractions, etc. Level 3 consequence analysis would be needed assuming a “generic” site and applying multi-unit PRA scenarios to quantify and evaluate the implications of the NRC’s QHOs.
Although detailed site-specific Level-3 PRAs are not available for the U.S. plants, relevant information could still be obtained from the site-specific environmental impact analyses performed in support of the license renewal applications that involve simplified Level 3 analyses.

3.5. Illustration of the MUPRA Approach

The MUPRA involves development of an integrated model involving a combination of individual unit PRAs that also include unit-to-unit dependencies. Regardless of the risk definition used, to estimate multi-unit risk a MUPRA model representing each unit and dependencies among the units must be developed. Two general approaches are possible: 1) treating each unit through a separate unit-specific PRA and expanding the single-unit static PRAs into a multi-unit one by superimposing the effects of the unit-to-unit dependencies, or 2) using a Dynamic Probabilistic Risk Assessment (DPRA) to establish a simulation approach to capturing interaction of the units. The approach in Option 1, which is to develop a site-level static PRA, can be accomplished either through development of a single, integrated PRA model of the site, or by combining individual PRAs of each unit in the site. The single integrated site-level PRA approach is more complex and more applicable to simple designs. In this paper we focus on the approach involving combining single-unit static PRA models by superimposing all unit-to-unit dependencies. For related research on dynamic MUPRA see Dennis et al. [55].

To demonstrate the approach, consider Figure 3-3, which depicts a conceptual example of a MUPRA model representing a two-unit logic including the classes of interactions between the units. The unit-to-unit dependency models will be described next. Note that this logic is not related to any reactor design or site and serves in this paper only for MUPRA illustration purposes.
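The idea of combining single-unit static models while superimposing a unit-to-unit dependency can be sketched with a toy two-unit logic in which one basic event is shared by both units. The logic, the basic-event names (`A1`, `A2`, `D`) and the probabilities below are hypothetical assumptions chosen for illustration; they do not reproduce Figure 3-3 or any real plant model. The sketch quantifies the top events by brute-force enumeration of basic-event states, which is only practical for very small models.

```python
from itertools import product

# Hypothetical basic-event probabilities (illustrative assumptions only).
# A1, A2: unit-specific failures; D: failure of an SSC shared by both units.
PROB = {"A1": 1e-2, "A2": 1e-2, "D": 1e-3}


def unit1_cd(state):
    """Assumed unit 1 core damage logic: A1 AND D."""
    return state["A1"] and state["D"]


def unit2_cd(state):
    """Assumed unit 2 core damage logic: A2 AND D."""
    return state["A2"] and state["D"]


def probability(top_event, prob=PROB):
    """Exact top-event probability by enumerating all basic-event states,
    treating the basic events as mutually independent."""
    names = sorted(prob)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        if top_event(state):
            weight = 1.0
            for name in names:
                weight *= prob[name] if state[name] else 1.0 - prob[name]
            total += weight
    return total


p_cd1 = probability(unit1_cd)
p_cd2 = probability(unit2_cd)

# Integrated site logic: D is the same item in both units' models.
p_both_integrated = probability(lambda s: unit1_cd(s) and unit2_cd(s))
p_site = probability(lambda s: unit1_cd(s) or unit2_cd(s))

# Naive combination: multiplying unit results counts D twice.
p_both_naive = p_cd1 * p_cd2

if __name__ == "__main__":
    print("Pr(CD1) =", p_cd1, " Pr(CD2) =", p_cd2)
    print("both units, integrated logic:", p_both_integrated)  # about 1e-7
    print("both units, naive product:   ", p_both_naive)       # about 1e-10
    print("one or more units:           ", p_site)             # about 1.99e-5
```

Under these assumptions the naive product understates the two-unit core damage probability by three orders of magnitude, because the shared event is treated as two independent events instead of one.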
In the first step we define the unit-to-unit dependencies; next we demonstrate a method of parametrically quantifying such dependencies; and finally we use the parametric dependency estimates to quantify the conceptual example.

3.5.1. Multi-Unit Dependencies

Schroer [20, 21] discussed a number of dependency analysis methods and classified them into three major groups: combination, parametric and causal-based. A brief description of each group follows.

3.5.1.1. Combination

In this case, the condition $C$ in Equation (3-5) describes a common (shared) event that should be explicitly modeled in each unit’s PRA model (e.g., the event may be failure of a shared SSC serving multiple units, or the same initiating event affecting multiple units, such as loss of power). Such events simply need to be represented as an identical item in the MUPRA logic, so that they are not double counted in the quantification of the site risk metrics. This would be a simple problem to handle, with the accounting done as part of the PRA logic manipulation.

3.5.1.2. Parametric

Parametric methods are commonly used in the traditional single-unit PRAs for common cause failure events as a catch-all technique to address dependencies that cannot be modeled explicitly. However, internal hazards (e.g., internal fires and floods) and external hazards (e.g., earthquakes and high winds) are modeled explicitly in PRAs and are not treated by parametric analyses. The Seabrook PRA, one of the few MUPRAs, performed in 1983, used the parametric beta factor [3]; however, the parameters currently in use in the single-unit PRAs would not be applicable to the MUPRAs. Extension of the parametric methods to multi-unit situations is a major contribution of this paper that will be described in Section 3.5.2. Use of the parametric methods in MUPRA may lead to undue conservatism when the size of the common cause failure group is higher than four [3].
The current parametric methods may not adequately address MUPRAs because they use parameter estimators developed for a single unit, in which all similar trains are assumed to be challenged whenever one train of a system is challenged [3]. This is oftentimes not the case for multi-unit events. For example, during a single-unit reactor trip, the supporting systems for that unit will be called upon while other units’ systems usually continue with normal operation. Parametric methods may be used to assess common cause failures between redundant SSCs and human actions across multiple units. However, when it becomes difficult to explicitly model non-redundant SSCs (non-identical trains and equipment) as causal events, it would be possible to treat potential dependencies in these cases by the parametric method, although finding supporting data to estimate the parameters would be challenging.

3.5.1.3. Causal-Based

This type of dependent failure is important and poses difficult coupling mechanisms that contributed in a major way to the Fukushima Dai-ichi accident. There are two classes of causal dependencies, started either from within one unit or from events external to the site: (1) those started by an SSC failure, an initiating event or a human action in one unit leading to different SSC failures, initiating events and human actions in the other units, and (2) those started by an external event leading to different SSC failures, initiating events and human actions in the other units. As such, in this approach one attempts to estimate the MUPRA risk metrics conditioned on a causal factor common to multiple units. There are several techniques for causal-based probabilistic estimation, including the parametric method, physics-of-failure and Bayesian Belief Networks (BBN) [91]. This paper contributes to the parametric approach to causal-based multi-unit failures. However, ongoing research at the University of Maryland uses the BBN graph to model causal dependencies [92].
The benefit of the BBNs is that they allow dissimilar information to be combined, such as qualitative information like that from expert elicitation, as well as quantitative data. Also, the ongoing research expands the use of the probabilistic physics-of-failure model to model the causal dependencies [92]. The physics-of-failure approach allows the underlying physical failure mechanisms induced by the root cause of the condition (e.g., seismic impact, SSC failure in one unit causing failure in another, fatigue fracture, etc.) to be incorporated into the assessment of the corresponding probability and thus the entire risk model [93].

Figure 3-3 conceptually illustrates examples of the dependencies discussed above that originate from events within one unit or from external causes. In this example the root external event “C”, shown by a pentagonal symbol in Figure 3-3 (e.g., a seismic event), could lead (conditionally) to initiating events (similar or different events) in the multiple units. The probability of the conditional event (shown by a lozenge-shaped symbol) represents the likelihood that the root external event will cause the initiating events in unit i. Similarly, other less obvious common conditions across a site rooted in the organizational, design, environmental and operational events may also be the source of causal failures, leading to multiple failures (similar or dissimilar events) in more than one unit. These events are shown by a trapezoid (described as event “D”) in Figure 3-3 leading to similar events in the two units (i.e., failure of component B in units 1 and 2). Finally, event “D” describing shared events (shared SSCs, human, etc.) is explicitly modeled as part of the integrated MUPRA logic affecting multiple units (again through the conditional events W|D and Z|D, leading to different outcomes: failure events Z and W in units 1 and 2, respectively). Similar to the shared events, failures originating in one unit could lead to another event in the other unit.
This situation is shown in Figure 3-3 by event Y in unit 2, leading to an initiating event in unit 1 described by the corresponding conditional probability.

Figure 3-3: Conceptual examples of unit-to-unit dependencies

3.5.2. Parametric Estimation of Common and Causal Dependencies in Multi-Units

Section 3.5.1 discussed the types of dependencies in multi-units. It provided three classes of the root causes of the dependencies in the form of (1) common site conditions as denoted by event B (e.g., organizational dependency), (2) external events as denoted by event C, including both the explicit type (e.g., failure of the grid, earthquake, and flood) and the implicit type (e.g., similar design errors, procedure deficiencies, and inadequate operating environments), and (3) common events or components as denoted by event D (e.g., shared electric buses, internal flooding, internal fire). These root causes then, conditionally, could lead to failure events of similar or dissimilar nature in multiple units. In Figure 3-3 such conditional events were shown by the events depicted by a diamond-shaped symbol, whose probabilities should be estimated for an integrated MUPRA. Section 3.5.1 also discussed the parametric method as well as the more elaborate probabilistic physics-of-failure and BBN techniques.

It appears that the parametric methods to model unit-to-unit dependencies provide a quick and practical approach to quantifying important dependencies in MUPRA analyses. This approach will be discussed in the remainder of this section. To perform parametric estimates of dependent events across units, the traditional methods of common cause parametric estimation provide a sufficient basis for estimating both common cause (here referred to as the dependencies among similar events) and causal (diverse events) dependencies occurring in multiple units. For example, the Alpha-Factor Method [3] is a simple technique to estimate the parametric dependencies and conditional probabilities.
Further, evidence of multiple failure events must be available for such estimations. Considering Schroer’s [21] analysis of the Licensee Event Reports (LERs) [94] for the years 2000-2011 submitted by the U.S. nuclear plant operators to the USNRC, one may generate the data needed to parametrically estimate the specific conditional probabilities depicted in Figure 3-3. Note that uncertainties may arise since the information in, and the applicability of, the events from the LERs are subject to interpretation. In past single-unit common cause studies, such uncertainties were addressed with the notion of impact vectors [6, 95], which may also be applied to multi-unit issues. However, impact vectors are not considered at this preliminary stage of analysis and we conservatively assume that the LER events are applicable to all units. This serves as a preliminary analysis to gain insight into the magnitude of site risks versus single-unit risks.

It is expected that external events, and particularly seismic events, play major roles, yet such events are extremely rare in the LERs. For instance, the seismic event was identified to be involved in the dominant Seabrook multi-unit scenario. In the LER events of years 2000-2011, only one LER event [96] involved an earthquake: the August 23, 2011 earthquake in Mineral, VA that affected the North Anna plant and led to a dual-unit reactor trip and subsequent engineered safety feature (ESF) actuations. Therefore, it is part of our current research to study such highly significant events as seismic events in MUPRA analyses.

The methodology used to make the parametric estimations is rather straightforward. Suppose that the parameters of the parametric model are the fractions, $p_{ij}$, of the total probability of an event of interest (conditional or unconditional) that involve occurrences in multiple units, $i$, due to a common root cause event of type $j$.
The point estimate of $p_{ij}$ is calculated according to the binomial maximum likelihood estimator:

$$\hat{p}_{ij} = \frac{n_{ij}}{N} \tag{3-7}$$

where $n_{ij}$ is the total number of observed events of type $j$ (such as initiating event or human error) involving occurrences in $i$ reactor units ($i$ = 2 or 3 for the U.S. sites), and $N$ is the total number of LER events of type $j$ that occurred in the multi-unit sites. The probability interval within which the true probability, $p_{ij}$, resides can also be found using conjugate Bayesian estimation, assuming that the beta distribution represents the random variable $p_{ij}$. Accordingly, the beta distribution Beta($p_{ij}$; $\alpha$, $\beta$), with the cumulative distribution function of the form below, may be used to represent $p_{ij}$:

$$F(p_{ij} \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \int_{0}^{p_{ij}} t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt \tag{3-8}$$

where $\alpha$ and $\beta$ are the shape parameters of the distribution, which should be determined for the prior and posterior distributions. Representing the random variable $p_{ij}$ with Jeffreys' non-informative prior Beta($p_{ij}$; 0.5, 0.5) leads to the conjugate posterior distribution Beta($p_{ij}$; $n_{ij} + 0.5$, $N - n_{ij} + 0.5$). As such, using the posterior, the $100(1-\gamma)\%$ probability interval for $p_{ij}$ obtained from the inverse of Equation (3-8) is

$$F^{-1}\big(\gamma/2 \mid n_{ij} + 0.5,\ N - n_{ij} + 0.5\big) \le p_{ij} \le F^{-1}\big(1 - \gamma/2 \mid n_{ij} + 0.5,\ N - n_{ij} + 0.5\big) \tag{3-9}$$

where $F^{-1}(\cdot)$ is the inverse of Equation (3-8) with the parameters described.

A detailed analysis was made of all 4207 events that occurred in the years 2000-2011 and were reported in LERs. A subset of 2448 events was identified as occurring in reactors located in multi-unit sites, which were analyzed to determine the root cause and consequential outcome of such a root cause. Using this subset of LERs, Schroer’s data [21] shows that 391 of these LERs involved events that affected more than one unit. If one considers the LER events as precursors to multi-unit accidents, then the probability that a severe accident involves multiple units would be 391/2448, or 16%.
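The point estimate and the Jeffreys-prior probability interval described above can be sketched numerically as follows, using the 391 of 2448 counts quoted in the text as the example input. This is a minimal standard-library sketch and not the code used in the study: the function names (`beta_cdf`, `beta_ppf`, `jeffreys_interval`) are illustrative, the 90% interval level (gamma = 0.10) is an assumed choice, the beta CDF is evaluated with a standard continued-fraction expansion of the regularized incomplete beta function, and its inverse is found by bisection.

```python
import math


def _beta_cf(a, b, x, max_iter=1000, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Beta cumulative distribution function F(x | a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def beta_ppf(q, a, b, iterations=200):
    """Inverse of beta_cdf, found by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def jeffreys_interval(n, total, gamma=0.10):
    """Return (point estimate, lower bound, upper bound) for n events in
    `total` trials, using the Beta(n + 0.5, total - n + 0.5) posterior."""
    a, b = n + 0.5, total - n + 0.5
    return (n / total,
            beta_ppf(gamma / 2.0, a, b),
            beta_ppf(1.0 - gamma / 2.0, a, b))


if __name__ == "__main__":
    p_hat, lower, upper = jeffreys_interval(391, 2448)
    print(f"point estimate {p_hat:.4f}, 90% interval ({lower:.4f}, {upper:.4f})")
```

Where SciPy is available, `scipy.stats.beta.ppf` could replace the hand-written inverse; the standard-library version is used here only to keep the sketch self-contained.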
If one thinks of the LER events as precursors to severe accidents, then this number is consistent with the results of the Seabrook MUPRA model estimating a conditional probability core damage in a second unit given the first unit core damage of 14% [28]. Schroer [21] provides details of the events involving multi-unit occurrences in this period as also summarized by Schroer and Modarres [20]. As a follow on to Schroer’s analysis, the LER events for the same years were further analyzed as part of the research summarized in this paper to determine the root causes of the 2448 LER events that occurred in multi-unit sites in 2000-2011. The events were judged to have originated from one of the following primary apparent root causes: 49 1. HA: Human action 2. SSC: Structure, System, Component Failure or Degradation 3. OI: Organizational Issue 4. EEP: Event External to Plant 5. IE: Initiating Event A procedure was developed in this study to include events that can potentially cause failures in multiple units during accident conditions that are normally considered in the traditional single-unit PRAs. A summary of steps in this procedure to include and exclude LERs that be designated as precursors to multiple unit events is described below. Exclusion Criteria of the Multi-Unit LERs Identified by Schroer: 1. Organizational LER events were not considered since these are not currently explicitly modeled in the single-unit PRAs. Note that of the 391 multi-unit LERs that Schroer identified, 159 are organizational events, which were excluded in this analysis. 2. Events that are not typically considered in single-unit PRAs were eliminated. For example, events involving precautionary actions on a second unit because of events occurred in the first unit, such the LER where, in one multi-unit divers working on one of the unit’s piping became unresponsive, so the operators tripped the other unit due to concerns for divers’ safety [97]. 50 3. 
LERs involving violation of the technical specifications (missed, falsified or incorrect actions) were also eliminated from consideration.
4. Design errors that had no impact on the safety function of equipment or operator actions were eliminated.
5. Events involving software logic faults that did not affect emergency operation of equipment were also excluded.
Of the 232 non-organizational multi-unit LER events identified by Schroer, 114 were eliminated based on exclusion criteria 2 through 5 above. Therefore, only 118 events were deemed important for consideration in the estimation of the MUPRA events. Clearly, the remaining set of events is not very large, and detailed estimation at the level of very specific multi-component failures, multiple human errors and causal errors would not be possible. For example, the number of events is not large enough to estimate the common cause failure of a specific motor-operated valve across multiple units. As such, a mostly conservative approach was adopted in which all hardware equipment in two- or three-unit sites was grouped into one category. This effectively results in a higher estimated conditional probability of failure for dependent failures across multiple units.
To differentiate the multiple-unit events involving identical failures (common cause) from causal failures in two- or three-unit sites, the end effects of the 118 LER events selected as potential multi-unit precursor events leading to dependent failures across two or three units were further divided into more specific, mutually exclusive generic categories of LER events that resulted in:
1. Identical human error events in two units
2. Identical human error events in three units
3. Human error event in one unit caused different human error(s) in other unit(s)
4. Identical component failure/degradation events in two units
5. Identical component failure/degradation events in three units
6.
Identical initiating events in two units
7. Identical initiating events in three units
8. Initiating event in one unit caused different initiating event(s) in other unit(s)
9. Component failure/degradation in one unit caused initiating event(s) in other unit(s)
10. Component failure/degradation in one unit caused different component failure/degradation events in other unit(s)
11. Initiating event in one unit caused component failure/degradation event(s) in other units
This categorization of the end effects of the multi-unit site LER events was first done on a site-by-site basis, identifying the specific unit(s) affected, and the results were then aggregated into a large Excel-based database (footnote 4). The data were then used to estimate the parametric unit-to-unit dependencies as described by Equations (3-7)-(3-9). The above categorization yields the numerator values for Equation (3-7); however, the total number of LER events that actually involved multi-unit sites (i.e., the denominator in Equation (3-7)) must also be identified. Table 3-1 shows the number of 2000-2011 LER events involving multi-unit sites, each primarily attributed to one of three categories of events; these counts are later used as the variable N in Equations (3-7) and (3-9).
Footnote 4: Interested readers may request the Excel database from the corresponding author.
In arriving at the values in Table 3-1, the following criteria were used:
1. Only LERs that occurred at sites with more than a single unit were considered.
2. LER events involving organizational issues or technical specification violations were eliminated.
3. LER events were put into one of three categories: initiating event, component failure/degradation, and human error.
By using the above criteria, 2459 end effects of the multi-unit LERs remained for consideration (footnote 5). Of these, 400 occurred at sites with three reactor units, as shown in Table 3-1.
The results of the parametric probabilistic analysis of multiple identical and causal events are shown in Table 3-2 for the integrated data from all 35 U.S. multi-unit sites as of 2011 (footnote 6). The number-of-occurrences column of Table 3-2 gives the number of events of type j involving i units (i = 2 or 3 for U.S. plants), obtained by categorizing Schroer's data [21] into the mutually exclusive categories described in column 1 of Table 3-2, as discussed earlier. The point estimates and 95% Bayesian probability intervals for these events are shown in the last two columns of the table.
Footnote 5: Note that a few of the 2448 multi-unit LER events resulted in more than one end effect.
Footnote 6: The total number of multi-unit sites is 36 as of November 2015.
Note that in estimating the conditional probabilities in Table 3-2 using Equations (3-7) and (3-9), the values of N were obtained from columns 2 and 3 of Table 3-1. For example, N = 728 was used for estimating all two-unit initiating events (whether causal or identical). Similarly, N = 1390 was used for estimating any component failure for two-unit sites. For the three-unit cases, the values in column 3 of Table 3-1 were used. For example, N = 45 was used for any human error across three units (identical or causal).
The analysis summarized in Table 3-2 was repeated using the data for each specific plant site in order to observe any site-to-site variability. The site-specific analysis showed that the probability of human error ranges from 0 to 0.5, the probability of multiple SSC failures ranges from 0 to 0.25, and the probability of initiating events ranges from 0 to 0.17. The results show that differences do exist among the individual sites, especially for events involving human error.
The causal chain of events for each LER event was also mapped and added to the database. An example of one such causal chain mapping is shown in Figure 3-4.
This LER relates to a site with three reactor units. At the time of the event, Unit 2 was operating at full power when its operators were advised of a potentially important condition (cracking) affecting the integrity of the Control Element Assemblies (CEAs), which are critical to safe shutdown of the reactor. The advisory resulted from an inspection of the Unit 3 CEAs during refueling, which revealed one CEA with cracks. Because of the similar design and operating history of the CEAs, similar cracks were assumed to be present in Unit 2 as well. Subsequent examination discovered the same cracking condition in Unit 1 and Unit 3. Further inspections of the Unit 2 CEAs during replacement activities also found several CEA fingers with evidence of cracking near their lower ends. Among the multi-unit LER events, this event is counted as one occurrence of "Identical Component Failure/Degradation" in Table 3-2, out of the 221 such failures/degradations reported by the 3-unit sites (note that most of these 221 occurrences of equipment failure/degradation were confined to one unit of the site, and very few involved two or three units).

Table 3-1: Total number of LER end effects affecting multi-units

Event Description                  Number of Events, N,       Number of Events, N,
                                   for 2- or 3-Unit Sites     3-Unit Sites
Initiating Events (footnote 7)     728                        134
Component Failure / Degradation    1390                       221
Human Error                        341                        45
Total                              2459                       400

Footnote 7: There are two types of reactor scram: voluntary and controlled. Most of the events in Table 3-2 are general transients due to controlled shutdown.
Table 3-2: LER events involving 2 or 3 units and estimation of probabilities of multiple events

Events Categorization, j (identified for either i=2 for events involving 2 units, or i=3 for events involving 3 units) | Corresponding Events Shown in Figure 3-4 | Number of occurrences of type j events involving i units, $n_{ij}$, reported by Schroer [21] | Point estimate of the probability of the event, $\hat{p}_{ij}$ | The 95% posterior Bayesian interval within which the true $p_{ij}$ resides
Identical Human Error Event (2 Units) | HE → HE-S2 | 11 | 0.032 | (1.7E-02; 5.5E-02)
Identical Human Error Event (3 Units) | HE → HE-S3 | 1 | 0.022 | (2.4E-03; 9.9E-02)
Human Error Event in One Unit Causes Different Human Errors in Other Unit(s) | HE → HE-D | 0 | 0 | (1.4E-06; 7.3E-03)
Identical Component Failure/Degradation Event (2 Units) | SSC → SSC-S2 | 39 | 0.028 | (2.0E-02; 3.8E-02)
Identical Component Failure/Degradation Event (3 Units) | SSC → SSC-S3 | 2 | 0.009 | (1.9E-03; 2.9E-02)
Identical Initiating Event (2 Units) | IE → IE-S2 | 23 | 0.032 | (2.1E-02; 4.6E-02)
Identical Initiating Event (3 Units) | IE → IE-S3 | 2 | 0.015 | (3.1E-03; 4.7E-02)
Initiating Event in One Unit Causes Different Initiating Event in Other Unit(s) | IE → IE-D | 7 | 0.010 | (4.3E-03; 1.9E-02)
Component Failure/Degradation in One Unit Causes Initiating Event in Other Unit(s) | SSC → IE-D | 8 | 0.011 | (5.2E-03; 2.1E-02)
Component Failure/Degradation in One Unit Causes Different Component Failure/Degradation in Other Unit(s) | SSC → SSC-D | 24 | 0.017 | (1.1E-02; 2.5E-02)
Initiating Event in One Unit Causes Component Failure/Degradation in Other Units | IE → SSC-D | 1 | 0.001 | (1.5E-04; 6.4E-03)

Figure 3-4: An example of causal mapping an LER event of a multi-unit site

3.6. Quantification of a Conceptual Example to Illustrate the MUPRA Approach
Consider the conceptual two-unit logic example shown in Figure 3-5, which is similar to but not a real reactor MUPRA logic.
It is important to note that this example is used as a demonstration of the approach only and the corresponding results and conclusion are not applicable to any nuclear plant site. In this conceptual situation, four types of unit-to-unit dependencies have been modeled. These are the traditional common cause dependencies among identical components (e.g., between events ( ) and ( ) 8) ; causal dependencies between different events (e.g., ( ) ( ) described by the coupling event [ ( ) | ( ) ]{ ( )| ( ) ); causal dependencies between a ( ) component and initiating event (e.g., ( ) described the coupling event ( ) ( ) ); between identical initiating events caused by an external (coupling) condition such as earthquake or loss of offsite power (e.g., leading to ), and between an external (coupling) condition leading to a component failure [not 8 The superscript shows the unit number 57 explicitly modeled in this conceptual problem, but for example event leading to loss of components (i.e., ( ) ( ) —not shown in the figure). Some of the dependencies discussed were parametrically estimated and reported in Table 3-2 based on the U.S. multi-unit site experiences that are used in this conceptual example. Another dependency would be the existence of identical components shared between units (e.g., event G is shared between the two units). In reality this could be an electric bus or a battery feeding two units. 
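As a rough illustration of how a parametric estimate from Table 3-2 can couple the two units, the sketch below compares the probability that an identical component fails in both units under full independence with the value obtained from a conditional probability. The marginal failure probability (1.0E-03) is purely hypothetical and is not taken from the conceptual example; the conditional value 0.028 is the Table 3-2 point estimate for identical component failure/degradation in two units. The function names are illustrative only.

```python
def joint_independent(p_unit1, p_unit2):
    """Probability that both events occur if the units were fully independent."""
    return p_unit1 * p_unit2


def joint_dependent(p_unit1, p_unit2_given_unit1):
    """Probability that both events occur, using a conditional (coupling) probability."""
    return p_unit1 * p_unit2_given_unit1


# Hypothetical marginal failure probability of the component in each unit.
p_component = 1.0e-3
# Table 3-2 point estimate: identical component failure/degradation (2 units).
p_identical_given_first = 0.028

independent = joint_independent(p_component, p_component)
dependent = joint_dependent(p_component, p_identical_given_first)

print(f"independent units : {independent:.2e}")
print(f"with dependency   : {dependent:.2e}")
print(f"ratio             : {dependent / independent:.1f}")
```

The same product form applies to a causal coupling, with the conditional probability replaced by the relevant causal row of Table 3-2 (for example 0.017 for a component failure in one unit causing a different component failure in another unit).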
The cut sets of individual system fault trees for the top events ( ), ( ), ( ), and ( ), assuming the symbol "+" stands for the union operator in the Boolean algebra, "⋅" represents the intersection operator, " " shows the causal operator, and "*" represents undeveloped events, are ( ) ( ) ( ) ( ) ( ) ( ) [ ⏟ ( ) ( ) ] ( ) { ( )| ( ) { ( )} [ ( ) ( )] (3-10)~(3-13) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) 58 Figure 3-5: A conceptual two-unit logic for demonstration of classes of dependencies and their probabilistic treatment in the PRA Note that the symbols of the events are further described in Figure 3-5. Similarly, the cuts sets of the unit-1 sequences may be expressed as ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) [ ⏟ ] (3-14)~(3-17) ( ) ( ) { ( ) } ( ) ( [ ) ] ( ) ⏟ ( ) ( ) ( ) ( ) { } 59 Equally, by substitution the support system causal events can be further simplified to ( ) ( ( ) ) ( ) ( ) (3-18)~(3-20) ( ) ( ) ( ) ( ) ( ) , so Substitution of Equations (3-10)~(3-13) and Equations (3-18)~(3-20) into Equations (314)~(3-17) and reducing the Boolean relations will yield the minimal cut sets of the Unit-1 marginal CD sequences. Assume that for all basic events, whether only in one unit, shared or similar components in both, have the probabilities ( ) , ( ) , ( ) , ( ) , and probability of identical failures (common cause failure given one component failing such as ( ( ) ( )) ( ( )) ( ( )| ( )) are obtained from Table 3-2 (e.g., such as ( ( )| ( )) ). Similarly, for causal events originated from another unit, the probability of individual causal events such as [ ( ) ( )] can be expressed by ( ( ) ( ) ) ( ( )) ( ( )| ( )). Conditional probabilities such as ( ( )| ( )) can be obtained from parametric values in Table 3-2. For the initiating events, assume the root events leading to the initiating events and have frequencies of /year, /year with ( | ) , ( | ) . Finally, consider that component-to-component causal failure may be represented as ( ( )| ( )) . 60 3.6.1. 
Single-Unit Marginal Cut Sets and Frequency Assessment Using the Unit-1 minimal cut sets of the CD sequences and the data described earlier, Table 3-3 summarizes the associated frequency of each sequence cut set specific to Unit-1 only (in this paper frequency of CD is used instead of CDF, since this is not about core damage frequency of a real nuclear plant). Because of the symmetry between the logic of the two units, it is possible to simply change the superscript (1) to (2) in the cut sets shown in Table 3-3 to find Unit-2 specific cut sets. Table 3-3: Unit-1 specific cut sets Freq. Freq. Freq. Cut Set Cut Set Cut Set (/yr.) (/yr.) (/yr.) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) Total /yr. Unit-1 cut sets conditioned on the states of Unit-2 (i.e., occurrence or non-occurrence of dependent events) can also be quantified as described by Table 3-4. Cut sets of Unit-2 that are conditioned (dependent) on Unit-1 event states can equally be found by simply changing superscripts ( ) ( ) and ( ) ( ) in the cut sets of Table 3-4. 61 Unit-1 cut sets with initiating events caused by occurrence of Unit-2 events may also be quantified as shown in Table 3-5. Similarly, Unit- 2 cut sets with initiating events started a part of an event in Unit-1 can also be found by changing the superscripts ( ) ( ) and ( ) ( ) in the cut sets of Table 3-5. The total Unit-1 frequency would be the sum of the total frequencies from Tables 3-5: ( ) / yr. The frequency would be the same for Unit-2. This would be the marginal frequency (i.e., for reactor unit it would be CDF of that unit regardless of the conditions that initiated or affected the sequences that led to its core damage). Table 3-4: Unit-1 cut sets conditioned (causally) on Unit-2 events Freq. (/yr.) 
Cut Set ( ) [ ⏟ ( ) ( ) ] ( ) ( ) [̅̅ (̅̅ ̅) ( ( )| ( ))] ( ) ( ) ( ) [⏟ ] ( )( ( )| ( )) ( ) [ ⏟ ( ) ( )] ( )( ( )| ( )) ( ) [⏟ ( ) ( )] ( )( ( )| ( )) ( ) ( ) [ ⏟ ] ( )( ( )| ( )) Total /yr. 62 Table 3-5: Unit-1 cut sets with initiating events are dependent (causally) on Unit-2 events Freq. (/yr.) Cut Set Freq. (/yr.) Cut Set ( ) ( ) [ ] ( ) ( ) ( ) [ ( ) ( ) ( ) ] ( ) ( ) ( ) ( ) [ ( ) ( ) ( ) ] [ ( ) ( ) ] ( ) ( ) ( ) ( ) ( ) [ [ ( ) ( ) ] ( ) ( ) ( ) ] ( ) ( ) ( ) ( ) ( )[ ] ( ) ( ) ( ) [ ( ) ( ) ( ) ] [ ( ) ( ) ] ( ) ( ) ( ) [ ( ) ( ) ] ( ) ( ) ( ) Total /yr. 3.6.2. Double-Unit Cut Sets and Frequency Assessment Consider conditioned cut set probabilities of each unit (e.g.., conditioned on event x) expressed as ( ) ( ) ( ) (3-21) ( ) ( ) ( ) (3-22) Accordingly, using Equations (3-21)-(3-22), the probability of the intersection of cut sets would be ( ) ( ) ( ) (3-23) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ∑{ [ ] } 63 When no common condition exists, then Equation (3-23) simplifies to ∑ ( ) ( ) ( ) ( ) ( ). Or, if symmetric cut sets (i.e. ∑ ∑ ), Equation (3- ( ) 23) becomes {∑ } . Not all double cut sets using Equation (3-23) can produce the relatively largest CD frequencies. Denote common cause event as ( ) . Two cut sets account for 90.81% of the double-unit CD frequency: ( ( ) ( ) ), and ( ) ( ). The important causal double unit sequences and common cause double unit sequences (and their frequencies) after Boolean reduction are shown in Table 3-6 and Table 3-7. In summary, the double (concurrent) CD cut sets quantification leads to the following conclusions: Table 3-6: Causal double unit sequences Freq. (/yr.) Cut Set ( ( )| ( )) ( ) ( ) ( ( )| ( )) ( ) ( ) ( ( )| ( )) ( )( ( )| ( ) ) ( ) ( ) ( ( )| ( )) ( ) ( ) ( ) ( ) ( ) ( ( )| ( ) ) Total /yr. 64 Table 3-7: Common cause double unit sequences Freq. (/yr.) 
Cut Set ( ( ) ( )) ( ) ( ( ) ( ( ) ( ))) ( ( ) ( )) ( ( ) ( )) ( ) ( ( ) ( ( ) ( ))) ( ( ) ( )) ( ( ) ( ) ) ( ( ) ( ) ) ( ( ) ( )) ( ) ( ) ( ) ( ( ) ( ) ) Total /yr. 1. The frequency of double-unit CD frequency (total independence) without consideration and correction for causal or common cause dependencies yr. 2. Double-unit CD frequency with causal dependency correction, but without common cause parametric correction: /yr. 3. Double-unit CD frequency with common cause parametric correction, but without causal dependency correction: /yr. 4. Double-unit CD frequency with causal dependency correction and common cause parametric correction: /yr. 5. Contribution from CCF dependencies to the total double-unit CD frequency = 98.66%. 65 6. Contributions from causal dependencies to the total double-unit CD frequency = 1.18%. 7. Contribution from independent double-unit CD cut sets to the total double- unit CD cut set frequency = 0.16%. 8. The marginal CD frequency of unit 1or unit 2: /yr 9. Site-CD frequency (i.e., frequency of at least a CD) ( ) /yr. 10. Factors by which site CD frequency events are smaller than the double-unit CD frequency events . Note that while the double CD frequency is smaller than single marginal CD frequency (although not significantly), the source term will increase and the total site risk in terms of consequences (early death and health effects) will remain about the same or possibly involve a marked increase. The risks from single unit (marginal) risk and double unit risk are substantially different. Therefore, it is possible that the total site consequences and risk may increase, in some plant sites nonlinearly, due to the increased source term from multiple units. Source term releases can be staggered, even if triggered by the same external event. The emergency response in this case could involve evacuation of the surrounding population, thus affecting the consequences resulting from additional releases. 
The site consequence and risk, however, requires more analysis. This conclusion assumes that a large number of components due to common fragility to external events (such as external flood and 66 seismic) are not present. If they are present, the difference of the factor multiple and single CD events will have similar frequencies. 3.6.3. Marginal Single-Unit Important Events The cut sets with highest fractional contribution to the total CD frequency of a single unit are shown in Table 3-8 (take unit 1 for illustration): Table 3-8: Significant cut sets for the single-unit CD Fractional Cut Set Contribution 19.19% ( ) ( ) ( ) 19.19% ( ) ( ) ( ) 18.04% ( ) ̅̅( ̅̅ ̅̅ ̅̅(̅̅ ̅̅( ̅̅) ̅̅(̅̅ )̅̅))̅ (̅̅ ̅̅ ̅̅ ̅̅(̅̅ ̅̅( ̅̅) ̅̅(̅̅ )̅̅))̅ ( ) ( ) ( ) ( ) 18.04% ( ) ̅̅ ̅̅ ̅̅ ̅̅ ̅̅ ̅̅( ̅̅) ̅̅(̅̅ )̅̅ ̅ ̅̅ ̅̅ ̅̅ ̅̅ ̅̅ ̅̅( ̅̅) ̅̅(̅̅ )̅̅ ̅ ( ) ( ) ( ) ( ) ( ( )) ( ( )) 9.69% ( ) ̅̅ ̅̅(̅̅ )̅̅ 3.79% ( ) [ ̅̅( ̅̅( ̅̅)̅) ̅̅( ̅̅ ̅̅ ̅̅(̅̅ (̅̅ )̅̅ ̅̅( ̅̅))̅̅) ̅̅((̅̅ ̅̅( ̅̅)̅̅ ̅̅( ̅̅))̅̅)] ( ) ( ) 1.94% ( ) ( ) 1.94% ( ) ( ) 1.94% ( ) ( ) The cut sets with highest fractional contribution to the double-unit CD frequency are shown in Table 3-9: 67 Table 3-9: Significant cut sets for the double-unit CD Fractional Cut Set Contribution 66.91% ( ( ) ( )) 23.90% ( ) ( ) ( ) ( ( ) 1.43% ( ( ) ( ))) ( ( ) ( )) ( ( ) ( )) ( ) ( ( ) ( ( ) ( ))) 0.41% ( ) ( )( ( )| ( ) ) 0.41% ( ) ( )( ( )| ( ) ) 0.37% ( ) ( ) ( ) ( ) ( ) ( ) 0.37% ( ( ) ( )) ( ( ) ( )) ( ) ( ) 0.37% ( ) ( ) ( ) ( ) 3.7. Conclusions In this paper a conceptual procedure was proposed to evaluate and assess the contribution of multi-unit risk for applications to PRA analyses. 
The procedure identifies and explicitly models four categories of dependencies among units: 1) common (identical) SSCs shared between multiple units; 2) causal dependence of an event (SSC state) in one unit on another event(s) in other units; 3) causal dependence of an initiating event and/or SSC failures in one unit on an event external to the SSCs of other units (seismic, flood, loss of power); 4) parametric (traditional) common cause events within one unit and across multiple units among similar SSCs, initiating events or human errors. The paper also reported an analysis of eleven years of U.S. LER data from multi-unit sites to estimate the probability of common and causal failures among the components. A conceptual two-unit logic example was used to demonstrate the multi-unit PRA procedure proposed in this paper. Results from the analysis showed that all dependencies are important, but the traditional CCF events dominate. Causal events initiated by external events, which could substantially reduce the margin of protection and mitigation, are also important to multi-unit risk. From this example and from qualitative examination of the single-unit PRA, it appears that the multi-unit CDF would be small and that it would be dominated by the traditional common cause failures, which can be addressed through traditional parametric methods. Causal core damage sequences starting from another unit could be significant. These conclusions are based on the simple example used in this paper and need to be further validated using real reactor units.
Acknowledgements
The authors gratefully acknowledge the extensive review and extremely helpful comments and suggestions of one of the anonymous reviewers, which substantially improved the accuracy and quality of this paper.
Chapter 4: Issues in Dependency Modeling in Multi-Unit Seismic PRA (footnote 9)
4.1.
Abstract
This paper addresses issues related to dependency modeling in multi-unit seismic probabilistic risk assessment of nuclear power plants. The concept of multi-unit probabilistic risk assessment (MUPRA) is briefly summarized. The current methodologies for seismic-induced dependency modeling are discussed and grouped into four main approaches. Several issues are identified in the present methodologies for consideration of dependencies in seismic MUPRA. It is shown that the β-factor and correlation coefficient approaches to account for dependencies are different. Further, the paper highlights the weakness of the Reed-McCann method in modeling dependencies. These findings underline the need for improved methods for characterizing dependencies in multi-unit structures, systems and components (SSCs) with shared features and their links in MUPRAs.
Footnote 9: The full text of this chapter has been published in the Proceedings of the 2017 International Topical Meeting on Probabilistic Safety Assessment and Analysis (PSA 2017), September 24-28, 2017, Pittsburgh, PA.
4.2. Introduction
It is evident from the 2011 Fukushima Daiichi accident that correlated external hazards resulted in initiating events and sequences that challenged multiple radiological sources on the plant site, including reactor cores and spent fuel pool storage. The earthquake damaged the electric power supply lines at the site, and the subsequent tsunami extensively damaged the operational and safety infrastructures of the site [98]. These combined effects devastated this six-unit nuclear power site and eventually resulted in the melting of multiple reactor cores (i.e., Units 1, 2 and 3) and spent fuels (i.e., Unit 4). It became evident that the interaction across reactor units, namely unit-to-unit dependencies, played a critical role. One example of such interaction is the explosion in Unit 4, which was caused by hydrogen that escaped from Unit 3 through the shared ductwork.
Recently, the different facets relevant to multi-unit sites (i.e., hazards, methodological challenges, site risk metrics, safety goals and associated quantitative health objectives) have been investigated, and possible solutions have been developed by international experts [36, 39-42] and by research groups in the U.S. [1, 20, 25, 31, 34, 37, 55, 81, 99], Japan [33, 57, 69, 72], South Korea [30, 100, 101], India [61, 102], Canada [60, 78], France [62], and China [63]. This work has addressed multi-unit risks due to loss of offsite power, station blackout, and seismic events.
External hazards have been recognized by recent research as the events most likely to affect multiple reactor units with significant consequences [56]. As was evident from the Fukushima accident, combined hazards can occur that involve causally related natural events such as an earthquake and a tsunami. Seismic events in particular have been recognized as the most significant natural events that induce unit-to-unit dependencies [33, 72, 103, 104]. Typically, seismic PRA (SPRA) [105] evaluates the impacts of seismic-induced dependencies in SSCs.
The main challenge is to appropriately specify the level of dependency and to incorporate the dependency effects in seismic MUPRA. Seismic-induced failures of SSCs depend on the magnitude of the ground motion resulting from the earthquake. Failure data can be used for dependency estimation only if the failed SSCs are of similar types and are under equal seismic loading. As such, the amount of useful data for estimating dependencies due to seismic events is extremely scarce. In practice, one must assume either that the seismic failures are completely independent or that they are perfectly dependent; both assumptions are incorrect, and the truth lies somewhere in between. The approaches in the literature for accounting for dependencies can be grouped into four main categories:
1.
Follow the traditional parametric approaches used in common cause failure (CCF) modeling in the internal events PRA. For the case of partial dependencies, engineering judgment must be used, which could potentially lead to conservative or even non-conservative results. An example of this approach is the evaluation of the seismic fragility of the heat transport system of a liquid metal fast breeder reactor, considering partial correlation based on engineering judgment [106].
2. Incorporate the dependencies by means of the linear correlation coefficient between seismic failures. In the absence of statistical evidence or engineering experience, one might resort to analytical and simulation techniques. The first use of this approach dates to the studies of the Seismic Safety Margins Research Program (SSMRP) in the 1980s [107]. Since the SSMRP approach requires extensive analysis of multiple time histories and considerable simulation effort, a set of simplifying rules [108] was developed in 1990 based on the SSMRP findings.
3. Incorporate the dependencies by means of the correlation of the ground acceleration capacities. The essential idea is that the dependencies arise in the various attributes considered in the existing fragility development [109], and grouping these attributes can suggest a partial correlation between SSCs within the same or different classes. It is therefore practical to account for full or partial correlation between SSCs. This idea was originally proposed by Reed et al. and is referred to in the literature as the Reed-McCann method [110]. The correlation of ground acceleration capacities is treated as constant regardless of the magnitude of ground motion, unlike the correlation of seismic failures, which usually increases with larger ground motion. Recent work by R. J. Budnitz considers the Reed-McCann method a suitable approach to address partial correlation in seismic events [71].
4.
Combined methodologies have also been introduced to take advantage of specific features of the three approaches above. They usually determine the relationship between the CCF parametric estimate and the correlation coefficient of failure occurrences, so as to estimate CCF models (e.g., the β-factor). For example, the β-factor may be set equal to the correlation coefficient [111]. M. Pellissetti treated the seismic-induced dependencies by integrating the correlation models [108] developed in NUREG-1150 into a fault tree based seismic PRA with the β-factor model [112].
This paper addresses two issues related to the above methodologies:
1. Lack of rigorous discussion of the appropriate relationship between the β-factor and the correlation coefficient.
2. Lack of rigorous discussion of the performance of the numerical quantification procedure in the Reed-McCann method.
To address these issues, Section 4.3 briefly summarizes the seismic fragility evaluation. Section 4.4 shows that the equivalence assumption between the β-factor and the correlation coefficient is inappropriate. Section 4.5 highlights limitations of the quantitative procedure of the Reed-McCann method. Section 4.6 summarizes the results and contributions of this paper. All the simulations and computations in this paper are performed using the open-source language and computing environment R [113].
4.3. Seismic Fragility Evaluation
The seismic fragility is defined as the conditional failure probability given an earthquake. The limit state of each SSC is characterized as binary (i.e., fail or survive), depending on the global ground motion intensity (e.g., peak ground acceleration), the ground acceleration capacity and the associated uncertainties.
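The conditional failure probability described above is commonly modeled with a lognormal fragility curve, in the form of Equations (4-2) and (4-3). The sketch below is a minimal Python illustration of that form, not the R code used for the computations in this paper. It assumes a median capacity `a_m`, logarithmic standard deviations `beta_r` (aleatory) and `beta_u` (epistemic), and a confidence level `q`; the numerical values in the example are hypothetical and do not describe any real component.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sigma 1)


def fragility(a, a_m, beta_r, beta_u, q=0.5):
    """Conditional failure probability at ground motion a for confidence level q."""
    shift = beta_u * _STD_NORMAL.inv_cdf(q)   # epistemic shift of the curve
    return _STD_NORMAL.cdf((math.log(a / a_m) + shift) / beta_r)


def composite_fragility(a, a_m, beta_r, beta_u):
    """Composite curve using the total uncertainty sqrt(beta_r^2 + beta_u^2)."""
    beta_c = math.hypot(beta_r, beta_u)
    return _STD_NORMAL.cdf(math.log(a / a_m) / beta_c)


# Hypothetical component: median capacity 0.87 g, beta_R = 0.25, beta_U = 0.35.
A_M, BETA_R, BETA_U = 0.87, 0.25, 0.35

for pga in (0.3, 0.6, 0.9, 1.2):
    low = fragility(pga, A_M, BETA_R, BETA_U, q=0.05)
    med = fragility(pga, A_M, BETA_R, BETA_U, q=0.50)
    high = fragility(pga, A_M, BETA_R, BETA_U, q=0.95)
    comp = composite_fragility(pga, A_M, BETA_R, BETA_U)
    print(f"a = {pga:.1f} g: 5% = {low:.4f}, median = {med:.4f}, "
          f"95% = {high:.4f}, composite = {comp:.4f}")
```

Evaluating the function over a range of ground motion intensities for several values of `q` traces a family of curves of the kind usually plotted for a component, with the composite curve lying between the low- and high-confidence curves.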
The ground motion capacity of each SSC is denoted as follows:

\[ A = A_m\,\varepsilon_R\,\varepsilon_U \tag{4-1} \]

where A is the ground acceleration capacity, $A_m$ is the median ground acceleration capacity, and $\varepsilon_R$ and $\varepsilon_U$ are random variables representing the aleatory uncertainty and the epistemic uncertainty of the median estimate, respectively. Both $\varepsilon_R$ and $\varepsilon_U$ are assumed to be lognormally distributed with unit medians and logarithmic standard deviations $\beta_R$ and $\beta_U$. A failure is considered to occur if the ground motion intensity exceeds the associated ground acceleration capacity. The fragility function is typically characterized by a cumulative lognormal distribution, as given by Equation (4-2):

\[ f(a) = \Phi\!\left[\frac{\ln(a/A_m) + \beta_U\,\Phi^{-1}(Q)}{\beta_R}\right] \tag{4-2} \]

where $\Phi$ is the cumulative standard normal distribution, $a$ is the ground motion intensity of the earthquake, $A_m$ is the median ground acceleration capacity, $\beta_R$ represents the aleatory uncertainty, $\beta_U$ represents the epistemic uncertainty, and Q is the desired non-exceedance probability or confidence.
Fragility curves express the failure probability of the SSCs due to earthquakes as a function of ground motion intensity. An example of typical fragility curves is shown in Figure 4-1, including the median, 5% confidence, and 95% confidence curves.
Figure 4-1: Example fragility curves [114]
The fragility function can also be expressed in terms of the total uncertainty, as shown in Equation (4-3), which generates the so-called composite curve displayed in Figure 4-1. This composite curve is often used as an equivalent of the mean curve of the family of fragility curves.

\[ f(a) = \Phi\!\left[\frac{\ln(a/A_m)}{\beta_C}\right] = \Phi\!\left[\frac{\ln(a/A_m)}{\sqrt{\beta_R^2 + \beta_U^2}}\right] \tag{4-3} \]

4.4. Issues with the Equivalence Hypothesis Between β-Factor and Correlation Coefficient
The equivalence hypothesis between the β-factor in CCF modeling and the correlation coefficient has been applied in an ad hoc manner, with no rigorous proof or discussion. The distinctness between the β-factor and correlations was shown by M.
Pellissetti based on a numerical example constructed for specific seismic failure probabilities and correlation coefficients. Without loss of generality, an analytical approach based on a two-component system is presented here to examine the relationship between the β-factor and the correlation coefficient.
Consider a parallel system composed of two dependent components A and B. Given the occurrence of an earthquake with a certain ground motion intensity, the component states $X_A$ and $X_B$ take the value 1 (failure) or 0 (survival), with corresponding marginal failure probabilities $p_A$ and $p_B$, respectively. The joint probability $p_{AB}$ denotes the probability that both components fail, i.e., that the system state $X_A X_B$ equals 1. As such, $X_A$ and $X_B$ are random variables that follow a Bernoulli distribution with expectations (means) $p_A$ and $p_B$ and variances $p_A(1-p_A)$ and $p_B(1-p_B)$, respectively. Suppose the linear correlation coefficient of the seismic failures is $\rho$. Then the correlation is:

\[ \rho = \frac{\mathrm{Cov}(X_A, X_B)}{\sigma_A\,\sigma_B} = \frac{E[X_A X_B] - E[X_A]\,E[X_B]}{\sigma_A\,\sigma_B} = \frac{p_{AB} - p_A\,p_B}{\sqrt{p_A(1-p_A)\,p_B(1-p_B)}} \tag{4-4} \]

where $\mathrm{Cov}(X_A, X_B)$ is the covariance between the two state variables $X_A$ and $X_B$; $\sigma_A$ is the standard deviation of the state variable $X_A$; $\sigma_B$ is the standard deviation of the state variable $X_B$; $E[X_A]$ is the expectation (mean) of the state variable $X_A$; $E[X_B]$ is the expectation (mean) of the state variable $X_B$; and $E[X_A X_B]$ is the expectation (mean) of the system state variable $X_A X_B$. Therefore, the joint failure probability can be expressed as:

\[ p_{AB} = p_A\,p_B + \rho\,\sqrt{p_A(1-p_A)\,p_B(1-p_B)} \tag{4-5} \]

Consider identical components, as is commonly assumed in β-factor modeling. Then, for components A and B, $p_A = p_B = p$, and the joint failure probability in Equation (4-5) becomes:

\[ p_{AB} = p^2 + \rho\,p\,(1-p) \tag{4-6} \]

Also, from the β-factor model, in which a fraction β of each component's failure probability is attributed to the common cause, the joint failure probability is:

\[ p_{AB} = \beta\,p + (1-\beta)^2\,p^2 \tag{4-7} \]

Clearly Equations (4-6) and (4-7) are different.
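The difference between the two joint-probability expressions can be checked numerically. The sketch below is an illustration in Python, not the R code used in this paper. It assumes that the β-factor joint probability is the common-cause part β·p plus the product of the two independent parts, ((1 − β)·p)²; under that assumption, equating the two expressions gives a quadratic in β whose positive root is computed by `beta_matching_correlation`. The function names and the example values p = 0.1 and ρ = 0.5 are hypothetical.

```python
import math


def joint_from_correlation(p, rho):
    """Joint failure probability of two identical components with correlation rho."""
    return p * p + rho * p * (1.0 - p)


def joint_from_beta_factor(p, beta):
    """Joint failure probability under the assumed beta-factor form:
    common-cause part beta*p plus two independent failures ((1 - beta)*p)**2."""
    return beta * p + ((1.0 - beta) * p) ** 2


def beta_matching_correlation(p, rho):
    """Positive root of p*beta^2 + (1 - 2p)*beta - rho*(1 - p) = 0,
    i.e. the beta-factor that reproduces the correlation-based joint probability."""
    disc = (1.0 - 2.0 * p) ** 2 + 4.0 * p * rho * (1.0 - p)
    return (-(1.0 - 2.0 * p) + math.sqrt(disc)) / (2.0 * p)


p, rho = 0.1, 0.5  # hypothetical failure probability and correlation coefficient
beta = beta_matching_correlation(p, rho)

print(f"matching beta-factor           : {beta:.4f}")
print(f"joint probability (correlation): {joint_from_correlation(p, rho):.5f}")
print(f"joint probability (beta = rho) : {joint_from_beta_factor(p, rho):.5f}")
print(f"joint probability (matched)    : {joint_from_beta_factor(p, beta):.5f}")
```

Setting β equal to ρ gives a smaller joint failure probability than the correlation-based expression in this example, while the matched β-factor is larger than ρ.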
The analytical relationship between the correlation coefficient and the β-factor can be found by equating Equations (4-6) and (4-7):

$$P^2 + \rho P(1-P) = \beta P + (1-\beta)^2 P^2 \tag{4-8}$$

The analytical solution to Equation (4-8) provides the correct relationship between the β-factor and correlation coefficient methods:

$$\beta = \frac{(2P-1) + \sqrt{(2P-1)^2 + 4\rho P(1-P)}}{2P} \tag{4-9}$$

The relationship between the β-factor and the correlation coefficient varies with the correlation coefficient $\rho$ and the seismic failure probability $P$. Treating the β-factor and the correlation coefficient as equivalent may in principle lead to either conservative (i.e., $\rho > \beta$) or non-conservative (i.e., $\rho \le \beta$) results, given $\rho \in [0, 1]$, $\beta \in [0, 1]$, and $P \in (0, 1)$.

• Let us start by assuming that the correlation coefficient is greater than the actual β-factor (i.e., $\rho > \beta$), which can only be satisfied under the following conditions:

$$\rho > \frac{(2P-1) + \sqrt{(2P-1)^2 + 4\rho P(1-P)}}{2P}, \quad \rho \in [0, 1], \quad P \in (0, 1) \;\Rightarrow\; \text{null solution}$$

• On the other hand, let us assume that the correlation coefficient is no greater than the actual β-factor (i.e., $\rho \le \beta$), which can only be satisfied under the following conditions:

$$\rho \le \frac{(2P-1) + \sqrt{(2P-1)^2 + 4\rho P(1-P)}}{2P}, \quad \rho \in [0, 1], \quad P \in (0, 1) \;\Rightarrow\; \text{satisfied for every } \rho \in [0, 1],\ P \in (0, 1)$$

The results above indicate that $\rho \le \beta$ is always true. Since $\rho$ and $P$ are essentially functions of the ground motion, it can be concluded that the β-factor is never smaller than the correlation coefficient, regardless of the ground motion intensity. As such, setting the β-factor equal to the correlation coefficient would underestimate the dependent effects and lead to non-conservative estimates. Therefore, the equivalence hypothesis between the β-factor and the correlation coefficient is inappropriate.

4.5. Issues with the Reed-McCann Method

The Reed-McCann method was proposed by Reed et al. in 1985 to characterize the mutual correlation through the common sources of uncertainties within the fragility development, and ultimately to quantify the joint fragility by an analytical approach [110]. The literature lacks a rigorous mathematical discussion of the performance of the Reed-McCann method. This section discusses the Reed-McCann method and examines its quantitative procedure.

4.5.1. Reed-McCann Method

The Reed-McCann method is illustrated by the flowchart in Figure 4-2. The first step is to determine the common sources of uncertainties (i.e., epistemic and aleatory) and then numerically quantify the joint fragility by a two-stage process. In the first stage, the median capacities are sampled, for example by using the Latin Hypercube Sampling (LHS) method, considering the dependencies among epistemic uncertainties. In the second stage, the dependencies among aleatory uncertainties are addressed, and a multiple integration approach is used to compute the joint fragilities without directly using the correlation.

Figure 4-2: Flowchart of the Reed-McCann method

Consider an n-component group for illustration purposes. As discussed in Section 4.3, the fragility of component $j$ is described by the triplet $(A_m(j), \beta_U(j), \beta_R(j))$ in Equation (4-1), where $j = 1, 2, \ldots, n$. Therefore, the properties of all components can be represented as:

• Median capacity vector: $\vec{A}_m = (A_m(1), A_m(2), \ldots, A_m(n))$
• Epistemic uncertainty vector: $\vec{\beta}_U = (\beta_U(1), \beta_U(2), \ldots, \beta_U(n))$
• Aleatory uncertainty vector: $\vec{\beta}_R = (\beta_R(1), \beta_R(2), \ldots, \beta_R(n))$

Suppose the common uncertainties between each pair of components have been identified, where $j \ne k$, such that:

• Common epistemic uncertainty vector: $\vec{\beta}_U^{c}$ with entries $\beta_U^{c}(j,k)$, for the epistemic uncertainty between component j and component k.
• Common aleatory uncertainty vector: $\vec{\beta}_R^{c}$ with entries $\beta_R^{c}(j,k)$, for the aleatory uncertainty between component j and component k.

Therefore, the reduced uncertainties can be obtained by subtracting the common uncertainties that exist between the component of interest and the other relevant components.

• Reduced epistemic uncertainty vector: $\vec{\beta}_U^{*} = (\beta_U^{*}(1), \beta_U^{*}(2), \ldots, \beta_U^{*}(n))$, where $\beta_U^{*}(j) = \sqrt{\beta_U(j)^2 - \beta_U^{c}(j,k)^2}$ given $j \ne k$.
• Reduced aleatory uncertainty vector: $\vec{\beta}_R^{*} = (\beta_R^{*}(1), \beta_R^{*}(2), \ldots, \beta_R^{*}(n))$, where $\beta_R^{*}(j) = \sqrt{\beta_R(j)^2 - \beta_R^{c}(j,k)^2}$ given $j \ne k$.

Then the two-stage process is carried out to quantify the joint fragilities. The objective of the first stage is to sample the median capacities, which can be either independent or dependent.
If the median capacities are independent, there is no commonality between the epistemic uncertainties. The LHS simply consists of dividing the probability density function for each component into N equal-probability slices. For each component and each slice, a random number is generated with equal weight, and a median capacity is sampled from the lognormal distribution, bounded by the limits of the slice, with the median value taken from the median capacity vector $\vec{A}_m$ and the logarithmic standard deviation taken from the epistemic uncertainty vector $\vec{\beta}_U$.

When the median capacities are dependent, two steps are necessary to sample the correlated median capacities. The first step is called the independent step, in which the same process described for the independent case is carried out, but using a different bounded probability distribution defined by the reduced epistemic uncertainty vector $\vec{\beta}_U^{*}$ instead. The second step is the dependent step, in which the effect of dependency is considered by multiplying by correction factors. N correction factors are generated using the LHS for each common epistemic uncertainty, where the sampled distribution is lognormal with a median value of 1.0 and a logarithmic standard deviation taken from the common epistemic uncertainty vector $\vec{\beta}_U^{c}$. Then the components in each set that share the common dependency are scaled sequentially by the same corresponding correction factors. The procedure is repeated for each of the common groups of dependencies. After the scaling operation is completed, the N sets of combined median values reflect the inherent dependencies that exist in the median values.

After the N sets of median capacity values are generated, the dependencies between the aleatory uncertainties are incorporated in the second stage, and the joint fragility is then computed.

• If there is no commonality between the aleatory uncertainties, it is simple to convert the Boolean system equation to an algebraic equation.
By calculating the individual component failure probability at a series of ground acceleration values, the system fragility curve is obtained from the system algebraic probability equation.
• When dependencies exist between the aleatory uncertainties, the system fragilities can be quantified by a closed-form numerical integration procedure that incorporates the effects of correlated aleatory uncertainties. Because an exact approach is used, no assumption is needed to estimate higher-order terms.

4.5.2. Examination and Observations (Ref. 37)

The example problem presented by Reed is adopted here for examination purposes. There are three safety-related components (i.e., A, B and C) located in two structures. Components A and B have response dependencies since both components are situated in the same building near each other. Components A and C have high capacity dependence since they are the same component made by the same manufacturer. Components B and C are independent since they are different components located in different buildings. The component properties and the common uncertainties can be found in Ref. 37.

Table 4-1: Four cases for examination purpose

Case No.   Dependency               System Configuration          Description
1          Dependent components     Intersection of components    All three components fail
2          Dependent components     Union of components           At least one component fails
3          Independent components   Intersection of components    All three components fail
4          Independent components   Union of components           At least one component fails

The performance of the Reed-McCann method is examined by comparing the joint fragilities of four different cases computed with the method. The four cases are defined by the system configuration, with or without dependencies, as displayed in Table 4-1. It is worth noting that no complete comparison exists, although two sets of results are available for this example problem in the current literature.
The first set of results was provided by Reed [110] for all four cases, but only for one specific sample set out of the ten sample sets generated in total. The second set of results was generated and validated by Budnitz [71]. Only the two dependent cases (i.e., case 1 and case 2) are considered there, and the results are based on the same sample set used in Reed's results. Although Budnitz also provided results for ten equally weighted sample sets, those results are based on only one iteration.

For the sake of performance examination, a complete comparison study is presented: (a) the comparison of independent cases (i.e., case 3 and case 4); and (b) the comparison between dependent and independent cases (i.e., case 1 versus case 3; case 2 versus case 4). An R script developed as part of this study (see footnote 10) was used to implement the Reed-McCann method. The validity of this R script was confirmed in two ways: 1) the R script was used to calculate the joint fragility based on the same sample set of combined median values used by Reed, and the results from this study matched Budnitz's results perfectly and were also compared with Reed's results; 2) the R script was used to calculate the results for the ten equally weighted sample sets reported by Budnitz, and the results from this study matched Budnitz's results perfectly as well. This was done to test the accuracy of the R script.

Within the independent cases, the objective is to examine the performance by comparing the mean fragility curve with the composite fragility curve. The R script is used to compute the joint fragility of the independent cases based on ten sample sets as well. As displayed in Figure 4-3, the composite fragility curve overlaps with the mean fragility curve for both the intersection and union configurations, which is consistent with the common practice of using the composite curve as equivalent to the mean of the family of fragility curves.
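The overlap between the composite and mean curves can be illustrated for a single component. The sketch below is written in Python rather than R, and it assumes the double-lognormal fragility model with the confidence level Q treated as uniformly distributed on (0, 1). The median capacity and the two logarithmic standard deviations used in the usage lines are hypothetical values, not those of the example problem.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def fragility_at_confidence(a, median, beta_r, beta_u, q):
    """Fragility at ground motion a for confidence level q (0 < q < 1)."""
    z = (math.log(a / median) + beta_u * STD_NORMAL.inv_cdf(q)) / beta_r
    return STD_NORMAL.cdf(z)


def composite_fragility(a, median, beta_r, beta_u):
    """Composite curve using the total uncertainty sqrt(beta_r^2 + beta_u^2)."""
    beta_c = math.hypot(beta_r, beta_u)
    return STD_NORMAL.cdf(math.log(a / median) / beta_c)


def mean_fragility(a, median, beta_r, beta_u, n=2000):
    """Mean of the family of curves, averaging over n equally weighted
    confidence levels placed at the midpoints of n equal slices of (0, 1)."""
    levels = ((i + 0.5) / n for i in range(n))
    return sum(fragility_at_confidence(a, median, beta_r, beta_u, q) for q in levels) / n


if __name__ == "__main__":
    median, beta_r, beta_u = 0.9, 0.30, 0.35  # hypothetical component properties
    for a in (0.3, 0.9, 1.5):
        print(a, mean_fragility(a, median, beta_r, beta_u),
              composite_fragility(a, median, beta_r, beta_u))
```

Under these assumptions the two curves coincide analytically, so any printed difference reflects only the numerical averaging over confidence levels.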
Footnote 10: The multidimensional integration computation is done using the R package "cubature" [115].

Figure 4-3: Comparison between independent cases with fifty iterations (intersection & union)

The dependent case must also be compared with the independent one for the intersection and union configurations, respectively. As all the components are positively correlated, one component failure leads to an increased tendency for another component to fail. The dependency therefore reduces the failure probability of the union configuration (i.e., the either-or condition) and increases the failure probability of the intersection configuration. In other words, the ratio of the joint fragility of Case 1 (i.e., $F_1$) to the joint fragility of Case 3 (i.e., $F_3$) should always be greater than 1, and the ratio of the joint fragility of Case 2 (i.e., $F_2$) to the joint fragility of Case 4 (i.e., $F_4$) should always be less than 1. These patterns can be used as rules to judge the performance of the Reed-McCann method. Given a number of iterations with different random seeds, it is found that these patterns can be either violated or followed, depending on the initialization of the pseudorandom number generator. Therefore, a performance index $PI$ can be constructed, as shown in Equation (4-10), to track the performance of the Reed-McCann method. A value of $PI$ equal to one means perfect performance, and a value less than one means low performance:

$$PI = \frac{1}{N}\sum_{i=1}^{N}\left[\, I\!\left(F_1^{i} > F_3^{i}\right) I\!\left(F_2^{i} < F_4^{i}\right) \right] \tag{4-10}$$

where $N$ is the total number of iterations; $I(\cdot)$ is an indicator function that equals 1 if its condition holds and 0 otherwise; $F_1^{i}$ is the joint fragility of Case 1 in the $i$-th iteration; $F_2^{i}$ is the joint fragility of Case 2 in the $i$-th iteration; $F_3^{i}$ is the joint fragility of Case 3 in the $i$-th iteration; and $F_4^{i}$ is the joint fragility of Case 4 in the $i$-th iteration. Suppose ten tests are designed with ten different numbers of iterations N (i.e., 50, 100, 150, 200, 250, 300, 350, 400, 450 and 500).
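A performance index of this kind can be sketched as follows. The function is a hypothetical Python illustration, not the R script used in this study. It assumes that each iteration supplies one fragility curve per case on a common ground-acceleration grid, and that an iteration counts as consistent only if both expected orderings hold at every grid point. The numbers in the usage lines are made up.

```python
def performance_index(case1, case2, case3, case4):
    """Fraction of iterations in which both expected orderings hold.

    Each argument is a list with one fragility curve per iteration; a curve is
    a list of failure probabilities on a common ground-acceleration grid.
    Assumption: the ordering must hold at every grid point of an iteration.
    """
    runs = list(zip(case1, case2, case3, case4))
    if not runs:
        raise ValueError("at least one iteration is required")
    consistent = 0
    for f1, f2, f3, f4 in runs:
        # Intersection: the dependent case should exceed the independent case.
        intersection_ok = all(dep > ind for dep, ind in zip(f1, f3))
        # Union: the dependent case should fall below the independent case.
        union_ok = all(dep < ind for dep, ind in zip(f2, f4))
        consistent += int(intersection_ok and union_ok)
    return consistent / len(runs)


if __name__ == "__main__":
    # Two made-up iterations on a two-point grid.
    case1 = [[0.2, 0.5], [0.2, 0.5]]
    case3 = [[0.1, 0.4], [0.1, 0.4]]
    case2 = [[0.3, 0.6], [0.5, 0.6]]
    case4 = [[0.4, 0.7], [0.4, 0.7]]
    print(performance_index(case1, case2, case3, case4))
```

In the made-up data the second iteration violates the union ordering at the first grid point, so only one of the two iterations is counted as consistent.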
The performance index for each test is displayed in Figure 4-4. The value is far less than 1 for all the tests, which reflects the poor performance of the quantitative procedure of the Reed-McCann method.

Figure 4-4: Performance index with ten different numbers of iterations N

As an example, consider the results for seed number 500, which are shown in Figure 4-5. For the intersection configurations, the dependent and independent scenarios are consistent with the expected pattern: Case 1 is always higher than Case 3. For the union configurations, however, an inconsistent pattern is found between Case 2 and Case 4. At the low end of the fragility curve (i.e., less than 0.5g), the failure probability of the dependent case is higher than that of the independent case. This violates the rule that dependencies reduce the joint fragility of the union configuration. At the upper end of the ground motion range (i.e., greater than 0.5g), the pattern reverses and follows the rule. This observation reveals possible limitations of the Reed-McCann method.

Figure 4-5: Comparison between dependent and independent cases (intersection & union)

4.6. Conclusions

In this paper, the MUPRA and seismic dependencies were reviewed, and issues in the present methodologies for modeling seismic dependencies were highlighted. The equivalence hypothesis between the β-factor and the correlation coefficient was shown to be inappropriate. The weakness of the Reed-McCann method was examined through a comparison study, which showed that the Reed-McCann method cannot correctly characterize the contribution of dependencies. These findings are important and point to the need for improved methods for characterizing SSCs with shared features. Development of such improved methods is part of our current research for applications to external event MUPRAs that consider unit-to-unit dependencies.
Chapter 5: An Improved Multi-Unit Nuclear Plant Seismic Probabilistic Risk Assessment Approach

Footnote 11: The full text of this chapter has been published in the Journal of Reliability Engineering & System Safety, Volume 171, Pages 34-47, March 2018. https://doi.org/10.1016/j.ress.2017.11.015

5.1. Abstract

This paper proposes an improved approach to external event probabilistic risk assessment for multi-unit sites. It considers unit-to-unit dependencies based on the integration of the copula notion, importance sampling, and parallel Monte Carlo simulation, including their implementation on standard PRA software tools. The multi-unit probabilistic risk assessment (MUPRA) approach and issues related to the current methods for seismic dependencies modeling are discussed. The seismic risk quantification is discussed in the context of two typical numerical schemes: the discretization-based scheme and the simulation-based scheme. The issues related to the current discretization-based scheme are also highlighted. To address these issues and to quantify the seismic risk at the site level, an improved approach is developed to quantify the site-level fragilities. The approach is based on a hybrid scheme that involves the simulation-based method, to account for the dependencies among the multi-unit structures, systems and components (SSCs) at the group level of dependent SSCs, and the discretization-based scheme. Finally, a case study is developed for the seismic-induced Small Loss of Coolant Accident (SLOCA) for a hypothetical nuclear plant site consisting of two identical advanced (GEN-III) reactor units. The results from this case study summarize the effects of correlation across multiple reactor units on the site-level core damage frequency (CDF). Three multi-unit CDF metrics (site, concurrent and marginal) were calculated for this case study.
It is concluded that, based on correlations between the SSCs, the total site CDF metric would be the most appropriate multi-unit CDF metric for seismic risk.

5.2. Introduction

The Fukushima Daiichi accident has given greater urgency to the need for nuclear safety regulations that consider unit-to-unit dependencies [98]. Different facets relevant to multi-unit sites have been studied in the past few years, including correlated hazards, methodological challenges, site risk metrics, safety goals and associated quantitative health objectives. For instance, the recent works of international teams [36, 39-42] and the research groups in the U.S. [1, 20, 25, 31, 34, 37, 55, 81, 99], Japan [33, 57, 69, 72], South Korea [30, 100, 101], India [61, 102], Canada [60, 78], France [62], and China [63] have addressed multi-unit risks such as loss of offsite power and seismic events. Among these, seismic events have been recognized as the most significant natural event that induces unit-to-unit dependencies with significant consequences [33, 72, 56, 103, 104]. While seismic-induced failures are the focus of this paper, other random and dependent failures of SSCs involved in multiple reactor units have been addressed in the authors' previous research [25].

The occurrence of an earthquake imposes strong spatial correlations on structures, systems and components (SSCs), whether in the same or in different reactor units. Identical or similar SSCs will behave in analogous ways, tending to fail together due to the dependencies that arise from three types of sources: (1) ground motion similarity due to common earthquake sources, similar propagation paths and local conditions; (2) seismic demand similarity due to, but not limited to, similar construction guidelines and contractors; (3) seismic capacity similarity due to, but not limited to, similar SSC types and manufacturers. Typically, seismic PRA [105] (SPRA) evaluates the impacts of seismic-induced dependencies among SSCs.
The objective is to quantify the seismic risks by integrating the results of the seismic hazard analysis, seismic fragility evaluation and system models. There are two common numerical schemes: (1) the discretization-based scheme, which follows the standard quantification process of internal event PRA by approximating the continuous function with discretized sub-intervals; and (2) the simulation-based scheme, which uses simulation techniques to randomly sample from the continuous function.

Single-unit SPRAs have been developed for decades, and standards are available to help develop them. Methods are also available to extend single-unit SPRAs to multi-unit SPRAs [25]. However, the main remaining challenge is a method to appropriately specify the degree of dependency and incorporate these dependent effects into the seismic MUPRA. Nearly all current SPRAs use the same seismic hazard curve for all SSCs located on the same site. This study follows this practice too, but recognizes the need for methods to consider the ground motion dependencies across nuclear reactor units.

Clearly, seismic-induced failures of SSCs depend on the magnitude of the ground motion resulting from the earthquake. Historical failure data can be used to estimate dependency as long as the failed SSCs are of similar types and are under equal seismic loading. However, useful data are extremely scarce for estimating all kinds of seismic dependencies. In general, one must assume either that the seismic failures are fully independent or that such failures are perfectly dependent, both of which are inaccurate, with the truth lying somewhere in between. There is no common agreement on how to account for dependencies, but from the present literature the dependency methods can be grouped into four main approaches:

1) Follow the traditional parametric methods used for modeling the common cause failure (CCF) in the internal events PRAs.
For the case of partial dependencies, engineering judgment should be used, which could lead to either conservative or non-conservative results. An example of this approach involves evaluation of the seismic fragility of the heat transport system of a liquid metal fast breeder reactor considering partial correlation based on engineering judgment [106].

2) Incorporate the dependencies by means of the linear correlation coefficient between seismic failures. In the absence of statistical evidence or engineering experience, one might resort to analytical and simulation techniques. The first study using this approach dates to the Seismic Safety Margins Research Program (SSMRP) [107] in the 1980s. Since the SSMRP approach requires extensive analysis of multiple time histories and simulation efforts, a set of rules [108] was later developed to consider dependencies globally so as to simplify the SSMRP approach.

3) Incorporate the dependencies by means of correlation of the ground acceleration capacity. Note that the dependencies possess various attributes considered in the existing fragility development [109]. The essential idea is that grouping these attributes can suggest a partial correlation between SSCs within the same or different classes. Therefore, it is practical to further account for full or partial correlation between SSCs. This idea was originally proposed by Reed et al. [110], and is referred to as the Separation-of-Variable approach. It is also known as the Reed-McCann method in the literature when the Separation-of-Variable approach is used with a two-stage process to numerically quantify the joint fragility. The correlation of ground acceleration capacities is treated as constant, regardless of the magnitude of ground motion, which is different from the correlation of seismic failures, which usually increases with larger ground motion. Recent work by Budnitz et al.
[71] considered the Reed-McCann method a suitable approach to address partial correlation, where SSCs are neither fully correlated nor entirely independent in seismic events. Interested readers can find more details in Reference 37 on how to determine the actual dependencies given the similarities in response and capacity factors.

4) Combined methodologies have also been introduced to take advantage of specific features of the three approaches above. They usually determine the relationship between the CCF parametric estimation and the correlation coefficient, so as to estimate CCF models (e.g., the β-factor). For example, the value of the β-factor can be set equal to the correlation of failure occurrences [111]. Pellissetti et al. [112] treated the seismic-induced dependencies by integrating the correlation models developed in NUREG-1150 [108] into the fault tree based seismic PRA with the β-factor model.

However, several issues exist with these methods. First, the underlying features of the dependencies can be altered by different interpretations of seismic hazard, fragility development and reactor design research results. Hence, further work is needed to check the validity of applying the SSMRP rules to current nuclear power plants. Second, the above approaches inappropriately incorporate the effects of seismic dependencies, given the features of dependencies available. Zhou et al. [73] have identified two main issues: (1) the equivalence hypothesis between the β-factor and the correlation coefficient is inappropriate; and (2) the Reed-McCann method is limited in its ability to correctly characterize the contribution from dependencies. These open issues underline the need to develop an improved method for characterizing dependencies in the multi-unit structures, systems and components with shared features and their links in the MUPRAs.
To address these issues, this paper proposes an improved hybrid approach to quantify unit-to-unit dependencies in the external event MUPRAs. In this approach, the seismic-induced dependencies among the correlated SSCs can be properly considered at the group level using the simulation-based scheme, which integrates the copula notion [116], importance sampling [117] and parallel Monte Carlo simulation [118]. Further, the discretization-based scheme in the proposed approach allows for the use of standard PRA software tools to determine the site-level fragilities and allows transfer of the results from level-1 PRA to level-2 PRA. As such, this approach enhances the treatment of dependent SSC reliability under common seismic loading for application to site-level MUPRA, and achieves a balance between risk estimation accuracy and computational simplicity in two ways. First, the estimation accuracy is assured by the improved characterization of seismic-induced dependencies when compared to the Reed-McCann method, and by a justified selection of the reference ground motion level in the discretization-based scheme. Second, the computational simplicity is accomplished by the hybrid scheme: for any change made in the MUPRA model, only the affected correlated groups or individual SSCs need to be modified to reflect the resulting changes in the risk estimates. This is more practical and efficient than the conventional simulation-based scheme, in which the whole system must be reconfigured in accordance with the required changes. This approach also ensures scalability of the MUPRA model, especially when dealing with complex seismic MUPRA models.

This paper is organized as follows. Section 5.3 discusses the common numerical schemes for seismic risk quantification and highlights the issues associated with the current use of the discretization-based scheme.
Section 5.4 discusses the proposed framework to model the seismic-induced dependencies and quantify the joint fragility. In Section 5.4.1, a method is presented to quantify the joint fragilities at the group level based on the copula notion and importance sampling. To derive the site-level fragility, the proposed discretization-based scheme is discussed in Section 5.4.2, which as a matter of practicality makes use of standard PRA software tools such as SAPHIRE [119]. Finally, a case study is developed in Section 5.5 for a seismic-induced small LOCA for a hypothetical nuclear power site consisting of two identical advanced (GEN-III) reactor units. The risk metric for the MUPRA is defined in terms of the site CDF, and three types of multi-unit CDF metrics are discussed. Results obtained from these activities are discussed to set the basis for studies of the impacts of seismic-induced dependencies on multi-unit risk. Section 5.6 provides conclusions and recommendations.

5.3. An Overview of Seismic Risk Quantification Methods

This section discusses the common numerical schemes for performing seismic risk quantification. The objective is to estimate the CDF by integrating the results of the seismic hazard analysis, seismic fragility evaluation, and system models (i.e., seismic event trees or structure functions leading to core damage sequences). To make the discussion self-contained, we follow with a review of the seismic hazard analysis and seismic fragility evaluation.

5.3.1. Seismic Hazard Analysis

Probabilistic seismic hazard analysis (PSHA) is the standard approach for characterizing the seismic hazard at each nuclear power site. The results are usually expressed by seismic hazard curves indicating the annual exceedance frequency in terms of a series of ground motion parameters, such as the commonly used Peak Ground Acceleration (PGA).
Figure 5-1 shows an example of typical seismic hazard curves, including the median, mean, 15% confidence, and 85% confidence curves. The uncertainty expressed in this exceedance frequency is attributed to both aleatory uncertainty and epistemic uncertainty.

The seismic hazard curve can be used in two different ways in the risk quantification process. The simpler way is to approximate it by a finite number of discrete intervals. Such discretization simplifies the computation and allows the process to follow the quantification procedure of the standard internal-event PRAs. The more complicated way is to use the power law relationship between the annual exceedance frequency and the ground motion parameter, as expressed by Equation (5-1). In this way, the seismic hazard curves, which are typically close to log-linear, can be properly represented [120-123]:

$$H(IM) = k_0 \,(IM)^{-k} \tag{5-1}$$

where $IM$ is the ground motion intensity in units of the gravitational acceleration, g, $H(IM)$ is the annual exceedance frequency of a ground motion intensity $IM$, and $k_0$ and $k$ are empirical constants.

Figure 5-1: Example seismic hazard curves [105]

5.3.2. Seismic Fragility Evaluation

The seismic fragility is defined as the conditional failure probability given an earthquake. The state of each SSC is characterized by a binary outcome (i.e., fail or survive), depending on the global ground motion intensity (e.g., peak ground acceleration), the ground acceleration capacity and the associated uncertainties. The ground motion capacity of each SSC is denoted as follows:

$$A = A_m \, \varepsilon_R \, \varepsilon_U \tag{5-2}$$

where $A$ is the ground acceleration capacity, $A_m$ is the median ground acceleration capacity, and $\varepsilon_R$ and $\varepsilon_U$ are random variables representing the aleatory uncertainty and the epistemic uncertainty of the median estimate, respectively. Both $\varepsilon_R$ and $\varepsilon_U$ are usually assumed to be lognormally distributed with unit medians and logarithmic standard deviations of $\beta_R$ and $\beta_U$, respectively.
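The capacity model of Equation (5-2) can be sampled directly. The sketch below is a hypothetical Python illustration, assuming the aleatory and epistemic factors are independent lognormal variables with unit medians. The median capacity and logarithmic standard deviations in the usage lines are made-up values. Under these assumptions the sampled capacities have a median close to the median capacity, and their logarithms have a standard deviation close to the square root of the sum of the two squared logarithmic standard deviations.

```python
import math
import random
import statistics


def sample_capacity(median, beta_r, beta_u, rng):
    """Draw one ground acceleration capacity A = A_m * eps_R * eps_U."""
    eps_r = rng.lognormvariate(0.0, beta_r)  # aleatory factor, unit median
    eps_u = rng.lognormvariate(0.0, beta_u)  # epistemic factor, unit median
    return median * eps_r * eps_u


def summarize(median, beta_r, beta_u, n=20000, seed=1):
    """Return the sample median of A and the sample standard deviation of ln(A)."""
    rng = random.Random(seed)
    logs = [math.log(sample_capacity(median, beta_r, beta_u, rng)) for _ in range(n)]
    return math.exp(statistics.median(logs)), statistics.stdev(logs)


if __name__ == "__main__":
    # Hypothetical component: median capacity 1.2 g, beta_R = 0.3, beta_U = 0.4.
    sample_median, log_std = summarize(1.2, 0.3, 0.4)
    print("sample median capacity (g):", sample_median)
    print("sample log standard deviation:", log_std)
    print("sqrt(beta_R^2 + beta_U^2):", math.hypot(0.3, 0.4))
```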
A failure is considered to occur if the ground motion intensity exceeds the associated ground acceleration capacity. The fragility function is typically characterized by the cumulative lognormal distribution of Equation (5-3):

$$F(a \mid Q) = \Phi\left[\frac{\ln(a/A_m) + \beta_U \, \Phi^{-1}(Q)}{\beta_R}\right] \tag{5-3}$$

where $\Phi$ is the cumulative standard normal distribution, $a$ is the ground motion intensity of the earthquake, $A_m$ is the median ground acceleration capacity, $\Phi^{-1}$ is the inverse cumulative standard normal distribution, and $Q$ is the desired non-exceedance probability or confidence. Fragility curves express the failure probability of the SSCs due to earthquakes as a function of ground motion intensity. An example of typical fragility curves is shown in Figure 5-2, including the median, 5% confidence, and 95% confidence curves.

An alternative and commonly used form of the fragility function is expressed in terms of the total uncertainty, $\beta_C$, as shown in Equation (5-4), which generates the so-called composite curve displayed in Figure 5-2. This composite curve is often used as equivalent to the mean curve of the family of fragility curves.

$$F(a) = \Phi\left[\frac{\ln(a/A_m)}{\beta_C}\right] = \Phi\left[\frac{\ln(a/A_m)}{\sqrt{\beta_R^2 + \beta_U^2}}\right] \tag{5-4}$$

Figure 5-2: Example fragility curves [114]

5.3.3. Numerical Schemes

Given the seismic hazard curve and the seismic fragility curve, there are two common numerical schemes used to execute the risk estimation noted in Section 5.2. The first scheme is the discretization-based scheme, which follows the standard quantification process of the internal event PRA by approximating the seismic hazard curve with discrete sub-intervals. As such, the final seismic risk estimate is expressed by a finite number of doublets, one for each discrete sub-interval. The computational effort can be considerably reduced with the aid of standard PRA software tools, but uncertainty analysis cannot be supported.
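The discretization-based scheme can be sketched for a single SSC as follows. This is a minimal Python illustration under stated assumptions: a power-law hazard curve, a composite lognormal fragility curve, and hypothetical values for the hazard constants, fragility parameters and bin edges. The reference ground motion level within each sub-interval is left as a parameter (`"lower"`, `"geometric"` or `"upper"`, names introduced here), because the result of the scheme depends on that choice.

```python
import math
from statistics import NormalDist


def hazard(a, k0, k):
    """Power-law annual exceedance frequency at ground motion a."""
    return k0 * a ** (-k)


def fragility(a, median, beta_c):
    """Composite lognormal fragility at ground motion a."""
    return NormalDist().cdf(math.log(a / median) / beta_c)


def log_spaced_edges(a_min, a_max, n_bins):
    """Bin edges equally spaced in ln(a) between a_min and a_max."""
    ratio = (a_max / a_min) ** (1.0 / n_bins)
    return [a_min * ratio ** i for i in range(n_bins + 1)]


def discretized_failure_frequency(edges, k0, k, median, beta_c, reference="upper"):
    """Sum over sub-intervals of (occurrence frequency) x (fragility at reference level)."""
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        bin_frequency = hazard(lo, k0, k) - hazard(hi, k0, k)
        if reference == "lower":
            ref = lo
        elif reference == "upper":
            ref = hi
        elif reference == "geometric":
            ref = math.sqrt(lo * hi)
        else:
            raise ValueError("reference must be 'lower', 'geometric' or 'upper'")
        total += bin_frequency * fragility(ref, median, beta_c)
    return total


if __name__ == "__main__":
    k0, k = 1.0e-4, 2.5                       # hypothetical hazard constants
    median, beta_c = 1.0, 0.4                 # hypothetical fragility parameters
    edges = [0.1, 0.3, 0.5, 0.8, 1.2, 2.0]    # hypothetical sub-intervals (g)
    for ref in ("lower", "geometric", "upper"):
        print(ref, discretized_failure_frequency(edges, k0, k, median, beta_c, ref))
```

Because the fragility increases with ground motion, the lower-limit and upper-limit choices bracket the result for any intermediate reference level, and the bracket narrows as the sub-intervals are refined.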
The second scheme is the simulation-based scheme, which uses simulation techniques such as Monte Carlo simulation and Latin Hypercube Sampling to randomly sample from the continuous function of seismic hazard and then combine it with the fragility of individual SSCs. As such, the final seismic risk estimate is determined by a large number of trials. Insights into the uncertainties are conveniently obtained, but specific software tools are required to implement such a scheme.

The numerical scheme should be chosen carefully when dealing with seismic-induced dependencies. For instance, the system failure probability is approximated by the minimal cut-set method in the SSMRP. A Monte Carlo simulation based approach (called DQFM) was developed by Watanabe et al. [124] to directly quantify the fault trees, which also demonstrated that using the minimal cut-set method would overestimate the CDF. Monte Carlo simulation is a general method and can more appropriately consider the effects of dependencies. The effect and efficiency of the above methods for assessment of seismic-induced core damage were investigated by Uchiyama et al. [125-127]. On the other hand, the analytical approach proposed in the Reed-McCann method is based on a Latin Hypercube Sampling (LHS) approach [110]. A comparison between the LHS approach and the traditional Monte Carlo simulation is presented by Ravindra et al. [128].

It should be noted that some simplifications are needed in the discretization-based scheme. For instance, the reference level of ground motion needs to be selected for each discrete sub-interval. Due to the importance of uncertainties from hazard and fragility analysis, the simplification should be carefully evaluated. However, there is no standard guideline for such discretization, and different practices have been suggested in the selection of the reference ground motion level.
For example, the IAEA recommends choosing the upper limit of the subintervals, which typically leads to conservative estimates, so as to be certain that no significant contributions to the assessed probability of core damage are omitted [129]. Other studies use the geometric mean of the two bin range limits for each subinterval [130, 131]. It has not been established whether using the geometric mean leads to conservative or non-conservative insights; Section 5.4.2 shows that the geometric mean of the two bin limits is not a suitable choice.

5.4. Proposed Hybrid Methodology with Demonstration

This section discusses the proposed approach to quantify the site-level fragility; the approach integrates the mean seismic hazard curve with the mean fragility curve [120]. This hybrid scheme takes advantage of the simulation-based scheme to account for the dependencies at the group level, as presented in Section 5.4.1, and then uses the discretization-based scheme to quantify the seismic risk at the site level, as presented in Section 5.4.2. It is assumed that generic fragilities are used and that the correlation or dependent features are provided by the seismic fragility analysts by separating the common sources of uncertainties among the SSCs of interest. All the simulations are performed using the open-source language and computing environment R [113].

5.4.1. Seismic Risk Quantification at the Group Level

The proposed simulation-based scheme is shown by the flowchart in Figure 5-3. First, an importance sampling method is used to tackle the ground motion intervals as discussed in Section 5.4.1.1, which allows the propagation of uncertainty in the seismic hazard curve. Second, the copula notion is applied to construct the joint distribution of the ground acceleration capacity for the components with shared features.
The constructed joint distribution is then used to randomly simulate correlated ground acceleration capacities as discussed in Section 5.4.1.2. Then, the seismic risk at the group level is estimated as discussed in Section 5.4.1.3. All the random sample sets are used to estimate the seismic risk of a single component or a system consisting of multiple components, where the seismic risk is usually characterized by the failure frequency or the conditional failure probability, referred to as fragility, as noted in Section 5.3.2. A three-component example is demonstrated in Section 5.4.1.4 to derive the joint fragility of a correlated group.

Figure 5-3: Flowchart of the simulation-based scheme at the group level

5.4.1.1. Seismic Hazard Sampling

It is advantageous to exploit the power law relationship (i.e., seismic hazard function) shown in Equation (5-1) when relating ground motion intensity to seismic ground acceleration capacity. This makes it possible to assess the risks while propagating the uncertainty in the seismic hazard curve. The important aspect of these hazard curves is the exponent k in Equation (5-1), which represents the steepness of the hazard curve. Kennedy [120] recommended that the values of k range from 1.66 to 3.32 for eastern and central U.S. sites, and 2.84 to 4.11 for California sites. Baker [123] suggested that k=2 and k=3 represent typical shapes of the spectral acceleration hazard curves observed in seismically active parts of the U.S. It should also be noted that the hazard function $H(a)$ is usually not normalized, and it is not straightforward to sample directly from $H(a)$. Even with the normalized version of $H(a)$, most samples would be simulated in the regions of small magnitude. However, seismic risk contributions from small-magnitude events are insignificant compared to large-magnitude events.
Accordingly, it is preferable to simulate from an alternate distribution that has a high probability of producing samples from the regions of large magnitude, so as to improve the efficiency of covering the important ground motion intervals and the resulting seismic risk estimates [132]. Therefore, it is proposed to apply the self-normalized importance sampling technique, which simulates the samples from an alternate probability density function $g(a)$ called the proposal distribution [117]. In the current study, the truncated normal distribution is used as the proposal distribution within the ground motion interval of interest. The mean is set equal to the upper limit of the ground motion interval and the standard deviation is set equal to the range of the ground motion interval.

5.4.1.2. Correlated Ground Acceleration Capacity Sampling

Composite correlations are used to tackle the mutual dependencies among the ground acceleration capacities of SSCs. It would initially be desirable to follow the conventional fragility development to separate the uncertainties and then determine the common sources within epistemic and aleatory uncertainties. For example, the Reed-McCann method treats the correlation of epistemic uncertainty and the correlation of aleatory uncertainty separately. However, it is usually difficult and even impractical to separately identify the common randomness and uncertainty. Furthermore, Kennedy [120] shows that this separation is unnecessary, since it increases the complexity and does not improve the accuracy of the seismic risk estimate. The sensitivity study of Huang et al. [133, 134] led to observations similar to Kennedy's [120]. Therefore, treating the composite correlation should be a more practical and efficient approach that does not compromise accuracy. The copula notion is applied to quantify the dependencies among the ground acceleration capacities of SSCs.
One of the main aims of this paper is to introduce the copula as a generic tool for dependency modeling for both individual and combined external hazards. The copula is a powerful technique for modeling and simulating the features of a large-scale joint distribution from separate marginal distributions. The essential idea of a copula is described in Sklar's theorem, which states that any multivariate distribution can be modeled using arbitrary univariate marginal distribution functions and a copula describing the linear or non-linear dependencies among the variables [116]. Applications of copulas include system health management such as system performance evaluation [135] and prognostics [136]; natural hazard risk modeling including seismic [137], flooding [138], and slope collapses [139]; decision-making support such as event tree analysis [140]; reliability-based design optimization [141]; the use of expert judgments [142]; and model uncertainty assessment [143]. However, the use of copulas in the risk assessment of nuclear plants has been very limited. The earliest study dates to Yi et al. [144], in which a copula is applied to the precursor analysis at a nuclear power plant site. Another study is presented by Kelly [145], who uses a copula-based approach for CCF analysis.

Suppose there is a correlated group of n components where the ground acceleration capacity of the $i$-th component is represented by the random variable $A_i$ ($i = 1, \dots, n$). The marginal distribution of each capacity is denoted by $F_i(a_i)$ ($i = 1, \dots, n$). By applying the probability integral transform, one can generate a uniform random vector $(U_1, \dots, U_n) = (F_1(A_1), \dots, F_n(A_n))$, where each $U_i$ is a uniform random variable over the unit interval [0, 1].
According to Sklar's theorem, the joint distribution function $F(a_1, \dots, a_n)$ and the marginal distribution functions $F_i(a_i)$ ($i = 1, \dots, n$) can be connected using the copula $\mathcal{C}(\cdot)$ as shown in Equation (5-5):

$$F(a_1, \dots, a_n) = \mathcal{C}\big(F_1(a_1), F_2(a_2), \dots, F_n(a_n)\big) = \mathcal{C}(u_1, \dots, u_n) \tag{5-5}$$

With the constructed multivariate distribution, one can then generate pseudo-random samples $(u_1, \dots, u_n)$ from the copula distribution $\mathcal{C}(u_1, \dots, u_n)$. Given a realization from the joint distribution, the correlated samples can be obtained by applying the inverse transform method at each marginal distribution, $a_i = F_i^{-1}(u_i)$.

There are two common types of copulas [116]: elliptical copulas (e.g., Gaussian copulas) and Archimedean copulas (e.g., Frank, Clayton, and Gumbel copulas), which differ in how they model tail dependency. The elliptical copulas are restricted to symmetrical tail dependencies. Specifically, the elliptical copulas are the copulas of elliptical distributions [146] and provide useful examples of multivariate distributions because they share many of the tractable properties of the elliptical distributions. The Gaussian copula is the copula that underlies the multivariate normal distribution. It shares the same dependency structure as the multivariate normal distribution, uses the pairwise Pearson correlation coefficient to measure dependency, and allows arbitrary marginal distributions for the uncertainties. The t-copula [116] presents symmetric and positive upper and lower tail dependencies, which indicates a tendency for the t-copula to generate joint extreme events. As the number of degrees of freedom increases, the t-copula converges to the Gaussian copula [116]. For a limited number of degrees of freedom, however, the behaviors of the two copulas are quite different. Archimedean copulas model upper tail dependency, lower tail dependency, or both, so that they provide additional flexibility to describe the behavior of tail dependency in realistic situations.
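The sampling route described above (draw from the copula, then invert each margin) can be illustrated for two components with a Gaussian copula. This is a sketch under stated assumptions rather than the author's R implementation: the helper names `sample_pair` and `pearson` are hypothetical, the lognormal margins (1.10 g and 1.90 g medians, 0.46 logarithmic standard deviation) and the correlation of 0.8 are illustrative inputs, and only the Python standard library is used.

```python
import random
from math import exp, log, sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def sample_pair(rho, margins, rng):
    """Draw one correlated capacity pair through a Gaussian copula.

    margins: two (median, log_std) tuples describing lognormal marginals.
    """
    # Step 1: correlated standard normals (2x2 Cholesky factor written out).
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Step 2: probability integral transform to the copula scale (0, 1).
    # The clamp only guards inv_cdf against an exact 0 or 1.
    u = [min(max(PHI.cdf(z), 1e-12), 1.0 - 1e-12) for z in (z1, z2)]
    # Step 3: inverse transform at each lognormal margin, a_i = F_i^{-1}(u_i).
    return [m * exp(b * PHI.inv_cdf(ui)) for ui, (m, b) in zip(u, margins)]


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(42)
margins = [(1.10, 0.46), (1.90, 0.46)]  # (median in g, log std), illustrative
samples = [sample_pair(0.8, margins, rng) for _ in range(20000)]
log_a1 = [log(s[0]) for s in samples]
log_a2 = [log(s[1]) for s in samples]
print(f"correlation of log-capacities: {pearson(log_a1, log_a2):.3f}")
```

Going through the uniform scale explicitly is redundant for lognormal margins, but it keeps the three steps visible and lets any other invertible marginal distribution be substituted in step 3.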
The current state of practice of seismic PRA assumes a lognormal distribution for the ground acceleration capacity. It is noted that non-parametric approaches [147] have been applied to establish fragility curves without any distributional assumption, and the results indicate that the classical lognormal assumption is not appropriate. As a generic tool for dependency modeling, the copula notion is promising for handling dependencies among non-normal random variables in future work. Consistent with the current state of practice, this research assumes linear Pearson correlations for modeling SSC dependencies. Hence, the Gaussian copula is adopted to construct the multivariate distribution of the correlated ground acceleration capacities. Consider the joint cumulative distribution function $\Phi_\Sigma$ of a multivariate normal distribution with correlation matrix $\Sigma$, where $\Phi$ is the cumulative distribution function of a standard normal variable. Note that the Gaussian copula is naturally parameterized in terms of the correlation matrix $\Sigma$. A Gaussian copula with correlation matrix $\Sigma$ is defined in Equation (5-6):

$$\mathcal{C}_\Sigma(u_1, \dots, u_n) = \Phi_\Sigma\big(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \dots, \Phi^{-1}(u_n)\big) \tag{5-6}$$

Given the constructed copula, one can then generate the sample sets of the correlated ground acceleration capacities. This is done using an existing copula package in R [148].

5.4.1.3. Estimation of Seismic Risk at the Group Level

All the sample sets of ground motion intensity and ground acceleration capacities are used to estimate the seismic risk by the fraction of sample sets in which ground acceleration capacities are less than or equal to the ground motion intensity. Suppose we are interested in the ground motion interval from $a_L$ to $a_U$.
In the $j$-th sample set, the state of the $i$-th component is represented by the state variable $X_{ij}$, which is determined by the ground acceleration capacity $A_{ij}$ and the ground motion intensity $a_j$ as shown in Equation (5-7):

$$X_{ij} = I(A_{ij} \le a_j) \tag{5-7}$$

where $I(\cdot)$ is the state indicator function, which equals 1 for component failure when $A_{ij}$ does not exceed $a_j$, and otherwise equals 0, indicating component survival (i.e., $A_{ij} > a_j$). Within the N sample sets, the fraction of times that $X_{ij} = 1$ represents the seismic fragility of the $i$-th component. The fragility of the $i$-th component is given by Equation (5-8) based on the realizations of the ground motion and ground acceleration capacity:

$$P_f^{(i)} = \frac{\frac{1}{N}\sum_{j=1}^{N} I(A_{ij} \le a_j)\,\frac{H(a_j)}{g(a_j)}}{\frac{1}{N}\sum_{j=1}^{N} \frac{H(a_j)}{g(a_j)}} = \frac{\sum_{j=1}^{N} I(A_{ij} \le a_j)\,\frac{H(a_j)}{g(a_j)}}{\sum_{j=1}^{N} \frac{H(a_j)}{g(a_j)}} \tag{5-8}$$

where $P_f^{(i)}$ is the conditional failure probability of the $i$-th component, $H(a)$ is the seismic hazard function in Equation (5-1) that characterizes the ground motion intensity of interest, $g(a)$ is the probability density function of the proposal distribution (i.e., truncated normal distribution) discussed in Section 5.4.1.1 to generate the ground motion intensity, N is the size of the sample sets, $A_{ij}$ is the sample of the ground acceleration capacity of the $i$-th SSC, and $a_j$ is the sample of the ground motion intensity. Note that the self-normalized importance sampling technique is applied because $H(a)$ is a non-normalized function.

As for system failures, the system state is usually configured in terms of the intersections and/or unions of the component states. Based on the PRA model logic, it is straightforward to derive the structure function $\phi(\cdot)$ that represents the system state in terms of the component states. Hence the conditional system failure probability $P_f^{sys}$ is given by Equation (5-9) based on the fraction of times that $\phi(X_{1j}, \dots, X_{nj}) = 1$ within the N sample sets.
$$P_f^{sys} = \frac{\sum_{j=1}^{N} I\big(\phi(X_{1j}, \dots, X_{nj}) = 1\big)\,\frac{H(a_j)}{g(a_j)}}{\sum_{j=1}^{N} \frac{H(a_j)}{g(a_j)}} \tag{5-9}$$

If one is interested in the failure frequency of a single component or of a system consisting of multiple components, the normalization constant C within the ground motion interval of interest $[a_L, a_U]$ is needed, where $C = \int_{a_L}^{a_U} H(a)\,da$. In doing so, the frequency of failure of the $i$-th component, $\lambda^{(i)}$, can be calculated from Equation (5-10) by scaling Equation (5-8) by the constant C.

$$\lambda^{(i)} = C \cdot P_f^{(i)} = \int_{a_L}^{a_U} H(a)\,da \cdot \frac{\sum_{j=1}^{N} I(A_{ij} \le a_j)\,\frac{H(a_j)}{g(a_j)}}{\sum_{j=1}^{N} \frac{H(a_j)}{g(a_j)}} \tag{5-10}$$

The frequency of system failure can be obtained in Equation (5-11) by multiplying Equation (5-9) by the normalization constant C:

$$\lambda^{sys} = C \cdot P_f^{sys} = \int_{a_L}^{a_U} H(a)\,da \cdot \frac{\sum_{j=1}^{N} I\big(\phi(X_{1j}, \dots, X_{nj}) = 1\big)\,\frac{H(a_j)}{g(a_j)}}{\sum_{j=1}^{N} \frac{H(a_j)}{g(a_j)}} \tag{5-11}$$

5.4.1.4. Example of Joint Fragility Assessment

This section presents the application of the proposed approach to the three-component example in Reed et al. [110]. There are three safety-related components (i.e., A, B and C) located in two structures. Components A and B have response dependencies since both components are situated in the same building near each other. Components A and C have high capacity dependence since they are the same component made by the same manufacturer. Components B and C are independent since they are different components located in different buildings. Given the component properties and the common uncertainties found in Reed et al. [110], the composite correlation matrix between the components is derived as shown in Table 5-1.

Table 5-1: Composite correlation between the components

|   | A    | B    | C    |
|---|------|------|------|
| A | --   | 0.42 | 0.53 |
| B | 0.42 | --   | 0    |
| C | 0.53 | 0    | --   |

Suppose the marginal probability distribution of the ground acceleration capacity of the $i$-th component is denoted by $F_i(a_i)$ and the joint distribution of the component capacities is $F(a_A, a_B, a_C)$. Specifically, the marginal distribution for each component is lognormal with median $A_{m,i}$ and logarithmic standard deviation $\beta_{C,i}$. According to Sklar's theorem, a copula $\mathcal{C}(\cdot)$ can be constructed to describe the dependencies among the marginal distributions $F_A$, $F_B$ and $F_C$.
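The group-level estimator of Equations (5-8) and (5-9) can be sketched for a three-component group with the composite correlations of Table 5-1. The following is an illustration under assumptions, not the original R code: the medians, logarithmic standard deviations, hazard exponent `k`, and PGA interval are hypothetical; the Gaussian copula with lognormal margins is sampled through a hand-written Cholesky factor; and the truncated-normal proposal uses the upper limit as its mean and the interval range as its standard deviation, as described in Section 5.4.1.1.

```python
import random
from math import exp, sqrt

# Composite correlation matrix for components A, B, C (Table 5-1).
CORR = [[1.00, 0.42, 0.53],
        [0.42, 1.00, 0.00],
        [0.53, 0.00, 1.00]]
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def cholesky(m):
    """Lower-triangular factor `low` such that low * low^T equals m."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = sqrt(m[i][i] - s) if i == j else (m[i][j] - s) / low[j][j]
    return low


def sample_pga(lo, hi, rng):
    """Truncated-normal proposal via rejection: mean = hi, sd = hi - lo."""
    while True:
        a = rng.gauss(hi, hi - lo)
        if lo <= a <= hi:
            return a


def weight(a, lo, hi, k):
    """H(a) / g(a) up to constant factors, which cancel after self-normalization."""
    hazard = a ** (-k)
    proposal = exp(-0.5 * ((a - hi) / (hi - lo)) ** 2)
    return hazard / proposal


def joint_fragility(lo, hi, medians, betas, corr, k=2.0, n=20000, seed=1):
    """Self-normalized estimates of P(all fail) and P(any fails) within [lo, hi]."""
    rng = random.Random(seed)
    low = cholesky(corr)
    w_sum = w_all = w_any = 0.0
    for _ in range(n):
        a = sample_pga(lo, hi, rng)
        w = weight(a, lo, hi, k)
        e = [rng.gauss(0.0, 1.0) for _ in medians]
        z = [sum(low[i][j] * e[j] for j in range(i + 1)) for i in range(len(e))]
        # Gaussian copula with lognormal margins: A_i = A_m,i * exp(beta_i * z_i).
        failed = [m * exp(b * zi) <= a for m, b, zi in zip(medians, betas, z)]
        w_sum += w
        w_all += w if all(failed) else 0.0  # intersection (3/3) structure function
        w_any += w if any(failed) else 0.0  # union (1/3) structure function
    return w_all / w_sum, w_any / w_sum


medians, betas = [0.9, 1.0, 1.1], [0.4, 0.4, 0.4]  # hypothetical properties
dep_all, dep_any = joint_fragility(0.8, 0.9, medians, betas, CORR)
ind_all, ind_any = joint_fragility(0.8, 0.9, medians, betas, IDENTITY)
print(f"3/3: dependent {dep_all:.3f}, independent {ind_all:.3f}")
print(f"1/3: dependent {dep_any:.3f}, independent {ind_any:.3f}")
```

Because only the ratio of weighted sums is used, neither the hazard constant nor the normalizing constant of the truncated normal has to be known. Both runs share one seed, so the dependent and independent cases are compared on common random numbers.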
The observed marginal distributions, the copula and the given ground motion intensity determine the joint fragility.

Table 5-2: Four cases for examination purpose

| Case No. | Dependency             | System Configuration (for Failure Logic) |
|----------|------------------------|------------------------------------------|
| 1        | Dependent components   | Intersection of components (3/3)         |
| 2        | Dependent components   | Union of components (1/3)                |
| 3        | Independent components | Intersection of components (3/3)         |
| 4        | Independent components | Union of components (1/3)                |

Four cases are established depending on the system configuration with or without dependencies, as displayed in Table 5-2. The joint fragility is then displayed in Figure 5-4 for both the independent and dependent cases. The impact of correlations on the joint fragility varies significantly. At low accelerations of less than 0.5 g, the influence of correlation on the intersection cases is quite small, which may indicate that the system could be treated as independent. At high acceleration levels, the dependent effects become significant in both the intersection and union cases. In the intersection case, however, treating the components as fully independent would significantly underestimate the seismic risks. The correlation for the union cases always reduces the likelihood of joint failure, while the correlation for the intersection cases always increases the tendency of joint failure. These patterns are consistent with the rules used in Zhou et al. [73] to justify the performance of the Reed-McCann method.

Figure 5-4: Results for the dependent and independent cases in Table 5-2 using the simulation-based scheme displayed in Figure 5-3

5.4.2. Multi-Unit Site Seismic Risk Quantification

This section discusses the development of the site-level fragility, in which the discretization-based scheme is formulated using the CCF modeling modules (e.g., β-factor model) in standard PRA software tools such as SAPHIRE. Section 5.4.2.1 discusses the selection of a reference level of ground motion in the discretization-based scheme.
A separate computational tool is developed in R to combine the simulation-based scheme with parallel Monte Carlo simulation [118] to perform the parametric estimation of each correlated group. The parametric estimation process is parallelized to decrease the computational burden. Specifically, one full Monte Carlo simulation is carried out concurrently for each discretized ground motion interval. These parametric estimates are then input to the PRA model coded in SAPHIRE. The discretization-based scheme is demonstrated using the correlated group consisting of three identical components discussed in Section 5.4.2.2. In Section 5.4.2.3, we present a comparison between the SAPHIRE results and the simulation results obtained by directly using the simulation-based scheme.

5.4.2.1. Selection of the Reference Level of Ground Motion

As discussed in Section 5.3.3, the validity of the selected reference ground motion level in the current practice of the discretization-based scheme should be carefully evaluated. The simulation-based scheme proposed in Section 5.4.1 can be a practical tool to provide a baseline to check the performance of the discretization-based scheme, since it can accommodate the uncertainties from the hazard into the fragility analysis.

Suppose one is interested in the joint fragility of a three-component system, and all these components are identical and independent of each other. The component properties are: . The seismic hazard curve is shown in Figure 5-5, with the seismic hazard data fitted by a power relationship of the form of Equation (5-1) with $R^2 = 0.9961$. The first step is then to divide the magnitudes into an appropriate number of intervals, which is done as listed in Table 5-3. The frequency is calculated as the difference between the frequencies at the range limits of each interval.
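The two candidate reference levels can be compared with a short calculation. The sketch below recomputes the geometric-mean column of Table 5-3 from the interval limits and then evaluates the failure probability of a system of three identical, independent components at both reference levels. The median capacity of 0.8 g and composite logarithmic standard deviation of 0.3 are hypothetical placeholders, and "k/3" is read here as "at least k of the three components fail"; both are assumptions made for illustration.

```python
from math import comb, log, sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

# PGA intervals (g) as listed in Table 5-3.
INTERVALS = [(0.05, 0.25), (0.25, 0.40), (0.40, 0.50), (0.50, 0.60),
             (0.60, 0.70), (0.70, 0.80), (0.80, 0.90), (0.90, 1.00),
             (1.00, 1.05), (1.05, 1.10), (1.10, 1.15), (1.15, 1.20)]


def geometric_mean(lo, hi):
    """Geometric mean of the two bin limits."""
    return sqrt(lo * hi)


def k_of_n_failure(p, k, n=3):
    """Probability that at least k of n identical, independent components fail."""
    return sum(comb(n, m) * p ** m * (1 - p) ** (n - m) for m in range(k, n + 1))


def system_cpf(a_ref, k, median=0.8, beta_c=0.3):
    """System failure probability at a reference PGA (assumed median, beta_c)."""
    p = PHI.cdf(log(a_ref / median) / beta_c)
    return k_of_n_failure(p, k)


geo_means = [round(geometric_mean(lo, hi), 2) for lo, hi in INTERVALS]
for (lo, hi), g in zip(INTERVALS, geo_means):
    print(f"{lo:.2f}-{hi:.2f} g  geometric mean {g:.2f} g  "
          f"2/3 CPF: {system_cpf(g, 2):.2e} (geo) vs {system_cpf(hi, 2):.2e} (upper)")
```

Since the fragility is monotonically increasing in the reference PGA, the upper-limit estimate always bounds the geometric-mean estimate from above; whether the geometric mean falls below the true interval-averaged value is what the simulation baseline is used to check.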
Figure 5-5: Example seismic hazard curve

Table 5-3: PGA intervals and frequency

| Index | PGA Interval (g) | PGA – Geometric Mean (g) | Frequency |
|-------|------------------|--------------------------|-----------|
| 1     | 0.05-0.25        | 0.11                     | 1.80E-03  |
| 2     | 0.25-0.40        | 0.32                     | 5.11E-05  |
| 3     | 0.40-0.50        | 0.45                     | 1.21E-05  |
| 4     | 0.50-0.60        | 0.55                     | 6.71E-06  |
| 5     | 0.60-0.70        | 0.65                     | 4.10E-06  |
| 6     | 0.70-0.80        | 0.75                     | 2.69E-06  |
| 7     | 0.80-0.90        | 0.85                     | 1.86E-06  |
| 8     | 0.90-1.00        | 0.95                     | 1.35E-06  |
| 9     | 1.00-1.05        | 1.02                     | 5.37E-07  |
| 10    | 1.05-1.10        | 1.07                     | 4.67E-07  |
| 11    | 1.10-1.15        | 1.12                     | 4.09E-07  |
| 12    | 1.15-1.20        | 1.17                     | 3.60E-07  |

Four cases are designed for testing purposes, as shown in Table 5-4, depending on the system configuration and the choice of reference PGA level. The validity of the selected reference ground motion is examined by applying the importance sampling approach in Section 5.4.1.1 to account for the uncertainty within each PGA interval; the results are employed as the baseline.

Table 5-4: Four cases to be analyzed for the selection of reference PGA level

| Case No. | System Configuration (for Failure Logic) | Reference PGA level |
|----------|------------------------------------------|---------------------|
| 1        | 2/3                                      | Upper Limit         |
| 2        | 3/3                                      | Upper Limit         |
| 3        | 2/3                                      | Geometric Mean      |
| 4        | 3/3                                      | Geometric Mean      |

Table 5-5: Results for the system of 2/3 failures

| Index | PGA Interval (g) | Simulated Cumulative Probability of Failure (CPF) | CPF Assuming Geometric Mean | CPF Assuming Upper Limit |
|-------|------------------|---------------------------------------------------|-----------------------------|--------------------------|
| 1     | 0.05-0.25        | 0                                                 | 0                           | 0                        |
| 2     | 0.25-0.40        | 1.61E-06                                          | 0                           | 2.32E-05                 |
| 3     | 0.40-0.50        | 5.85E-04                                          | 3.42E-04                    | 2.71E-03                 |
| 4     | 0.50-0.60        | 1.52E-02                                          | 1.34E-02                    | 4.61E-02                 |
| 5     | 0.60-0.70        | 1.10E-01                                          | 1.19E-01                    | 2.41E-01                 |
| 6     | 0.70-0.80        | 3.46E-01                                          | 4.04E-01                    | 5.78E-01                 |
| 7     | 0.80-0.90        | 6.34E-01                                          | 7.32E-01                    | 8.46E-01                 |
| 8     | 0.90-1.00        | 8.42E-01                                          | 9.20E-01                    | 9.62E-01                 |
| 9     | 1.00-1.05        | 9.30E-01                                          | 9.72E-01                    | 9.83E-01                 |
| 10    | 1.05-1.10        | 9.61E-01                                          | 9.90E-01                    | 9.93E-01                 |
| 11    | 1.10-1.15        | 9.79E-01                                          | 9.95E-01                    | 9.97E-01                 |
| 12    | 1.15-1.20        | 9.89E-01                                          | 9.98E-01                    | 9.99E-01                 |

The simulation and SAPHIRE results for the two types of system configurations are summarized in Table 5-5 and Table 5-6, respectively.
It is observed that when the geometric mean is selected as the reference PGA level, the SAPHIRE estimates can be non-conservative (for example, intervals 2 to 4 in Table 5-5 and intervals 3 to 6 in Table 5-6, where the geometric-mean CPF falls below the simulated CPF), which means that significant contributions to the assessed seismic risk might be omitted. On the other hand, selecting the upper limit as the reference PGA ensures that all significant contributions are considered over the whole range of ground motions. This suggests that the upper limit of the two bin limits is an appropriate choice for the reference PGA level. This observation supports the recommendations in the seismic PRA guidance of the IAEA [129].

Table 5-6: Results for the system of 3/3 failure

| Index | PGA Interval (g) | Simulated Cumulative Probability of Failure (CPF) | CPF Assuming Geometric Mean | CPF Assuming Upper Limit |
|-------|------------------|---------------------------------------------------|-----------------------------|--------------------------|
| 1     | 0.05-0.25        | 0                                                 | 0                           | 0                        |
| 2     | 0.25-0.40        | 0                                                 | 0                           | 0                        |
| 3     | 0.40-0.50        | 4.17E-06                                          | 1.22E-06                    | 2.71E-05                 |
| 4     | 0.50-0.60        | 4.78E-04                                          | 3.00E-04                    | 1.95E-03                 |
| 5     | 0.60-0.70        | 9.78E-03                                          | 8.38E-03                    | 2.61E-02                 |
| 6     | 0.70-0.80        | 6.55E-02                                          | 6.31E-02                    | 1.25E-01                 |
| 7     | 0.80-0.90        | 2.12E-01                                          | 2.12E-01                    | 3.16E-01                 |
| 8     | 0.90-1.00        | 4.25E-01                                          | 4.29E-01                    | 5.39E-01                 |
| 9     | 1.00-1.05        | 5.90E-01                                          | 5.81E-01                    | 6.40E-01                 |
| 10    | 1.05-1.10        | 6.84E-01                                          | 6.76E-01                    | 7.26E-01                 |
| 11    | 1.10-1.15        | 7.62E-01                                          | 7.56E-01                    | 7.96E-01                 |
| 12    | 1.15-1.20        | 8.24E-01                                          | 8.20E-01                    | 8.51E-01                 |

5.4.2.2. Parametric Estimation

For each correlated group, the parametric estimates can be derived by constructing a simulation scenario under the given failure criteria. In the current study, the β-factor model is employed, where one β-factor is estimated for each correlated group. Unlike in internal CCF modeling, the value of the β-factor varies with the ground motion intensity and the capacities. The strategy for numerically implementing the parametric estimation is a parallel Monte Carlo simulation, where one full Monte Carlo simulation is carried out for each discretized ground motion interval. The β-factor is then derived from the results of the parallel Monte Carlo simulation and used as input to the CCF modeling approach.
A flowchart of the proposed parametric estimation approach is displayed in Figure 5-6.

Figure 5-6: Flowchart of the discretization-based scheme at the site level

The parametric estimation is demonstrated by three correlated groups where one β-factor is derived for each group. Suppose each group contains three nominally identical SSCs. In this case, the generic fragilities of each type of SSC are displayed in Table 5-7, representing the containment building, heat exchanger and manual valve.

Table 5-7: SSCs properties [131]

| SSC                  | $A_m$ (g) | $\beta_C$ | $\beta_R$ | $\beta_U$ |
|----------------------|-----------|-----------|-----------|-----------|
| Containment Building | 1.10      | 0.46      | 0.30      | 0.35      |
| Heat Exchanger       | 1.90      | 0.46      | 0.30      | 0.35      |
| Manual Valve         | 3.80      | 0.61      | 0.35      | 0.50      |

Within each correlated group, it is assumed that the correlation coefficient between each pair of SSCs is 0.8. Given the twelve PGA intervals in Table 5-3, we execute the simulation-based scheme in Figure 5-3, in which the β-factors are estimated as the fraction of dependent failures involving more than a single component, as represented in Equation (5-12), where the denominator denotes the number of all failures and the numerator denotes the number of dependent failures.

$$\beta = \frac{\sum_{j=1}^{N}\Big\{ I\Big[\sum_{i=1}^{n} X_{ij} \ge 2\Big] \sum_{i=1}^{n} X_{ij} \Big\}\frac{H(a_j)}{g(a_j)}}{\sum_{j=1}^{N}\Big\{ I\Big[\sum_{i=1}^{n} X_{ij} \ge 1\Big] \sum_{i=1}^{n} X_{ij} \Big\}\frac{H(a_j)}{g(a_j)}} \tag{5-12}$$

Figure 5-7 displays the β-factor for the three correlated groups. In general, the β-factor is strongly dependent on the acceleration. In the low acceleration range, the β-factor is quite small and the SSCs in the system might be treated as independent. With increasing acceleration, the likelihood of concurrent failures increases rapidly. At the same acceleration, for the SSCs with different fragilities, it is seen that the heat exchanger group and manual valve group are rugged and their β-factors are low, while for the containment building the β-factor is high and approaches unity in the high acceleration range. This demonstrates that increasing the capacities of the SSCs can significantly reduce the influence of seismic-induced dependencies.
In summary, the perfect dependence assumption (i.e., setting the β-factor to unity) would be a highly conservative approach.

Figure 5-7: β-factor with equivalent correlation coefficient of 0.8

5.4.2.3. Example of Parametric Estimation

This section outlines a method to estimate the system fragility using the discretization-based scheme in Figure 5-6. The three-component system described in Section 5.4.2.1 is selected, and all the components are assumed to be dependent with a correlation coefficient of 0.8. The first step is to determine the β-factor as discussed in Section 5.4.2.2. The simulated value of the β-factor is illustrated in Figure 5-8 and is then used as the input to the SAPHIRE model.

Figure 5-8: β-factor for the three-component system

The PRA models for both the 2/3 and 3/3 systems are built in SAPHIRE and used to calculate the joint fragility. In addition, the approach proposed in Section 5.4.1 is used to generate the baseline results for the same scenario. Table 5-8 summarizes and compares the SAPHIRE results and the simulation results. It is observed that the SAPHIRE results are more conservative than the simulation results. This is expected since the upper limit of each discrete interval is selected as the reference acceleration.
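The β-factor that feeds the SAPHIRE model can be estimated with a small simulation. The sketch below follows one reading of Equation (5-12), counting component failures that occur in sample sets with two or more failures and dividing by all component failures. It simplifies the scheme by fixing the ground motion at a single reference PGA instead of importance-sampling within the interval, and it uses the containment building median (1.10 g) and composite logarithmic standard deviation (0.46) with an equi-correlated group of three; the function name `beta_factor` and the sample size are hypothetical choices.

```python
import random
from math import exp, sqrt


def beta_factor(a_ref, median, beta_c, rho, n_comp=3, n=20000, seed=7):
    """Fraction of component failures that occur together with another failure.

    Capacities are lognormal and equi-correlated with coefficient rho,
    generated from one shared and one individual standard normal variate.
    """
    rng = random.Random(seed)
    all_failures = 0
    dependent_failures = 0
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)
        n_failed = 0
        for _ in range(n_comp):
            z = sqrt(rho) * shared + sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            capacity = median * exp(beta_c * z)
            if capacity <= a_ref:  # failure: capacity does not exceed demand
                n_failed += 1
        all_failures += n_failed
        if n_failed >= 2:  # failures involving more than a single component
            dependent_failures += n_failed
    return dependent_failures / all_failures if all_failures else 0.0


for pga in (0.4, 0.6, 0.8, 1.0, 1.2):
    beta = beta_factor(pga, median=1.10, beta_c=0.46, rho=0.8)
    print(f"reference PGA {pga:.1f} g  beta-factor {beta:.2f}")
```

Even with zero correlation this ratio is not zero, because independent failures can coincide by chance; the correlation raises it further, and it tends toward unity once the reference PGA is well above the median capacity.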
Table 5-8: SAPHIRE results and simulation results

| Index | PGA Interval (g) | Reference PGA (g) | Simulation CPF (2/3 System) | SAPHIRE CPF (2/3 System) | Simulation CPF (3/3 System) | SAPHIRE CPF (3/3 System) |
|-------|------------------|-------------------|-----------------------------|--------------------------|-----------------------------|--------------------------|
| 1     | 0.05-0.25        | 0.25              | 0                           | 0                        | 0                           | 0                        |
| 2     | 0.25-0.40        | 0.40              | 2.14E-04                    | 6.63E-04                 | 5.44E-05                    | 6.40E-04                 |
| 3     | 0.40-0.50        | 0.50              | 8.42E-03                    | 1.32E-02                 | 2.93E-03                    | 1.05E-02                 |
| 4     | 0.50-0.60        | 0.60              | 5.76E-02                    | 1.03E-01                 | 2.57E-02                    | 6.12E-02                 |
| 5     | 0.60-0.70        | 0.70              | 1.89E-01                    | 3.77E-01                 | 1.03E-01                    | 2.00E-01                 |
| 6     | 0.70-0.80        | 0.80              | 3.89E-01                    | 7.29E-01                 | 2.51E-01                    | 4.38E-01                 |
| 7     | 0.80-0.90        | 0.90              | 5.96E-01                    | 9.31E-01                 | 4.41E-01                    | 6.93E-01                 |
| 8     | 0.90-1.00        | 1.00              | 7.63E-01                    | 9.89E-01                 | 6.23E-01                    | 8.70E-01                 |
| 9     | 1.00-1.05        | 1.05              | 8.53E-01                    | 9.97E-01                 | 7.39E-01                    | 9.26E-01                 |
| 10    | 1.05-1.10        | 1.10              | 8.94E-01                    | 9.99E-01                 | 7.99E-01                    | 9.58E-01                 |
| 11    | 1.10-1.15        | 1.15              | 9.25E-01                    | 1.00                     | 8.48E-01                    | 9.77E-01                 |
| 12    | 1.15-1.20        | 1.20              | 9.48E-01                    | 1.00                     | 8.87E-01                    | 9.88E-01                 |

5.5. Example Application to the Seismic MUPRA

This section illustrates an application of the proposed approach by constructing a seismic-induced accident scenario involving concurrent SLOCA at a generic site consisting of two advanced reactor units at power. The SLOCA event is chosen as the representative case because of the many small pipes and tap lines [105, 149]. In this example, the SLOCA is assumed to be caused by a seismic-induced break outside of the containment [150]. The objective herein is to demonstrate the process and to understand the influence on the site risk of seismic-induced multi-unit common cause failures between the SSCs across reactor units. In the absence of information to support the correlation specifications, the equi-correlated model [151] is selected as a reasonable characterization model, which means that only one correlation coefficient needs to be specified between similar or identical SSCs. As a sensitivity study, varying degrees of correlation are considered across the reactor units to examine sensitivities to the assumptions regarding correlations of SSCs across reactor units.
When considering the frequency of seismic-induced SLOCA, it is assumed that the seismic-induced SLOCAs are fully correlated. In other words, the SLOCAs are always assumed to occur concurrently in both reactor units. The multi-unit risk is defined in terms of site core damage, and three types of multi-unit CDF metrics are discussed. The definition of each metric is summarized as follows.

• Total Site CDF: defined as the frequency of one or more core damage events; for example, this definition corresponds to the union of the core damage events of Unit 1 and Unit 2.
• Concurrent CDF: defined as the frequency of multiple core damage events occurring nearly simultaneously; for example, two core damage events in Unit 1 and Unit 2.
• Marginal CDF: defined as the frequency of core damage events of one unit that includes consideration of all states of the other units affecting this unit; for example, the core damage events in Unit 1 including consideration of all states of Unit 2.

5.5.1. A Two Reactor Unit Site Seismic PRA

The two reactor units are assumed to be identical and symmetrically constructed. A single-unit seismic PRA model is adopted [150], and the corresponding event tree results in thirteen core damage sequences. The multi-unit seismic PRA is established from the existing single-unit seismic PRA by superimposing the seismic-induced multi-unit CCF between the identical SSCs across reactor units through the Level 1 fault trees according to the MUPRA methodology proposed by Modarres et al. [25]. The seismic hazard data [152, 153] developed for the eastern United States were used and divided into ten PGA intervals as shown in Table 5-9. The reference PGA is selected as the upper limit of each discrete interval, and the frequency is calculated as the difference between the frequencies at the range limits of each interval.
The SLOCA initiating event frequency is estimated based on the generic conditional probability of occurrence of SLOCA developed from the piping calculations in the SSMRP [108]. The annual frequency of core damage for the seismically initiated events is then computed using the initiating event frequencies listed in Table 5-9. The SSCs' fragility data are taken from the generic fragility database available from published articles and reports [105, 131, 154-156]. A sensitivity study is then performed to investigate the influence of the correlation coefficient on the multi-unit CDF. Five cases are constructed: independent (i.e., 0), partial (i.e., 0.3, 0.5 and 0.8) and full dependency (i.e., 1.0), respectively.

Table 5-9: PGA intervals and frequency

| Index | PGA Interval (g) | Reference PGA (g) | Exceedance Frequency (1/yr.) | SLOCA Initiating Event Frequency (1/yr.) |
|-------|------------------|-------------------|------------------------------|------------------------------------------|
| 1     | 0.05-0.25        | 0.25              | 1.15E-03                     | 5.75E-05                                 |
| 2     | 0.25-0.45        | 0.45              | 5.70E-05                     | 3.42E-06                                 |
| 3     | 0.45-0.65        | 0.65              | 1.62E-05                     | 3.23E-06                                 |
| 4     | 0.65-0.85        | 0.85              | 7.02E-06                     | 2.81E-06                                 |
| 5     | 0.85-1.00        | 1.00              | 2.99E-06                     | 1.91E-06                                 |
| 6     | 1.00-1.10        | 1.10              | 1.42E-06                     | 1.08E-06                                 |
| 7     | 1.10-1.20        | 1.20              | 1.12E-06                     | 9.87E-07                                 |
| 8     | 1.20-1.30        | 1.30              | 9.02E-07                     | 8.20E-07                                 |
| 9     | 1.30-1.40        | 1.40              | 7.37E-07                     | 7.37E-07                                 |
| 10    | 1.40-1.50        | 1.50              | 6.12E-07                     | 6.12E-07                                 |

5.5.2. The Two-Unit Site Seismic PRA Results

The software SAPHIRE is used to calculate the conditional core damage probability (CCDP) in each PGA interval, and the corresponding CDF is calculated by multiplying the CCDP by the initiating event frequency of that PGA interval. The final multi-unit CDF is then derived by summing the CDFs of all the PGA intervals. The results are summarized for the three types of metrics in Figure 5-9, Figure 5-10 and Figure 5-11, respectively. These figures summarize the mean CDF estimates for the five correlation strengths and show the contribution of dependency in each PGA interval.
The results are useful to examine the impact of the correlation assumptions and to identify the important risk contributors at different PGA levels as the correlation conditions vary. The important insights are summarized as follows:

1. Compared to the conservative full correlation assumption, all three CDF metrics show a significant reduction of the multi-unit CDF even for a small relaxation of the complete dependence assumption (i.e., when the correlation is changed from 1.0 to 0.8).
2. At the higher correlations, the main sensitive region shifts to the lower end of the site fragility curve. The most important risk contributors become the SSCs with lower fragilities and potentially higher correlations. Hence, reducing the degree of correlation for the relatively weak critical SSCs would help enhance the site safety.
3. In the high PGA intervals, the concurrent CDF approaches the total site CDF. In other words, the CCDP is close to unity and both reactor units would fail together.
4. With respect to the potential correlation assumption, the most sensitive region is the middle region of the site fragility curve, while the low end and high end of the site fragility curve are less sensitive. Specifically, the effect of the seismic capacity of the SSCs on site safety is remarkable in the middle PGA interval, around 0.3g to 0.5g. Ruggedizing components in this interval would enhance the site safety.
5. Comparing the three types of CDF metrics shows that the contribution of dependency is the most significant when applying the concurrent CDF metric. This means the concurrent CDF metric is the most sensitive to the underlying assumption on the correlation coefficient. Choosing the total site CDF or marginal CDF as the multi-unit metric could provide a less conservative estimate.
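The bookkeeping behind the three metrics can be sketched as follows. The initiating event frequencies are those of Table 5-9, but the per-interval CCDPs are hypothetical placeholders (the SAPHIRE outputs are not reproduced in the text), and the union for two identical units is formed by inclusion-exclusion. The sketch is an illustration of the definitions only and is not intended to reproduce Table 5-10.

```python
# SLOCA initiating event frequencies (1/yr.) per PGA interval, from Table 5-9.
IE_FREQ = [5.75e-05, 3.42e-06, 3.23e-06, 2.81e-06, 1.91e-06,
           1.08e-06, 9.87e-07, 8.20e-07, 7.37e-07, 6.12e-07]

# Hypothetical CCDPs per interval (NOT from the source): core damage in one
# given unit, and core damage in both units at once.
CCDP_UNIT = [0.001, 0.02, 0.15, 0.45, 0.75, 0.90, 0.96, 0.99, 1.00, 1.00]
CCDP_BOTH = [0.0002, 0.008, 0.09, 0.35, 0.68, 0.87, 0.95, 0.985, 1.00, 1.00]


def site_metrics(ie_freq, ccdp_unit, ccdp_both):
    """Sum CCDP * frequency over the PGA intervals for each metric."""
    marginal = sum(f * p for f, p in zip(ie_freq, ccdp_unit))
    concurrent = sum(f * p12 for f, p12 in zip(ie_freq, ccdp_both))
    # Union of two identical units: P(U1 or U2) = 2 * P(U1) - P(U1 and U2).
    total = sum(f * (2.0 * p - p12)
                for f, p, p12 in zip(ie_freq, ccdp_unit, ccdp_both))
    return {"total_site": total, "concurrent": concurrent, "marginal": marginal}


def concurrent_share(ccdp_unit, ccdp_both):
    """Per-interval contribution of concurrent core damage to the site total."""
    return [p12 / (2.0 * p - p12) for p, p12 in zip(ccdp_unit, ccdp_both)]


metrics = site_metrics(IE_FREQ, CCDP_UNIT, CCDP_BOTH)
shares = concurrent_share(CCDP_UNIT, CCDP_BOTH)
for name, value in metrics.items():
    print(f"{name}: {value:.2e} per year")
```

With these placeholder inputs the concurrent share rises toward unity in the highest PGA intervals, which is the behavior described in insight 3.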
Figure 5-9: Results for the total site CDF metric

Figure 5-10: Results for the concurrent CDF metric

Figure 5-11: Results for the marginal CDF metric

Figure 5-12: Contribution of concurrent CDF to total site CDF

Figure 5-12 shows the contribution of the concurrent CDF to the total site CDF. For PGA levels below 0.4 g, the contribution from the concurrent CDF is relatively small, since independent failures are dominant within the low PGA intervals. As the PGA level increases, the likelihood of concurrent core damage increases rapidly, as indicated by the abrupt slope starting at around 0.7 g. At this point, all the units become more likely to fail simultaneously. For high PGA levels of more than 1.0 g, the contribution of the concurrent CDF approaches 100%: the concurrent CDF approaches the total site CDF, and it is very likely that both reactor units on site would fail simultaneously. This is intuitive, since an extremely large ground motion would lead to concurrent core damage.

The multi-unit CDF results shown in Table 5-10 summarize the corresponding CDFs over the ten PGA intervals. As confirmed earlier, a moderate relaxation of the complete dependency assumption could significantly reduce the multi-unit CDF. The fully independent assumption would lead to 7.93%, 20.92% and 17.69% underestimation for the total site CDF, concurrent CDF, and marginal CDF metrics, respectively, with respect to a correlation coefficient of 1.0. Again, this confirms that the concurrent CDF metric is the most sensitive to the value of the correlation coefficient. The total site CDF metric is shown to be the least sensitive and, hence, should be used as a relative multi-unit CDF metric when no correlation data is available.
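The underestimation percentages quoted above can be checked against the tabulated CDFs for correlation coefficients of 0 and 1.0 in Table 5-10. The sketch below assumes that "underestimation" means the relative difference with respect to the fully correlated value; under that assumption the rounded table entries reproduce the quoted figures to within rounding. The function name `underestimation_percent` is introduced here for illustration.

```python
# Multi-unit CDF values from Table 5-10:
# (correlation coefficient 0, correlation coefficient 1.0).
table_5_10 = {
    "total site": (8.83e-06, 9.59e-06),
    "concurrent": (6.54e-06, 8.27e-06),
    "marginal": (7.26e-06, 8.82e-06),
}


def underestimation_percent(independent_cdf, full_cdf):
    """Relative shortfall of the independent case versus full correlation, in percent.

    ASSUMPTION: this is the definition behind the percentages quoted in the text.
    """
    return 100.0 * (full_cdf - independent_cdf) / full_cdf


underestimation = {
    metric: underestimation_percent(independent, full)
    for metric, (independent, full) in table_5_10.items()
}

for metric, percent in underestimation.items():
    print(f"{metric:>10s} CDF: {percent:.2f}% lower under independence")
```

The ordering of the three results (concurrent largest, total site smallest) is the basis for the sensitivity ranking stated in the text.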
Table 5-10: Multi-unit CDF results

Correlation Coefficient   Total Site CDF   Concurrent CDF   Marginal CDF
0                         8.83E-06         6.54E-06         7.26E-06
0.30                      9.00E-06         7.05E-06         7.79E-06
0.50                      9.07E-06         7.19E-06         7.92E-06
0.80                      9.22E-06         7.53E-06         8.19E-06
1.00                      9.59E-06         8.27E-06         8.82E-06

5.6. Conclusions and Recommendations

This paper presented an improved hybrid approach to evaluating the seismic multi-unit risk, and a case study was developed for a seismic-induced SLOCA at a hypothetical two-unit nuclear power plant site. The improved approach to external MUPRA considers unit-to-unit dependencies based on a hybrid scheme that integrates the copula notion, importance sampling, parallel Monte Carlo simulation and the use of the standard PRA model and software tools. In doing so, the proposed approach achieves a balance between estimation accuracy and computational simplicity. A review of current seismic risk assessment practice highlighted its shortcomings in modeling seismic-induced dependencies. The proposed methodology was applied to a three-component example and demonstrated the issues related to the use of the geometric mean as the reference level of the ground motion. Three multi-unit CDF metrics were calculated for this case study. It was concluded that, given the correlations between the SSCs, the total site CDF metric would be a more appropriate multi-unit CDF metric than the concurrent CDF or marginal CDF metrics. It was also demonstrated that the effect of the seismic capacity of SSCs on site safety is more important in the midrange of PGAs.

Acknowledgements

The research presented in this paper was partly funded under the US NRC grant NRCHQ6014G0015. Any views presented in this paper are those of the authors and do not reflect an official position of the US NRC.

Chapter 6: A Common Cause Failure Model for Components under Age-Related Degradation¹²

6.1.
Abstract

¹² The full text of this chapter forms a paper currently (i.e., April 2018) under review at the Journal of Reliability Engineering & System Safety.

This paper reports on the effect of age-related degradation in hardware components on the likelihood of common cause failures (CCFs). It proposes a CCF model for components that undergo age-related degradation that superimposes the impacts of maintenance on the component degradation evolutions inferred from condition monitoring data. Major limitations of the state-of-the-art parametric CCF models are discussed, and recent enhancements including existing gaps are summarized. To bridge the gaps, a new approach is proposed to exploit recent advances in sensor-based data analytic algorithms. The approach involves a state-space based degradation model that describes component degradation processes based on either a physics-based model or a data-driven model. The model uses a degradation index based on features of the sensor monitoring data. The CCF impacts are then characterized as a function of time based on the component degradation states. The proposed approach characterizes the CCF impacts based on the conventional parametric CCF model, but unlike the parametric CCF models, the parameters are derived from estimated degradation states and any renewal or component rejuvenation achieved through maintenance. As such, the proposed parametric CCF model is specific to the component being analyzed and is dynamic over lifetime service. The β-factor model is adopted without loss of generality, and two scenarios are presented depending on the availability of sensor monitoring data. The first scenario is a sensor-driven scenario that estimates degradation state from the sensor monitoring data of components using a recursive Bayesian approach.
The second scenario is simulation-based, whereby the component degradation evolution is simulated under an imperfect maintenance regime to estimate the CCF over the component's lifetime. A laboratory-based degradation test of three identical centrifugal pumps generated several types of sensor monitoring data until failure. The results show that component degradation and maintenance practices could significantly affect the CCF estimates and that treating CCF with the traditional generic CCF parameters would underestimate plant risks as components degrade. This study also introduces physical evidence to CCF research for application in the multi-unit nuclear power plant Probabilistic Risk Assessment (PRA).

6.2. Introduction

The term common cause dependencies encompasses the possible mechanisms that directly compromise component performances and ultimately cause degradation or failure of multiple components, referred to as common cause failure (CCF) events [3]. CCF events are a major contributor to the risk imposed on most engineering systems and notably on nuclear power plants, where considerable research efforts have been devoted to modeling the impacts of CCF on plant risks. The relevant CCF models [2] may be grouped into two major categories: shock models (e.g., the binomial failure rate model) and non-shock models. The non-shock models, including the β-factor model, the α-factor model and the multiple Greek letter model, have been widely adopted in probabilistic risk assessment (PRA) practices [4]. In these models, the CCF events are characterized by static CCF parameters that need to be quantified through statistical analysis based on historical observations and engineering judgment [5, 6]. However, these CCF models suffer from several major limitations, summarized as follows:

• The models are built mainly from generic operational experience and are usually not specific to the operating conditions of individual components.
• The number of observed failure events, particularly in nuclear power plants, is very limited, especially for events involving failures of more than one identical or similar component.
• It is difficult to model asymmetrical components and to account for the dependencies among the components within multiple common cause component groups.

To address these limitations, four main approaches have been reported in the present literature to enhance these CCF models:

• Improve the quality and quantity of the CCF database by compiling the CCF event data in a more consistent manner. For example, the International Common-Cause Failure Data Exchange (ICDE) Project [7] has been established to obtain both qualitative and quantitative insights into CCF by properly integrating many international experiences.
• Formulate a causal CCF model to account for the relationship of specific root causes and coupling factors to the CCF events. A Bayesian network is usually adopted to establish the causal framework to probabilistically link all relevant sources. Examples include the unified partial method and its extension, referred to as Zitrou's model [8], the Kelly-CCF method [9], the alpha-decomposition method [10], and the general dependency model [11].
• Extend the scope of the current parametric CCF to include both identical and diverse component groups: for example, from multiple nuclear reactor units on a common site. This includes the recent works of Fleming [31], Ebisawa et al. [72], and Modarres et al. [25, 35].
• Address other limitations of the current CCF models: for instance, by treating the dependencies among the components across multiple common cause component groups [12, 13], improving the uncertainty treatment [14, 15], and developing extensions of current CCF models [16, 17].
The implicit assumption of the present parametric CCF models is a constant failure rate, where failures are treated as fully random and the effects of degradation are not considered. The validity of this implicit assumption is debatable, especially when components are subject to harsh environmental challenges (e.g., high temperatures, corrosive fluid). Indeed, the nuclear industry is faced with concerns due to plant aging and plant life extension, where the effects of CCF would be paramount. Research efforts presented by the US NRC [157], IAEA [158, 159] and CNSC [160] discuss these concerns. It is also evident from the nuclear industry's operational experience [161] that the aging impact of CCF on plant risks is important. However, how to properly consider the aging impact in CCF modeling remains an open and challenging issue. The primary objective of this paper is to address this issue.

Figure 6-1: CCF for components under age-related degradation

It is important to investigate the dynamic characteristics of CCF for components undergoing age-related degradation (e.g., wear, corrosion, fatigue, erosion). As displayed in Figure 6-1, CCF events are caused by age-related degradation processes [162] that result in cumulative degradation in multiple components, impairing their capacities to perform the design function. Typically, the common cause dependencies are characterized [2] by three related root causes (pre-operational, operational-maintenance and operational-environment-related) and three coupling factors (hardware-based, operation-based and environment-based). While the root causes of most CCF events are attributed to age-related degradation processes [162], other root causes involving extreme loads and shock impacts (e.g., under seismic, flood and fire conditions), and root causes that leave multiple components in inoperable states (e.g., due to maintenance errors), are excluded from this model.
Nevertheless, nearly all CCF coupling factors [161] are influenced by component aging. That is, CCF events from all the environment-based coupling factors, including same component location and internal environment/working medium, some of the operation-based coupling factors, such as same operating procedure and same maintenance/test/calibration, and some of the hardware-based coupling factors, such as component configuration and the attributes of manufacturing, construction and installation, can be attributed to age-related degradation.

To bridge the existing gaps and assumptions described above, this paper proposes a novel approach to modeling degradation-related CCF events by integrating the maintenance impacts and the component degradation evolution that can be characterized through condition monitoring data. The proposed approach consists of two main parts and adopts the CCF β-factor model without loss of generality. The first part focuses on the component degradation assessment. Specifically, the degradation state of each component is characterized through a degradation index obtained by extracting features of data obtained from sensors that monitor evidence of degradation. Based on the proposed degradation index, a state-space based degradation model is built to describe the component degradation evolution; this model considers the variations both within and across involved components. In the second part, the time-dependency of CCF events is estimated based on the detected degradation evolution. At each time instant, the β-factor for CCF probability is estimated as the fraction of the degradation states of multiple components that simultaneously exceed each component's endurance to degradation. The estimation of the β-factor for CCF probability, however, follows the conventional parametric CCF model. Accordingly, the scope of the parametric CCF model is dynamic over lifetime service rather than static.
The component degradation evolution under imperfect maintenance and renewal is also simulated to support the CCF estimation over the lifetime. The maintenance effect on CCF is also investigated through sensitivity studies. The primary focus of this research is to advance the state-of-the-art CCF analysis by exploiting the opportunities provided by recent advances in sensor-based techniques that facilitate understanding of the component degradation evolution [163, 164]. Note that the terms degradation state and degradation index are used interchangeably in this paper, since the component degradation state is characterized by the proposed degradation index. The key elements of the proposed approach are summarized as follows:

• Integration of components' degradation evolutions to model CCF.
• Generalization of the common cause influences among similar or even slightly dissimilar components with shared features.
• Introduction of a new way to quantify the CCF.
• Infusion of physics-based information to the CCF.
• Reliance on a large amount of sensor-based condition monitoring data, to complement the scarcity of failure data.

The proposed approach is demonstrated below by a test rig generating diverse sensory data acquired from a special-purpose experiment involving a redundant pump system at the University of Maryland. Three centrifugal pumps were continuously tested, and the common cause dependencies were monitored and established through application of shared environment, identical system design and proximity. The pump conditions were monitored using three types of techniques: process monitoring, vibration monitoring, and acoustic emission (AE) monitoring. Development of pump failure analysis, degradation assessment and condition-based maintenance policy will also be presented in detail. Simulation, as well as sensitivity analysis, is performed to evaluate the effects of maintenance on CCF over the lifetime of components.
The remainder of this paper is organized as follows. Section 6.3 discusses the proposed approach to modeling CCF through integrating component degradation evolution. Section 6.4 presents the experimental design, instrumentation, pump failure analysis, degradation assessment, and CCF estimation results. Section 6.5 presents the conclusions.

6.3. Proposed Approach

This section presents the proposed approach, which consists of two parts as illustrated by the flowchart in Figure 6-2. Section 6.3.1 presents the modeling scope and some key assumptions, followed by a description of Part 1, the overall degradation assessment for CCF modeling. Section 6.3.2 covers treatment of the condition monitoring data, definition of the degradation index and development of the degradation model. Section 6.3.3 discusses Part 2, estimation of the β-factor for CCF probability.

Figure 6-2: Flowchart of the proposed approach

6.3.1. Modeling Scope and Assumptions

Before proceeding to the proposed approach, the modeling scope and some key assumptions are summarized as follows:

• This research advances CCF analysis with a focus on multiple identical or similar components undergoing age-related degradation.
• Multiple components are operated under common conditions and environments.
• Components are equipped with condition monitoring capabilities, where the sensory data can be directly or indirectly correlated to the severity of the underlying degradation process.
• Effective degradation assessment methods are available to infer the degradation state for the components of interest.
• The component degradation evolution is modeled as a continuous process characterized by a physics-based model or a data-driven model.
• No maintenance-based rejuvenation is assumed in the development of the sensor-driven scenario, which follows the state-of-the-art practice of degradation modeling.
• The effect of imperfect maintenance on CCF is accounted for by superimposing the amount of renewal achieved onto the estimation of the β-factor for CCF probability through simulation-based scenarios.

6.3.2. Degradation Assessment

To accurately assess component degradation, three main steps must be taken: (1) determine the most useful condition monitoring techniques that would cost-effectively track component degradation state; (2) obtain useful degradation information that fully characterizes the underlying physical transition of degrading components; (3) develop an appropriate estimation model for the β-factor for CCF probability.

6.3.2.1. Condition Monitoring Techniques

Condition monitoring techniques have been widely used to understand and track component degradation [165]. A variety of techniques can be applied depending on the specific types of component and application of interest [166]. The sensor measurements collected using condition monitoring techniques are known as condition monitoring data, which are analyzed, trended and used to obtain indications of component degradation state. Note that baseline data should usually be collected in the pre-service period; these data provide information on the initial component condition and provide a basis for comparison with the data from subsequent examinations [167]. Three typical condition monitoring techniques involved in this study are as follows:

• Process monitoring is a condition monitoring technique to detect problems by monitoring changes in any combination of the process variables such as pressure, temperature and power consumption. Monitoring the trend over a long period can typically provide indications of improper machine conditions.
• Vibration monitoring is the most common non-destructive technique to measure the level of vibration as acceleration, velocity or displacement. The level of vibration can then be compared to historical baseline values to assess the component's condition.
• AE monitoring was originally developed for non-destructive testing of static structures and has recently received considerable attention for applications to machinery condition monitoring. It offers the advantage of earlier fault detection in comparison to vibration monitoring due to the increased sensitivity of AE [166].

6.3.2.2. Degradation Index Construction

Degradation index construction is influenced by the nature of the data available. In general, the condition monitoring data [168] may be classified into two categories: (a) direct condition monitoring data that can be directly related to the underlying physics-of-failure, such as crack size measurements or the amount of wear or corroded materials in the oil; (b) indirect condition monitoring data from which the degradation state can only be indirectly inferred, such as vibration and oil analysis data. With the growing complexity of engineered components, it is difficult or even impossible to identify the physical signals that directly characterize the underlying degradation process [169]. As such, indirect approaches are more practical. Typically, the raw signals are transformed into more informative features, so as to enhance the data quality and better represent the current component condition. Numerous signal processing methods are available to extract these features, including time-domain analysis, frequency-domain analysis, and time-frequency-domain analysis [170]. The fault-relevant features then need to be identified, and an appropriate transformation process is used to construct the degradation index. This transformation is typically achieved by common algorithms within five categories [171]: classification-based, statistical-hypothesis-testing-based, weighted-based, regression-based and distance-based methods.
More recently, machine learning techniques have attracted attention, including application of deep learning to automatically identify data features for diagnostic and prognostic purposes [172]. In this paper, a distance-based degradation index is defined and used to characterize the degradation evolution observed during the experimental case study.

Figure 6-3 summarizes the basic steps necessary to assess the component degradation. The first step is to process the raw condition monitoring data to extract useful features that appropriately characterize the component condition. Then the anomaly is detected based on the distance between the test data formed by the features describing the current component condition and the training data observed during the normal operation. Specifically, in this approach, the distance is computed using the Mahalanobis distance (MD) methodology [173-175], which is a process of distinguishing multivariable data groups using a univariate distance measure. The magnitude of the MD values signifies the degree of abnormality, which can then be used to construct the degradation index indicating the component degradation state as a function of time.

Figure 6-3: Proposed degradation assessment method

Suppose the component condition can be described by an $m$-dimensional feature vector extracted from the raw condition monitoring data at each time step. The feature vectors forming the training data, collected at the $j$-th time step during the normal operation, are denoted as $X_j = [x_{1j}, x_{2j}, \ldots, x_{mj}]$, where $x_{ij}$ is the $i$-th feature observed at the $j$-th time step, with $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$. Note that the time step $j$ is the time index for the training data collected from the normal operation period. These training data are used to describe the normal operation by calculating the corresponding mean $\bar{x}_i$ and standard deviation $s_i$ of the $i$-th feature. Then normalize each feature of the training data as shown in Equation (6-1):

$$z_{ij} = \frac{x_{ij} - \bar{x}_i}{s_i} \qquad \text{(6-1)}$$

where $z_{ij}$ is the normalized value of $x_{ij}$, $\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}$ and $s_i = \sqrt{\frac{\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2}{n-1}}$.
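The Mahalanobis-distance step described in this subsection can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the implementation used in the study: it handles exactly two features so that the matrix inverse has a closed form, it scales the quadratic form by $1/m$ (a common convention in MD-based anomaly detection that is assumed here), and the feature values and names (`healthy`, `degraded_md`) are hypothetical.

```python
import math


def fit_normal_space(train):
    """Means, sample standard deviations and correlation matrix of healthy data.

    train: list of n feature vectors (each of length m) from normal operation.
    """
    n, m = len(train), len(train[0])
    means = [sum(row[i] for row in train) / n for i in range(m)]
    stds = [math.sqrt(sum((row[i] - means[i]) ** 2 for row in train) / (n - 1))
            for i in range(m)]
    z = [[(row[i] - means[i]) / stds[i] for i in range(m)] for row in train]
    # Correlation matrix of the normalized training vectors.
    corr = [[sum(zj[a] * zj[b] for zj in z) / (n - 1) for b in range(m)]
            for a in range(m)]
    return means, stds, corr


def invert_2x2(c):
    """Closed-form inverse; this sketch is limited to two features."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return [[c[1][1] / det, -c[0][1] / det],
            [-c[1][0] / det, c[0][0] / det]]


def mahalanobis(x, means, stds, corr_inv):
    """Scaled MD of one feature vector (ASSUMED scaling: divide by m)."""
    m = len(x)
    z = [(x[i] - means[i]) / stds[i] for i in range(m)]
    quad = sum(z[a] * corr_inv[a][b] * z[b] for a in range(m) for b in range(m))
    return quad / m


# HYPOTHETICAL healthy data: (vibration RMS, discharge pressure) per time step.
healthy = [(1.0, 10.0), (1.2, 10.5), (0.9, 9.8), (1.1, 10.1), (1.0, 10.4), (0.8, 9.9)]
means, stds, corr = fit_normal_space(healthy)
corr_inv = invert_2x2(corr)

healthy_md = [mahalanobis(x, means, stds, corr_inv) for x in healthy]
degraded_md = mahalanobis((2.0, 12.0), means, stds, corr_inv)  # hypothetical test point
print("healthy MD values:", [round(v, 3) for v in healthy_md])
print("degraded MD value:", round(degraded_md, 3))
```

With this scaling the MD values of the healthy data average to $(n-1)/n$, so values well above one indicate departure from normal operation; smoothing these values over time would give a degradation index.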
The normalized feature vector is denoted by $Z_j = [z_{1j}, z_{2j}, \ldots, z_{mj}]$. The covariance coefficient matrix, $C$, for the normalized vectors of the $n$ training time steps is given in Equation (6-2):

$$C = \frac{1}{n-1}\sum_{j=1}^{n} Z_j^{T} Z_j \qquad \text{(6-2)}$$

Consider the feature vectors forming the test data collected during the abnormal operation, $X_k = [x_{1k}, x_{2k}, \ldots, x_{mk}]$, where $x_{ik}$ is the $i$-th feature observed at the $k$-th time step, $i = 1, 2, \ldots, m$, and $k = 1, 2, \ldots$. Then obtain the normalized feature vector $Z_k$ of the test data at the $k$-th time step by subtracting the mean $\bar{x}_i$ and dividing by the standard deviation $s_i$. The MD value of the test data is calculated using the normalized vector and the covariance coefficient matrix from Equation (6-2):

$$MD_k = \frac{1}{m} Z_k C^{-1} Z_k^{T} \qquad \text{(6-3)}$$

where $MD_k$ is the MD value, $Z_k$ is the normalized feature vector of the test data $X_k$, $Z_k^{T}$ is the transpose of the row vector $Z_k$, and $C^{-1}$ is the inverse of the correlation matrix $C$.

The MD values usually fluctuate since the degradation process is driven by multiple dependent competing failure mechanisms involving gradual degradation and random shocks. Post-processing (e.g., smoothing and filtering techniques) is usually required to obtain a smooth degradation index to track the component degradation. In this study, a distance-based degradation index is proposed in Equation (6-4) to extract the central tendency of the degradation, where $y_k$ is the degradation state at the time step $k$, and $\lambda$ is a tuning parameter to control the extent of smoothing.

∑ ( )   (6-4)

6.3.2.3. State-Space Based Degradation Model

The degradation evolution of the component is modeled as a continuous stochastic process $\{y_k^{(i)}\}$, where $y_k^{(i)}$ is the degradation state of the $i$-th component at the time step $k$. Note that this process can be built according to a physics-based degradation model or some functional form, referred to as the empirical degradation model, based on the degradation index developed in Section 6.3.2.2. In this study, the degradation process is modeled by one of the most common stochastic processes, referred to as the general path model [176].
The parametric function is assumed to be $y_k^{(i)} = \eta(t_k; \Theta^{(i)}) + \varepsilon_k^{(i)}$, where $\Theta^{(i)}$ is a vector of model parameters that is usually treated as a vector of random variables to account for unit-to-unit variability, and $\varepsilon_k^{(i)}$ is an independent and identically distributed (i.i.d.) random error term. Herein, we assume the initial degradation state is zero without loss of generality. Note that this functional form can be linear, polynomial or exponential, and depends on the specific application.

Furthermore, a state-space model is built to describe the dynamics of the degradation process, because of its ability to account for different sources of uncertainties [177-179]. The state-space model assumes that the degradation model parameters are unobserved states that evolve over time as a random walk process, so as to capture the variability across components. The variation within each component itself is reflected by the observation noise. The state-space model is applied to track the nonlinear degradation process of the $i$-th component in terms of the state function in Equation (6-5) and the observation function in Equation (6-6).

State function: $$x_k^{(i)} = x_{k-1}^{(i)} + \omega_k^{(i)}, \quad x_k^{(i)} \sim p\left(x_k^{(i)} \mid x_{k-1}^{(i)}\right) \qquad \text{(6-5)}$$

Observation function: $$y_k^{(i)} = h\left(x_k^{(i)}\right) + \nu_k^{(i)}, \quad y_k^{(i)} \sim p\left(y_k^{(i)} \mid x_k^{(i)}\right) \qquad \text{(6-6)}$$

where $h(\cdot)$ is the empirical degradation model; $x_k^{(i)}$ is the state vector of the $i$-th component that is assumed to be a hidden Markov process; $y_k^{(i)}$ is the observation (i.e., degradation index) of the $i$-th component that is conditionally independent given the hidden process; $\omega_k^{(i)}$ is the i.i.d. process noise vector; $\nu_k^{(i)}$ is the i.i.d. observation noise; $k$ is the time step; $p(x_k^{(i)} \mid x_{k-1}^{(i)})$ is the transition distribution; and $p(y_k^{(i)} \mid x_k^{(i)})$ is the observation distribution.

6.3.3. Estimation of the β-Factor for CCF Probability

In the context of degradation modeling, a component failure is usually defined as the point at which the degradation state exceeds a predetermined level of endurance to degradation. Given that the degradation state estimate is known at each time instant, the occurrence of CCF would be indicated by the concurrent exceedance of the endurance to degradation.
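The idea of reading CCF from concurrent exceedances can be illustrated with a short sketch. It rests on several assumptions that are not taken from the source: the β-factor is computed as the number of samples in which both components exceed the endurance divided by the number of samples in which at least one does, the degradation paths are linear with a shared (common-cause) slope contribution, and all numerical values and names (`beta_factor`, `endurance`, the slope distributions) are hypothetical.

```python
import random


def beta_factor(samples_1, samples_2, endurance):
    """Fraction of failure samples in which both components fail together.

    ASSUMED estimator: (# samples with both above endurance) /
    (# samples with at least one above endurance).
    Returns 0.0 when no sample shows any failure.
    """
    both = 0
    any_failure = 0
    for x1, x2 in zip(samples_1, samples_2):
        i1 = 1 if x1 > endurance else 0
        i2 = 1 if x2 > endurance else 0
        both += i1 * i2
        any_failure += i1 + i2 - i1 * i2  # equals 1 if at least one component fails
    return both / any_failure if any_failure else 0.0


# Deterministic check: two of four failure samples involve both components.
example_1 = [0.2, 1.1, 1.3, 0.9, 1.2]
example_2 = [0.3, 1.2, 0.8, 1.05, 1.4]
print("beta (fixed example):", beta_factor(example_1, example_2, endurance=1.0))

# HYPOTHETICAL time evolution: linear degradation paths whose slopes share a
# common environmental contribution plus component-specific scatter.
random.seed(0)
n_samples = 2000
endurance = 1.0
common = [random.gauss(0.010, 0.002) for _ in range(n_samples)]
slope_1 = [c + random.gauss(0.0, 0.001) for c in common]
slope_2 = [c + random.gauss(0.0, 0.001) for c in common]

beta_over_time = []
for k in (60, 80, 100, 120, 140):
    state_1 = [s * k for s in slope_1]
    state_2 = [s * k for s in slope_2]
    beta_over_time.append((k, beta_factor(state_1, state_2, endurance)))

for k, beta in beta_over_time:
    print(f"time step {k:3d}: beta = {beta:.3f}")
```

In a sensor-driven setting the simulated slopes would be replaced by posterior samples of each component's degradation state at the current time step.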
Therefore, the CCF impacts would be characterized by the fraction of multiple exceedances of the endurance to degradation, which follows the conventional parametric CCF model. As such, the scope of the parametric CCF model would be extended to be dynamic over the service lifetime rather than being static. Without loss of generality, as illustrated in Figure 6-4, suppose the degradation state $x_k^{(1)}$ of component 1 is realized by the N samples denoted by circles and the degradation state $x_k^{(2)}$ of component 2 is realized by the N samples denoted by triangles at the time step k. The endurance to degradation is treated as the same for both components, as is the convention of CCF. Note that it is straightforward to generalize to a different endurance to degradation for each component to reflect different operational requirements.

Figure 6-4: Characterization of CCF with the components' degradation states and endurance to degradation

All the samples associated with each component at each time step would be gathered to describe the degradation state of the two-component system. In this study, the β-factor model is adopted for demonstration. The β-factor at each time instant k is estimated as the fraction of dependent failures involving more than a single component as represented in Equation (6-7), where the denominator denotes the number of all failures and the numerator denotes the number of dependent failures:

$$\hat{\beta}_k = \frac{\sum_{n=1}^{N} I\left(x_k^{(1,n)}\right) I\left(x_k^{(2,n)}\right)}{\sum_{n=1}^{N} I\left(x_k^{(1,n)}\right) I\left(x_k^{(2,n)}\right) + \sum_{n=1}^{N} \left\{\left[1 - I\left(x_k^{(1,n)}\right)\right] I\left(x_k^{(2,n)}\right)\right\} + \sum_{n=1}^{N} \left\{\left[1 - I\left(x_k^{(2,n)}\right)\right] I\left(x_k^{(1,n)}\right)\right\}} \qquad \text{(6-7)}$$

where $\hat{\beta}_k$ is the estimate of the degradation CCF β-factor parameter at the time step k, $D$ denotes the endurance to degradation, N is the total number of samples, $x_k^{(1,n)}$ is the $n$-th realization of the degradation state of component 1 at the time step k, $x_k^{(2,n)}$ is the $n$-th realization of the degradation state of component 2 at the time step k, and $I(\cdot)$ is the state indicator function, which equals 1 for component failure when the realization is greater than $D$, and otherwise equals 0, indicating component survival.

6.3.3.1.
Sensor-Driven Degradation Monitoring

Sensor-driven monitoring enables the β-factor for CCF probability to be estimated by combining the general degradation property with the sensor monitoring data of plant-specific components. To do this, we monitor individual components using real-time sensor monitoring data to update the component degradation states, and in turn, update the CCF estimation. Specifically, the state-space model in Section 6.3.2.3 is further utilized such that once the sensor monitoring data are collected from an operating component, the hidden states can be inferred to calibrate the estimate of CCF in real time.

The recursive Bayesian updating method provides a rigorous and general way to estimate the posterior probability density function (pdf) of the degradation state of the $i$-th component, $p(x_k^{(i)} \mid y_{1:k}^{(i)})$, given the observations. Through recursive Bayesian filtering, prediction and update are implemented recursively in two steps.

(1) Prediction step: obtain the prior pdf $p(x_k^{(i)} \mid y_{1:k-1}^{(i)})$, which means that the state $x_k^{(i)}$ is inferred from the observations $y_{1:k-1}^{(i)}$.

$$p\left(x_k^{(i)} \mid y_{1:k-1}^{(i)}\right) = \int p\left(x_k^{(i)} \mid x_{k-1}^{(i)}\right) p\left(x_{k-1}^{(i)} \mid y_{1:k-1}^{(i)}\right) dx_{k-1}^{(i)} \qquad \text{(6-8)}$$

(2) Update step: obtain the posterior pdf $p(x_k^{(i)} \mid y_{1:k}^{(i)})$ in terms of the current observation.

$$p\left(x_k^{(i)} \mid y_{1:k}^{(i)}\right) = \frac{p\left(y_k^{(i)} \mid x_k^{(i)}\right) p\left(x_k^{(i)} \mid y_{1:k-1}^{(i)}\right)}{p\left(y_k^{(i)} \mid y_{1:k-1}^{(i)}\right)} = \frac{p\left(y_k^{(i)} \mid x_k^{(i)}\right) p\left(x_k^{(i)} \mid y_{1:k-1}^{(i)}\right)}{\int p\left(y_k^{(i)} \mid x_k^{(i)}\right) p\left(x_k^{(i)} \mid y_{1:k-1}^{(i)}\right) dx_k^{(i)}} \qquad \text{(6-9)}$$

It is usually difficult to obtain Equation (6-9) in closed form, so we must resort to Monte Carlo methods. In this study, the particle filtering approach is used to achieve such a recursive state estimate and update because of its capability of handling non-linear and non-Gaussian systems. The key idea of the particle filter [180] is to approximate the posterior pdf $p(x_k^{(i)} \mid y_{1:k}^{(i)})$ at the $k$-th step by N random samples or particles $\{x_k^{(i,n)}\}_{n=1}^{N}$ with associated weights $\{w_k^{(i,n)}\}_{n=1}^{N}$. Here N is the total number of particles:

$$p\left(x_k^{(i)} \mid y_{1:k}^{(i)}\right) \approx \sum_{n=1}^{N} w_k^{(i,n)} \, \delta\left(x_k^{(i)} - x_k^{(i,n)}\right) \qquad \text{(6-10)}$$

where $\delta(\cdot)$ is Dirac's delta function and $w_k^{(i,n)}$ is the weight of the $n$-th particle of the $i$-th component at time k. The weights are normalized as $\sum_{n=1}^{N} w_k^{(i,n)} = \mathbf{1}$, where $\mathbf{1}$ is the column vector of ones. The sample $x_k^{(i,n)}$ is drawn from the importance density $q(x_k^{(i)} \mid x_{k-1}^{(i,n)}, y_k^{(i)})$.
Through the recursive relation, the weights are updated as follows:

$$w_k^{(i,n)} \propto w_{k-1}^{(i,n)} \, \frac{p\left(y_k^{(i)} \mid x_k^{(i,n)}\right) p\left(x_k^{(i,n)} \mid x_{k-1}^{(i,n)}\right)}{q\left(x_k^{(i,n)} \mid x_{k-1}^{(i,n)}, y_k^{(i)}\right)} \qquad \text{(6-11)}$$

After multiple iterations, the variance of the weights increases such that only some particles have a significant weight, and all the other particles have negligible weights. This is known as the degeneracy problem, which is usually addressed by resampling, which eliminates particles that have small weights and concentrates on particles with large weights [181]. At each time step, the samples obtained from the resampling process could be viewed as the realizations of the degradation state for each component, and hence can be used to estimate CCF as shown in Equation (6-7).

6.3.3.2. Consideration of Maintenance Impacts on Degradation-Based Common Cause Failure Probability

This section aims to develop a simulation-based approach to superimpose the imperfect maintenance on the degradation process identified in Section 6.3.2. The component degradation history can then be simulated given the identified component degradation process and the specific maintenance policy. With a few iterations of simulations, one can generate samples of component degradation states at each time step, which are ultimately used to estimate the β-factor as shown in Equation (6-7). This allows the effects of various maintenance policies on the CCF over the component lifetime to be evaluated.

A generic condition-based maintenance policy is established to illustrate the approach. The maintenance policy is subject to the following assumptions (the authors also recognize the possibility of optimizing the decision variable [182, 183], which is out of the scope of this paper):

• The component is subject to periodic inspection and the component failure can only be detected at the time of inspection.
• The inspection itself is perfect in that it reveals the true degradation state of the component and does not change the condition of the component.
• Inspection and maintenance actions take negligible time compared to the expected lifetime of the maintained component.
• Preventive and corrective replacement is perfect, while preventive maintenance (PM) could be imperfect.
• Over its service life, each component randomly follows one of several classical degradation processes or failure mechanisms, according to knowledge of the component's historical performance.

Suppose the inter-inspection interval length is $\tau$, so the degradation state of a component after its installation at time 0 will be inspected and measured at times $\{\tau, 2\tau, 3\tau, \ldots\}$. According to the degradation state $x$ at the inspection, one of the following maintenance actions would be needed, and the degradation state of the component after maintenance would be $x^{+}$:
• If $x < D_{PM}$, no maintenance action is performed, and $x^{+} = x$, where $D_{PM}$ is the preventive repair threshold.
• If $D_{PM} \le x < D_{PR}$, an imperfect preventive maintenance of the component is immediately performed. The impact of imperfect maintenance is considered by adjusting the degradation state of a maintained component by a random amount to some level lower than or equal to the preventive repair threshold $D_{PM}$. As such, $x^{+}$ would be $(1-\theta)\,D_{PM}$, where $\theta$ is a rejuvenation factor defined within the interval [0, 1]. The factor θ indicates the degree of repair and follows the beta distribution parameterized by two positive shape parameters, denoted by α and γ. Note that $\theta = 1$ means a perfect repair and $\theta = 0$ means a minimal repair, and $D_{PR}$ is the preventive replacement threshold.
• If $D_{PR} \le x < D_{F}$, the component is preventively replaced. In doing so, the component is considered as good as new, which means $x^{+}$ is equal to zero, and $D_{F}$ is the threshold (i.e., endurance to degradation) that triggers corrective replacement.
• If $x \ge D_{F}$, the component has failed and is correctively replaced. The component is considered as good as new, indicating that $x^{+}$ equals zero.

6.4.
Experimental Study

There are six steps involved in the experimental case study to demonstrate the proposed approach: 1) design a special-purpose experiment with advanced sensing capabilities for a redundant pump system; 2) conduct failure analysis to identify the failure mechanisms and root causes; 3) construct a degradation index using information from diverse sensor data; 4) develop a degradation model that quantitatively characterizes the degradation evolution; 5) estimate the β-factor for CCF probability using the observed sensor monitoring data; 6) estimate the β-factor for CCF probability by simulating, superimposing and accounting for the inspection frequency and the rejuvenation effects of preventive maintenance.

6.4.1. Experimental Design and Instrumentation

As an active component susceptible to CCF [184], the centrifugal pump was chosen for this case study. The general-purpose horizontal centrifugal pump tested was a mechanically sealed pump driven by a 12-Vdc Totally Enclosed Fan-Cooled (TEFC) motor. The centrifugal pumps were tested from brand-new condition to full failure inside a temperature chamber, where they were exposed to recirculated seawater at elevated temperature. The test rig, depicted in Figure 6-5, consisted of two loops: (1) a heating loop to heat the testing fluid, which contained a circulation pump, titanium inline immersion heater, and titanium shell and tube heat exchanger; and (2) a testing loop to expose the testing pump to the environmental stresses. The fluid temperature was maintained at around 75°C, and the chamber temperature was kept at around 70°C. Thus, the pumps operated in a harsh condition that is close to the upper limit of the fluid temperature (i.e., 95°C).
Accordingly, the common-cause dependencies among the pumps were rooted in the same component configuration, the same operating practice, and common inter-environmental conditions (i.e., elevated temperature) and intra-environmental conditions (i.e., elevated temperature and corrosive seawater). Unlike most current research based on the data acquired from artificially-seeded damage experiments, no artificial damage was seeded in this experimental setting, to more closely represent the real field situation. Since the useful life of a centrifugal pump can range from several months to a few years, the experiment was planned to stop when the pump fully or partially ceased to perform.

The operating conditions of the pumps tested were continually monitored by the sensing system, which comprises three types of condition monitoring techniques:

1. Process monitoring was implemented through measurements of the pump's differential pressure, flow rate, electric current and electric voltage. All the measurements were performed at the sampling rate of 0.5 Hz using a Keysight Technologies 34972A LXI data acquisition unit and an in-house developed LabVIEW-based tool.

2. Vibration monitoring was implemented using three single-axis AC240 accelerometers from Connection Technology Center (CTC) Inc., the National Instruments (NI)-9230 analog input module, NI-cDAQ-9174 CompactDAQ chassis, and an in-house developed LabVIEW-based tool. To ensure accurate measurement, the sampling rate was set at 10240 Hz, which is approximately 40 times the maximum vane passing frequency. The recordings of every 60 seconds of data were stored in a separate file.

3. AE monitoring was implemented using three Micro30 miniature AE sensors manufactured by Physical Acoustics Corporation (PAC). The sensors were placed in three locations on the pump: suction, bearing, and motor. The sampling rate was set at 1 MHz.
The output signal was pre-amplified at 40 dB and was collected by a commercial AE data acquisition system by PAC. The AE signals from all three channels were recorded simultaneously by extracting selected features such as absolute energy, root mean square (RMS), and counts.

Figure 6-5: Test rig and instrumentation

6.4.2. Pump Failure Analysis

The interactions between the surrounding environment and the operating pump could lead to degradation in the form of changes in physical properties and dynamic behaviors, including part damage and reduction of performance. Failure analysis was conducted after the experiment to identify the root causes and the actual failed parts of the pumps. Table 6-1 shows the experiment duration, failure mode, failure mechanism and root causes for each pump.

Table 6-1: Failure analysis for each of the three pumps

Index | Duration | Failure Mode | Failure Mechanism | Observation and Root Cause
Pump 1 | 1954 hours | Seal fracture | Fatigue | Excessive fluid pressure on seal caused seal fracture. The broken pieces then led to rubbing between pump's housing and impeller, which caused the impeller to stick, and the pump functionally stopped.
Pump 2 | 5103 hours | Red-brown corrosive fluid | Fretting corrosion | Fretting corrosion occurred on the contact surface between the mechanical seal and the rotating shaft. This caused extensive corrosion indicated by the red-brown corrosive fluid.
Pump 3 | 4654 hours | Pump leaking | Pitting corrosion | Pitting corrosion occurred on the contact surface between the mechanical seal and the rotating shaft. This caused serious leakage located in the coupling section.

6.4.3. Pump Degradation Assessment

This section describes the construction of the degradation index based on the three types of condition monitoring data. The main challenge was to extract useful information from raw sensor signals and to establish a feature vector representing the pump condition.
The degradation assessment method proposed in Section 6.3.2.2 was used to construct a degradation index using the established feature vector with the tuning parameter γ=100. Data collected in the first ten days of the testing were used to establish the health baseline that characterizes the pump's normal operation.

6.4.3.1. Process Monitoring Data

The pump degradation state was monitored using the pump efficiency data derived from the four measured operational characteristics: electric current, electric voltage, differential pressure and flow rate. The objective was to track the statistical features extracted from the pump efficiency data that indicate the pump performance fluctuations as it degraded. As shown in Equation (6-12), the pump efficiency is determined as the ratio of the hydraulic power to the electric power consumed by the pump [185]:

$$\eta = \frac{\rho\, g\, Q\, H}{V\, I} \tag{6-12}$$

where $V$ is the electric voltage, $I$ is the electric current, $\rho$ is the density of the pumped liquid, $g$ is the gravity constant, $Q$ is the measured flow rate, and $H$ is the pump head converted from the measured differential pressure [185].

The four sensor measurements were first smoothed with a moving average filter and then used to determine the pump efficiency according to Equation (6-12). The pump efficiency data were segmented every hour and various statistical features were extracted from each segment. Specifically, seven statistical features were extracted from the pump efficiency data: mean value, peak-to-peak value, root mean square, standard deviation, crest factor, shape factor and mean square frequency. These features constituted a seven-dimensional feature vector describing the pump operating condition. The resulting degradation index of the three pumps is illustrated in Figure 6-6, where the x-axis is the testing time of pumps and the y-axis is the degradation index. The same pattern in the degradation index was observed for all pumps, indicating similar degradation paths.
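The feature-extraction steps above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the function names, the seawater density of 1025 kg/m^3, the SI units, and the specific formulas for crest factor (peak over RMS), shape factor (RMS over mean absolute value) and mean square frequency (second spectral moment of the power spectrum) are common textbook choices assumed here, since the source does not spell them out.

```python
import numpy as np

def pump_efficiency(voltage, current, flow_rate, head, rho=1025.0, g=9.81):
    """Equation (6-12): hydraulic power / electric power (SI units assumed)."""
    v, i, q, h = (np.asarray(s, dtype=float)
                  for s in (voltage, current, flow_rate, head))
    return rho * g * q * h / (v * i)

def moving_average(x, window):
    """Moving-average smoother; the window length is a free choice here."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

def segment_features(eta, fs=0.5):
    """Seven statistical features of one efficiency segment (assumed definitions)."""
    eta = np.asarray(eta, dtype=float)
    mean = eta.mean()
    peak_to_peak = eta.max() - eta.min()
    rms = np.sqrt(np.mean(eta ** 2))
    std = eta.std(ddof=1)
    crest = np.max(np.abs(eta)) / rms
    shape = rms / np.mean(np.abs(eta))
    power = np.abs(np.fft.rfft(eta)) ** 2
    freqs = np.fft.rfftfreq(eta.size, d=1.0 / fs)
    mean_square_freq = np.sum(freqs ** 2 * power) / np.sum(power)
    return np.array([mean, peak_to_peak, rms, std, crest, shape, mean_square_freq])

def hourly_feature_matrix(eta, fs=0.5, segment_seconds=3600):
    """Split the efficiency series into hourly segments, one feature row each."""
    n = int(fs * segment_seconds)
    n_seg = len(eta) // n
    return np.array([segment_features(eta[s * n:(s + 1) * n], fs)
                     for s in range(n_seg)])

# Synthetic three-hour record sampled at 0.5 Hz (illustrative values only).
rng = np.random.default_rng(0)
n = 3 * 1800
eta = pump_efficiency(12.0 + 0.05 * rng.standard_normal(n),
                      2.0 + 0.02 * rng.standard_normal(n),
                      1.0e-4 * np.ones(n), 1.0 * np.ones(n))
features = hourly_feature_matrix(eta)   # shape (3, 7)
```

Each row of `features` would then be passed to the degradation-index construction of Section 6.3.2.2, which is not reproduced in this sketch.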
Figure 6-6: Degradation index constructed based on process monitoring data

6.4.3.2. Vibration Monitoring Data

The operating pumps produce vibration signals with distinctive characteristics recognized by specific vibration spectrum patterns [186]. By inspecting the frequency spectrum of the related vibration signals, one can identify the characteristic frequencies and track the changes that uniquely indicate pump degradation status.

In this study, the vibration signals were segmented every minute, and then each segment was transformed into a frequency spectrum using the Fast Fourier transform (FFT). The first step was to identify the characteristic frequencies by searching for the frequencies with the top one-hundred magnitudes in the frequency spectrum. The measurements collected from all three directions were analyzed to explore the allocation of energy within the frequency spectra. The same pattern was discovered in each pump: most energy was distributed in the five principal frequency bands from 20 Hz to 300 Hz. Nevertheless, variations existed across different pumps because of the speed variations and the different failure mechanisms involved.

The next step was to track the pump condition by measuring the energy of characteristic frequencies, which is expressed by the RMS of the spectrum magnitude in terms of the five principal frequency bands. Given one single-axis vibration signal at each time instant, the pump condition could be represented by a five-dimensional feature vector consisting of the RMS of each characteristic frequency. In this study, the vibration data from the three directions were used to establish a fifteen-dimensional feature vector to describe the pump condition at each time step. Figure 6-7 demonstrates the degradation index for the three pumps. It is important to note the similar pattern observed among the three pumps' degradation paths.

Figure 6-7: Degradation index constructed based on vibration monitoring data

6.4.3.3.
AE Monitoring Data

The sources of AE in rotating machinery include impacting, friction, turbulence, cavitation and leakage [187]. Depending on the underlying failure mechanism, the degradation of a rotating machine can be captured by the changes in the AE signal features (e.g., amplitude, counts, energy), among which the energy-related features are useful indicators of damage in rotating machinery [188]. The energy-related features include RMS, energy, absolute energy and average signal level. In this study, it was observed that all four types of energy-related features above were highly correlated, and hence only the RMS feature was selected for further analysis. The RMS features were computed for the AE signals collected in each of the three different locations. Thereafter, a three-dimensional feature vector was established to describe the pump condition at each time step. Figure 6-8 shows the degradation index of the three pumps, where the x-axis is the testing time of pumps (in hours), and the y-axis is the magnitude of the degradation index. The results also show a similar pattern across different pumps.

Figure 6-8: Degradation index constructed based on AE monitoring data

6.4.4. Pump Degradation Model Development

Given the three types of degradation indices constructed in Section 6.4.3, this section first discusses the most appropriate degradation index that can be used to characterize the degradation behaviors of the three pumps, followed by a description of a state-space based degradation model.

6.4.4.1. Selection of Degradation Index

The degradation profiles of the three pumps are summarized in Figure 6-9 for each type of condition monitoring technique used. For comparison purposes, a reference level is provided at the minimum value of the degradation index at the end of each experiment. It clearly shows that the degradation profiles of the pumps tend to be highly correlated in all three types of condition monitoring technique.
On the other hand, regardless of the sensitivity or sampling frequency differences among the monitoring techniques, the same functional relationship could be applied to characterize the pump degradation behaviors associated with different failure mechanisms. Nevertheless, the ability to track the pump degradation behavior varied depending on the sensitivity of the condition monitoring technique to the underlying failure mechanism. Some insights are summarized as follows:
• For all three monitoring techniques used in this study, the levels of degradation index tended to stabilize at the end of the test, which provides a reference level to properly define the endurance to degradation. Clearly, variations exist due to the stochastic nature of the degradation process.
• Given the same failure mechanism, the sensitivity of the condition monitoring technique was different. Hence, the same family of monitoring technique should be used to monitor all of the pumps.
• Different failure mechanisms may have distinct influences on the degradation evolution, which can be indicated by the differences between degradation rate and/or degradation state. For instance, the failure mechanism underlying Pump 3 (pitting corrosion) results in a higher degradation state than those involved in Pump 1 (fatigue) and Pump 2 (fretting corrosion).
• The developed degradation index is demonstrated to be applicable to a pump involved in any one of the three failure mechanisms.

Figure 6-9: Degradation index regarding three types of condition monitoring data

Of the three types of degradation index, the one based on the process monitoring data was selected as the most appropriate. This choice was based on the following criteria [189, 190]: (a) the variance in the failure limit of the developed degradation index should be minimal; (b) the larger slope of the data provides a clearer trend; and (c) the range of information should be as large as possible.
The authors also recognize that a fusion approach has the potential to improve the characterization of the degradation evolution by making use of the information from different monitoring techniques, but this was considered out of the scope of this paper. Finally, the endurance to degradation should be selected as some value lower than the degradation state at the end of each test, where the pumps failed. Specifically, the endurance to degradation was empirically selected as 6.0.

6.4.4.2. Degradation Model Development

Once the degradation index was developed, a mathematical degradation model was needed to describe the pump degradation evolution. Examination of the degradation index showed that the degradation path follows the power function in Equation (6-13):

$$z_k = a\,k^{b} + c + \varepsilon_k \tag{6-13}$$

where $a$, $b$ and $c$ are the model parameters, $k$ is the time step, $z_k$ is the observation of the degradation index at time step $k$, and $\varepsilon_k$ is the additive Gaussian noise with zero mean and standard deviation $\sigma_{\varepsilon}$. To demonstrate the feasibility of Equation (6-13) for describing the degradation evolution, a nonlinear least squares regression was conducted for the selected degradation index. Goodness-of-fit statistics are employed to measure the fitting performance of Equation (6-13). The R-squared values ($R^2$), the adjusted $R^2$, the root mean squared error (RMSE) and model parameters are summarized in Table 6-2. Based on these results, one can conclude that Equation (6-13) represents a good fit for describing the degradation evolution.

Table 6-2: Results for regression and goodness-of-fit statistics

Component | R² | Adjusted R² | RMSE | a | b | c
Pump 1 | 0.9763 | 0.9762 | 0.2386 | 5.206 | 0.1367 | -7.711
Pump 2 | 0.9915 | 0.9915 | 0.1245 | 1.985 | 0.1908 | -3.355
Pump 3 | 0.9644 | 0.9643 | 0.3054 | 7.47 | 0.1063 | -10.94

(R², adjusted R² and RMSE are the goodness-of-fit statistics; a, b and c are the model parameters.)

The pump state-space model used in this paper is constructed as follows.
The power function is used as the observation function, and the model parameters are incorporated as the elements of the state vector with $x_k^i = [a_k^i,\ b_k^i,\ c_k^i]^{\top}$.

State function:

$$a_k^i = a_{k-1}^i + \omega_{a,k}^i \tag{6-14}$$

$$b_k^i = b_{k-1}^i + \omega_{b,k}^i \tag{6-15}$$

$$c_k^i = c_{k-1}^i + \omega_{c,k}^i \tag{6-16}$$

Observation function:

$$z_k^i = a_k^i\, k^{\,b_k^i} + c_k^i + \varepsilon_k^i \tag{6-17}$$

where $k$ is the time step, $z_k^i$ is the observation of the degradation index of the $i$-th pump at time step $k$, $a_k^i$, $b_k^i$ and $c_k^i$ are the model parameters of the $i$-th pump at time step $k$, and $\omega_{a,k}^i$, $\omega_{b,k}^i$, $\omega_{c,k}^i$ and $\varepsilon_k^i$ are the additive Gaussian noises with zero means and different standard deviations $\sigma_a$, $\sigma_b$, $\sigma_c$ and $\sigma_{\varepsilon}$, respectively.

6.4.5. Experimental Results for CCF Estimation

6.4.5.1. Results for Sensor-Driven Scenario

As shown in Figure 6-10, the entire testing profile is categorized into three phases (Phase 1, Phase 2 and Phase 3) based on the system configuration changes. Phase 1 involved a three-pump redundant system from the beginning to 1714 hours of operation. When Pump 1 failed, the test proceeded to Phase 2 involving a two-pump redundant system until 4414 hours of operation. After Pump 3 failed, Phase 3 ran with only Pump 2 until 4863 hours of operation. Note that only Phase 1 and Phase 2 are of interest for CCF events.

Figure 6-10: Testing profile with three phases

At a time instant of 1500 hours of operation, the degradation state of each pump is estimated and characterized by six thousand samples, as illustrated by the histograms in Figure 6-11 (a), which indicate the number of occurrences of the possible degradation states for each of the three pumps. Then the CCF is estimated based on the fractions of concurrent exceedance of endurance to degradation as discussed in Section 6.3.3. Therefore, given newly arrived sensor monitoring data at each time instant, the degradation state of each pump is estimated and is then used to update the CCF estimation. Figure 6-11 (b) displays the estimate of CCF over the entire test, which shows the dynamic features of CCF assuming no maintenance-based rejuvenation.
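The sensor-driven update described above can be sketched with a bootstrap particle filter, in which the importance density is taken to be the state transition so that the weight update reduces to multiplication by the observation likelihood. This is a simplified stand-in under stated assumptions rather than the procedure used in the study: the function names, the number of particles, the initial spread, the noise standard deviations and the resampling rule (systematic resampling when the effective sample size halves) are all hypothetical choices. The exceedance helper only returns individual and concurrent exceedance fractions; how these map to the β-factor is defined by Equation (6-7), which is not reproduced here.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Return particle indices drawn by systematic resampling."""
    n = weights.size
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against round-off
    return np.searchsorted(cumulative, positions)

def particle_filter(z, n_particles=2000, init_mean=(5.0, 0.14, -7.5),
                    init_std=(1.0, 0.02, 1.0), proc_std=(0.01, 1e-4, 0.01),
                    obs_std=0.25, seed=0):
    """Track z_k = a*k**b + c + noise with random-walk states (a, b, c).

    Returns an array of shape (len(z), n_particles) holding equally weighted
    samples of the noise-free degradation state at every time step.
    """
    rng = np.random.default_rng(seed)
    particles = rng.normal(init_mean, init_std, size=(n_particles, 3))
    weights = np.full(n_particles, 1.0 / n_particles)
    samples = np.empty((len(z), n_particles))
    for k, z_k in enumerate(z, start=1):
        # Prediction: random-walk state equations.
        particles = particles + rng.normal(0.0, proc_std, size=particles.shape)
        a, b, c = particles.T
        predicted = a * float(k) ** b + c
        # Update: Gaussian observation likelihood (bootstrap filter).
        log_lik = -0.5 * ((z_k - predicted) / obs_std) ** 2
        weights = weights * np.exp(log_lik - log_lik.max())
        weights /= weights.sum()
        # Resample when the effective sample size falls below N/2.
        if 1.0 / np.sum(weights ** 2) < n_particles / 2:
            idx = systematic_resample(weights, rng)
            particles = particles[idx]
            predicted = predicted[idx]
            weights = np.full(n_particles, 1.0 / n_particles)
        # Store equally weighted realizations of the degradation state.
        samples[k - 1] = predicted[systematic_resample(weights, rng)]
    return samples

def exceedance_fractions(samples_per_pump, endurance=6.0):
    """Individual and concurrent fractions of samples above the endurance."""
    exceeded = np.stack([s >= endurance for s in samples_per_pump])
    return exceeded.mean(axis=1), exceeded.all(axis=0).mean()
```

In use, one filter would be run per pump on its degradation index, and the sample sets at a given time step would be passed to `exceedance_fractions` together with the endurance to degradation (6.0 in this study).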
Figure 6-11: (a) Illustration of the degradation states of all three pumps at t= 1500 hours; (b) estimate of β-factor over the entire test

The CCF estimates for Phase 1 and Phase 2 are provided in Figure 6-12. The differences between Phase 1 and Phase 2 are attributed to the different failure mechanisms underlying each pump and system configuration changes. Some important observations are summarized as follows:
• Over the entire test, the β-factor starts from zero and approaches one at the end. It is intuitive that the redundant pump system would fail eventually without any maintenance actions.
• In Phase 1, Pump 1 degrades much faster than the others, as is evident from its shortest experiment duration in Section 6.4.2. As such, Pump 1 is more likely to fail, while the other two pumps are not. It appears that independent failure is dominant in Phase 1, which results in a low β-factor.
• In Phase 2, the β-factor approaches one because of the pump degradation without mitigating actions.
• From the perspective of CCF control, knowing the pump degradation state allows one to determine the time required to implement mitigating actions based on some critical level of CCF [191]. Suppose the β-factor should remain below 0.05; maintenance actions would then be needed before 2870 hours of operation.

Figure 6-12: Estimate of β-factor for Phase 1 and Phase 2

6.4.5.2. Results for Simulation-based Scenario

With the knowledge of the degradation processes involved in the three testing pumps, a condition-based maintenance policy was selected as illustrated in Figure 6-13 (a). It is assumed that the preventive repair threshold is , the preventive replacement threshold is , and the endurance to degradation is . During service, the pump is subject to any of the three failure mechanisms identified in Section 6.4.2.
The degradation behavior would be random throughout the service based on the different parameter sets for the three failure mechanisms as provided in Table 6-2. The following results are generated based on the simulation of one year of pump service.

Figure 6-13: (a) Condition-based maintenance policy; (b) imperfect maintenance characterized by the beta distribution with α=5 and γ=1.5

As an illustrative example, suppose the inspection interval is hours, and the rejuvenation or renewal factor follows the beta distribution with α=5 and γ=1.5, indicating good maintenance practices as displayed in Figure 6-13 (b). The hourly evolution of the β-factor is provided in Figure 6-14 (a) and the overall variation of the β-factor is summarized by the distribution of the β-factor as shown in Figure 6-14 (b). The mean estimate of the β-factor is 0.025 and the component failure rate is failures/hr. The 5% quantile, median and 95% quantile estimates of the β-factor are 0, 0.008 and 0.084, respectively. Some important observations are discussed as follows:
• The dynamic characteristics of CCF are captured by the evolution of the β-factor, which shows a periodically increasing trend. This indicates that the β-factor would be underestimated as the components degrade and maintenance actions vary, which results in the underestimation of plant risks.
• As expected, most β-factors are close to zero, and the distribution of the β-factor is positively skewed. The variation of the β-factor is large and is attributed to the underlying component degradation and the relevant maintenance actions.
• Examination of the quantile estimates indicates that simply treating the CCF impacts based on the mean estimate of the β-factor is not sufficient and would lead to underestimation of the β-factor.

Figure 6-14: (a) Hourly evolution of β-factor; (b) distribution of β-factor in relation to component degradation and maintenance actions.
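The condition-based maintenance rule of Section 6.3.3.2 lends itself to a short simulation sketch. Everything numerical below is hypothetical apart from the endurance of 6.0 and the beta parameters (5, 1.5) quoted in the text: the two preventive thresholds, the inspection interval of 240 hours used in the usage line, and in particular the gamma-distributed hourly increments, which are a stand-in for the power-function degradation paths and are not the model used in the study. The post-maintenance state after imperfect PM is taken as (1 - θ) times the preventive repair threshold, which is one reading of the policy description.

```python
import numpy as np

def maintain(x, d_pm, d_pr, d_f, theta):
    """Return (state after inspection, action) for an inspected state x."""
    if x < d_pm:
        return x, "none"
    if x < d_pr:
        # Imperfect PM: theta = 1 is a perfect repair, theta = 0 a minimal one.
        return (1.0 - theta) * d_pm, "imperfect PM"
    if x < d_f:
        return 0.0, "preventive replacement"
    return 0.0, "corrective replacement"

def simulate(hours, tau, d_pm, d_pr, d_f, alpha, gamma,
             mean_increment=0.003, n_comp=3, seed=0):
    """Hourly failed/not-failed flags for n_comp components under the policy."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_comp)
    failed = np.zeros((hours, n_comp), dtype=bool)
    for t in range(1, hours + 1):
        # Hypothetical monotone degradation increments (gamma-process style).
        x += rng.gamma(shape=0.5, scale=mean_increment / 0.5, size=n_comp)
        # A failure stays hidden until the next inspection reveals it.
        failed[t - 1] = x >= d_f
        if t % tau == 0:
            for i in range(n_comp):
                theta = rng.beta(alpha, gamma)
                x[i], _ = maintain(x[i], d_pm, d_pr, d_f, theta)
    return failed

# One simulated year with assumed thresholds 2.0 and 4.0 and endurance 6.0.
failed = simulate(hours=8760, tau=240, d_pm=2.0, d_pr=4.0, d_f=6.0,
                  alpha=5.0, gamma=1.5)
fraction_any_failed = failed.any(axis=1).mean()
fraction_all_failed = failed.all(axis=1).mean()
```

Repeating the run over many seeds and over a grid of inspection intervals and beta parameters gives the raw material for a sensitivity study of the kind reported in this section; the conversion of failure fractions into a β-factor follows Equation (6-7) and is left out of the sketch.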
Different maintenance policies lead to different patterns of the β-factor through component service. Therefore, a sensitivity study was conducted to investigate the CCF changes under different maintenance effectiveness in terms of two decision parameters: the inspection interval and the rejuvenation factor θ. There are twenty-seven maintenance interval and effectiveness characteristics defined by the combinations of (1) nine options for the inspection interval in units of hours: {240, 360, 480, 600, 720, 840, 960, 1080, 1200}, and (2) three options for the rejuvenation factor with the parameter sets (α, γ): {(5, 1.5), (5, 2.5), (5, 3.5)}, which represent a progressively decreasing degree of repair as displayed in Figure 6-15 (a).

Figure 6-15: (a) Three options for rejuvenation factor considered in sensitivity analysis; (b) mean β-factor changes according to twenty-seven imperfect maintenance policies

The results of the sensitivity analysis are summarized in Figure 6-15 (b), which provides the mean estimates of the β-factor for each imperfect maintenance characteristic. The results are used to examine the overall impact of maintenance policy on the CCF through service and to identify the appropriate maintenance policy from the perspective of CCF control. They show that the component degradation and maintenance practices could significantly affect the β-factor for CCF probability. The insights are discussed as follows:
• Examination of the nine options for the inspection interval shows that, as expected, with longer inspection intervals, the β-factor monotonically increases. Assuming the same degree of effectiveness, performing inspections more frequently is more likely to prevent potential failures and thus results in fewer concurrent failures, leading to a smaller β-factor.
• Poor maintenance is associated with low rejuvenation and higher β-factor for CCF probability.
• It is demonstrated that there would be a significant increase in the β-factor with a decrease in the degree of repair quality (i.e., lower rejuvenation). This means that the β-factor would be significantly underestimated when assuming the maintenance practices are perfect, when in practice maintenance is only partially effective.
• Overall, it is intuitive that frequent high quality maintenance reduces pump degradation, leading to a smaller β-factor.

Although perfect maintenance would considerably reduce pump degradation and lead to a small β-factor, the patterns of the β-factor would vary depending on the effectiveness of the maintenance repair. Another sensitivity study was conducted to investigate the effects on the β-factor assuming perfect maintenance, but for different inspection intervals, in units of hours: {720, 1080, 1440, 1800, 2160, 2520, 2880, 3240, 3600}. The results are summarized in Figure 6-16, which shows that the mean estimate of the β-factor increases as the inspection intervals become longer. A comparison of the inspection intervals 720 and 1080 hours for perfect and imperfect maintenance regimes from Figure 6-16 and Figure 6-15 (b), respectively, further demonstrates that perfect maintenance would significantly reduce the β-factor. On the other hand, this confirms that the β-factor would be significantly underestimated under the assumption of perfect maintenance, when in practice maintenance is always only partially effective. As the β-factor monotonically increases with longer inspection intervals, it is possible to underestimate the β-factor as components degrade even under perfect maintenance practices.

Figure 6-16: Mean β-factor changes according to nine perfect maintenance policies

6.4.5.3. Application to Estimate the β-Factor for CCF Probability of Specific Components

The results of this research are envisioned to be applicable to estimating the β-factor for CCF probability of degrading components during their useful life. Specifically, one needs to develop a CCF adjustment curve that characterizes the relationship between component failure rate and the β-factor estimates from the experimental observations of the component. Next, an adjustment in the level of component failure rate would be needed to accommodate the differences between the experimental results and the field performance of components. Finally, the β-factor estimate would be determined from the CCF adjustment curve based on the adjusted failure rate.

Suppose the differences between field performance and experimental study could be adjusted based on a composite multiplication factor $M$ and the base failure rate $\lambda_{base}$ as determined from operational experience, for which historical failure rate data from IAEA and NRC are available [192, 193]. The factor $M$ needs to be determined based on engineering judgment, design standards, regulatory requirements and operational practices [194, 195]. Note that $M$ could be further decomposed to address the differences associated with specific sources, depending on the level of information available. For instance, $M$ could be further decomposed into three multiplying factors that consider the differences in terms of the three types of coupling factors discussed in Section 6.2. In doing so, the adjusted failure rate would be derived from the base failure rate as shown in Equation (6-18).

$$\lambda_{adj} = M \cdot \lambda_{base} = M_{H} \cdot M_{O} \cdot M_{E} \cdot \lambda_{base} \tag{6-18}$$

where $\lambda_{adj}$ is the adjusted failure rate in units of failures per hour, $\lambda_{base}$ is the base failure rate in units of failures per hour, $M$ is the composite multiplying factor which considers the overall difference between field performance and experimental study, $M_{H}$ is the multiplying factor which considers the differences in terms of hardware-based coupling factors, $M_{O}$ is the multiplying factor which considers the differences in terms of operation-based coupling factors, and $M_{E}$ is the multiplying factor which considers the differences in terms of environment-based coupling factors.
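As a small worked illustration of the failure-rate adjustment, the sketch below multiplies a base failure rate by the three coupling-factor multipliers and reproduces the lower bound of the composite factor used later in this section (the ratio of the 25920-hour requirement to the 3903-hour average test duration from Table 6-1). The function name and the base failure rate of 1e-5 failures/hr are hypothetical placeholders; the source's numerical base rate is not reproduced here.

```python
def adjusted_failure_rate(base_rate, m_hardware=1.0, m_operation=1.0,
                          m_environment=1.0):
    """Return (adjusted rate, composite factor) as a product of multipliers."""
    composite = m_hardware * m_operation * m_environment
    return composite * base_rate, composite

# Average uninterrupted operating time of the three test pumps (Table 6-1).
durations_hours = [1954, 5103, 4654]
mean_test_life = sum(durations_hours) // len(durations_hours)   # 3903 hours

# Three years of uninterrupted operation, counted as 25920 hours in the text.
required_life = 25920
m_lower_bound = required_life / mean_test_life                   # about 6.64

# Hypothetical base failure rate, for illustration only.
base_rate = 1.0e-5  # failures per hour (assumed)
rate, composite = adjusted_failure_rate(base_rate, m_hardware=m_lower_bound)
```

Here the whole composite factor is placed in the hardware multiplier simply because no decomposition is given; with more information it could be split across the three coupling-factor types.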
For illustration purposes, a conceptual example is presented to infer the CCF probability of the service water pump in a nuclear power plant. The relationship between the inferred β-factor and the generic β-factor estimated from the operational experience of the nuclear industry is also discussed. The generic β-factor used for the service water pump is 0.03 and its failure rate is failures/hr [191].

In this example, a CCF adjustment curve is developed based on the results of the sensitivity analysis involving twenty-seven maintenance policies in Section 6.4.5.2. The results are denoted by a number of doublets in the form of (failure rate, β-factor), which are summarized in terms of the three options for the maintenance renewal or rejuvenation factors described by the circle symbol, the diamond symbol and the triangle symbol, respectively, as shown in Figure 6-17. Examination of these results shows that this adjustment curve follows a power function ( ) in Equation (6-19) with . As such, this parametric function would be used to estimate the β-factor given an available adjusted failure rate.

( ) (6-19)

Figure 6-17: CCF adjustment curve according to twenty-seven maintenance policies

The composite multiplying factor is adopted to account for the differences between the testing pump and the service water pump. In particular, a lower bound of the composite multiplying factor is estimated based on the knowledge from the three sources as follows:
• General regulation of nuclear facility pump unit: as claimed in [196], the nuclear facility pump unit must have a design basis operating lifetime of at least three years of uninterrupted operation. This indicates the requirement of a minimum of 25920 hours of uninterrupted operation.
• Design specification of testing pump: the technical manual from the manufacturer [197] indicates that the motor lifetime of the testing pump is 3500 hours and motor brushes need to be replaced.
• Experimental observation of testing pump: the operating lifetime of uninterrupted operation could be estimated as 3903 hours, which is the average of the duration of all three testing pumps as provided in Table 6-1.

The lower bound of the composite multiplying factor is then estimated as approximately 6.64, which is the ratio of 25920 to 3903. With Equation (6-18), the adjusted failure rate is then determined as /hr. Finally, one could obtain the estimate of the β-factor using the power function in Equation (6-19). The lower bound of the β-factor estimate is 0.023, which is close to the generic β-factor of 0.03. Note that 0.023 is a lower bound of the β-factor estimate. The difference between this lower bound and the generic β-factor could be explained from two aspects. (1) If one considers that most service water pumps are designed beyond the minimum requirement of three years of uninterrupted operation, the actual composite multiplying factor should be larger and hence results in a larger adjusted failure rate. Then the actual β-factor estimate should be some value greater than 0.023. (2) As discussed in Section 6.2, the CCF sources with only instantaneous or short-term effect are excluded in the proposed approach. If one aggregates the contribution of such sources, the actual β-factor estimate should also be some value greater than 0.023. Therefore, the estimate of the β-factor based on the experimental examinations is consistent with the generic β-factor used for service water pumps in the nuclear power plant.

6.5. Conclusions

This paper presented a novel approach to advance the state-of-the-art CCF research by taking advantage of the recent advances in sensor-based techniques and computation capabilities. The proposed approach models the CCF for components under age-related degradation by superimposing the maintenance impacts on the component degradation evolutions that can be characterized through condition monitoring data.
An experimental case study involving three redundant centrifugal pump systems was presented to demonstrate the approach, including the pump degradation assessment and the condition-based maintenance policy. The significance of CCF events was discussed using a component-specific study, along with the dynamic characteristics of CCF through a sensor-driven scenario and a simulation-based scenario. Sensitivity studies were provided to evaluate the maintenance effects on CCF over the service lifetime. The results indicate that the parametric estimates of CCF probability may be valid only under ideal conditions of perfect maintenance, and that age-related degradation could significantly affect the β-factor for CCF probability, leading to underestimation of risks as components degrade. This study also showed the important role of recent advances in sensing techniques and data analytic algorithms in enhancing current PRA research via online monitoring with reduced uncertainty.

Acknowledgments

The research presented in this paper was partly funded under the US NRC grant NRCHQ6014G0015. The funding for the experimental effort was provided by the Center for Risk and Reliability at the University of Maryland. The authors also appreciate the help from Mr. Jan Muehlbauer with the experimental work. Any views presented in this paper are those of the authors and do not reflect an official position of the US NRC.

Chapter 7: Conclusions, Contributions and Recommendations

7.1. Conclusions

This dissertation developed three approaches to address important issues regarding both external events and internal events in the MUPRA. The main conclusions are:
1) Multi-unit accidents are important contributors to multi-unit site risks.
2) The dynamic MUPRA model that relies on a dynamic simulator offers limited practicality, and the parametric MUPRA model is recommended in the development of MUPRA methodology.
3) Developing a parametric MUPRA model by expanding the existing SUPRA model and using the traditional CCF parametric method to treat multi-unit dependencies is a useful and practical approach.
4) It is possible to provide a defensible technical basis to characterize the impact of multi-unit dependencies based on the LERs reported to the U.S. NRC.
5) Seismic-induced CCFs between the SSCs across reactor units are significant contributors to the multi-unit site risks.
6) The assumption of perfect dependency among SSCs is highly conservative, and partial dependency values should be considered in MUPRA applications to reduce conservatism.
7) The geometric mean of the two bin limits is not an appropriate choice for the reference PGA level when implementing the discretization-based scheme for seismic risk quantification. The upper bin limit should be used instead.
8) The CCF for a specific common cause component group could vary significantly due to variations in component failure behavior, operational requirements and maintenance practices.
9) The generic CCF parameter estimates used in current practice might not properly reflect the actual effects of common cause dependencies. Therefore, component-specific CCF estimates would be more appropriate.
10) Age-related degradation significantly affects CCF probabilities and estimates of plant and site risks.
11) The recent advances in sensing techniques and data analytic algorithms could play an important role in enhancing current PRA research via online monitoring with reduced uncertainty.

7.2. Contributions

Major contributions of this dissertation are summarized in three categories as follows:
1) Extend the state-of-the-art seismic PRA.
• Develop a seismic dependency modeling technique for seismic MUPRA.
• Identify and demonstrate the existing issues.
• The incorrect equivalence hypothesis between the β-factor and correlation coefficient.
• The weaknesses of the Reed-McCann method, which has recently been recognized as a suitable approach and recommended for modeling dependencies in seismic PRA.
• The issues related to the inappropriate use of the geometric mean as the reference level for the ground motion in the discretization-based scheme of seismic risk quantification.
2) Extend the parametric-based approach to CCF modeling for MUPRA.
• Develop a general MUPRA framework for applications to multi-unit nuclear power plant sites that considers unit-to-unit dependencies.
• Develop an improved approach to external event PRA for multi-unit sites that considers the seismic-induced dependencies across units.
• Propose a more appropriate seismic dependency approach and demonstrate it in a case study for the seismic-induced Small Loss of Coolant Accident (SLOCA) at a hypothetical nuclear plant site consisting of two identical advanced (GEN-III) reactor units.
• Conduct a feasibility analysis of three different multi-unit risk metrics in the MUPRA.
3) Develop a novel approach to CCF modeling for internal events of the MUPRA.
• Propose a novel CCF model for components under age-related degradation by exploiting the recent advances in sensing techniques and data analytic algorithms.
• Introduce a new way to model CCF by integrating components’ degradation evolutions inferred from condition monitoring data.
• Extend the scope of the conventional parametric CCF model to be component-specific and dynamic over lifetime.
• Develop a sensor-driven scenario to achieve specific CCF evaluation by combining the general degradation property and the sensor monitoring data of plant-specific components in a recursive Bayesian approach.
• Develop a simulation-based scenario to assess the effects of imperfect maintenance and renewal on the CCF over lifetime.
• Demonstrate the proposed approach using the diverse condition monitoring data acquired from a special-purpose experiment.
• Design and set up a special-purpose test rig involving redundant centrifugal pump systems with a heating function and advanced condition monitoring capability.
• Develop a degradation database, including diverse sensor measurements: temperature, electric current, electric voltage, flow rate, differential pressure, vibration and acoustic emission.
• Develop the pump failure analysis, degradation assessment and condition-based maintenance policy for the testing pump.
• Conduct simulation as well as sensitivity studies to investigate the maintenance impacts on the CCF over lifetime.
• Demonstrate the applicability of the proposed approach to estimate the β-factor for CCF probability of specific components.

7.3. Recommendations for Future Research

• Conduct multi-hazard risk aggregation and importance analysis to better characterize the critical contributors to multi-unit risks.
• Consider the ground motion dependencies across nuclear reactor units, rather than using the same seismic hazard curve for all nuclear reactor units on the same site.
• Improve the computational efficiency of the seismic risk simulation, for instance: use stratified sampling for seismic hazard simulation; use Latin Hypercube Sampling with copulas for the simulation of correlated ground acceleration capacity; optimize the importance sampling by applying the cross-entropy method.
• Enhance the capability of degradation modeling from different aspects: consider other options of stochastic process modeling (e.g., Wiener process, Gamma process and inverse Gaussian process); develop a fusion approach to make use of all the information from the different condition monitoring techniques.
• Improve the modeling of the sensor-driven scenario, as a matter of practicality, to address the uncertainties of environmental evolutions and imperfect maintenance effects.
For instance, apply variations of the particle filtering algorithm (e.g., auxiliary particle filtering, regularized particle filtering); implement model noise adaptive strategies (e.g., the expectation maximization algorithm); consider estimating maintenance effects within the online filtering algorithm.
• Apply the simulation-based approach to assess the influences of various operational requirements, maintenance practices and repair quality on the CCF evolution during service. Furthermore, integrating organizational and human effects might generate more realistic insight into the dynamic properties of CCF.

Bibliography

[1] M. MODARRES. “Multi-unit nuclear plant risks and implications of the quantitative health objectives,” Proceedings of the International Topical Meeting on Probabilistic Safety Assessment and Analysis (PSA), Sun Valley, Idaho (2015).
[2] P. HOKSTAD and M. RAUSAND. “Common Cause Failure Modeling: Status and Trends,” In Handbook of Performability Engineering. Springer, London (2008).
[3] US Nuclear Regulatory Commission. "Guidelines on modeling common-cause failures in probabilistic risk assessment," NUREG/CR-5485, Washington, DC (1998).
[4] M. MODARRES, M.P. KAMINSKIY and V. KRIVTSOV. “Reliability engineering and risk analysis: a practical guide,” CRC press (2016).
[5] US Nuclear Regulatory Commission. "Common–Cause Failure Parameter Estimations," NUREG/CR-5497, Washington, DC (1998).
[6] US Nuclear Regulatory Commission. “Common-cause failure database and analysis system: event data collection, classification, and coding,” NUREG/CR-6268, Washington, DC (2007).
[7] Nuclear Energy Agency. “International common-cause failure data exchange (ICDE): general coding guidelines,” NEA/CSNI/R(2011)12, 2012.
[8] A. ZITROU, T. BEDFORD and L. WALLS. "An influence diagram extension of the unified partial method for common cause failures," Quality Technology & Quantitative Management, 4(1), 111-128 (2007).
[9] D. KELLY, S. SHEN, G. DEMOSS, K.
COYNE and D. MARKSBERRY. "Common-cause failure treatment in event assessment: basis for a proposed new model," Proceedings of the International Conference on Probabilistic Safety Assessment and Management (PSAM 10), Seattle, WA, 7-11 June (2010).
[10] X. ZHENG, A. YAMAGUCHI and T. TAKATA. "α-Decomposition for estimating parameters in common cause failure modeling based on causal inference," Reliability Engineering & System Safety, 116, 20-27 (2013).
[11] A. O’CONNOR and A. MOSLEH. "A general cause based methodology for analysis of common cause and dependent failures in system risk and reliability assessments," Reliability Engineering & System Safety, 145, 341-350 (2016).
[12] J.C. STILLER, M. LEBERECHT, G. GANBMANTEL, A. WIELENBERG, A. KREUSER and C. VERSTEGEN. “Common cause failures exceeding CCF groups,” Proceedings of the International Topical Meeting on Probabilistic Safety Assessment and Analysis (PSA), Sun Valley, Idaho (2015).
[13] B. BRUCK, A. KREUSER, J. STILLER and M. LEBERECHT. “Extending the scope of ICDE: systematic collection of operating experience with cross component group CCFs,” Proceedings of the International Conference on Probabilistic Safety Assessment and Management (PSAM 13), Seoul, Korea (2016).
[14] J.K. VAURIO. "Uncertainties and quantification of common cause failure rates and probabilities for system analyses," Reliability Engineering & System Safety, 90(2), 186-195 (2005).
[15] M. TROFFAES, G. WALTER and D. KELLY. "A robust Bayesian approach to modeling epistemic uncertainty in common-cause failure models," Reliability Engineering & System Safety, 125, 13-21 (2014).
[16] D. KANCEV and M. CEPIN. "A new method for explicit modelling of single failure event within different common cause failure groups," Reliability Engineering & System Safety, 103, 84-93 (2012).
[17] Z. REJC and M. CEPIN. "An extension of Multiple Greek Letter method for common cause failures modeling," Journal of Loss Prevention in the Process Industries, 29, 144-154 (2014).
[18] A. MOSLEH. “PRA: a perspective on strengths, current limitations, and possible improvements,” Nuclear Engineering and Technology, 46(1), 1-10 (2014).
[19] US Nuclear Regulatory Commission. “Issues and recommendations for advancement of PRA technology in risk-informed decision making,” NUREG/CR-6813, Washington, DC (2003).
[20] S. SCHROER and M. MODARRES. “An Event Classification Schema for Evaluating Site Risk in a Multi-Unit Nuclear Power Plant Probabilistic Risk Assessment,” Reliability Engineering & System Safety, 117, 40-51 (2013).
[21] S. SCHROER. “An event classification schema for considering site risk in a multi-unit nuclear power plant probabilistic risk assessment,” University of Maryland, Master of Science Thesis in Reliability Engineering (2012).
[22] N. SIU. “PSA R&D: Changing the Way We Do Business,” International Topical Meeting on Probabilistic Safety Assessment (PSA 2017), September 24-28, Pittsburgh, PA (2017).
[23] A. SIVORI, K. KIPER, A. MAIOLI and D. TEOLIS. “Further Development of a Framework for Addressing Site Integrated Risk,” International Topical Meeting on Probabilistic Safety Assessment (PSA 2017), September 24-28, Pittsburgh, PA (2017).
[24] W. KELLER and M. MODARRES. “A historical overview of probabilistic risk assessment development and its use in the nuclear power industry: a tribute to the late Professor Norman Carl Rasmussen,” Reliability Engineering & System Safety, 89(3), 271-285 (2005).
[25] M. MODARRES, T. ZHOU and M. MASSOUD. “Advances in multi-unit nuclear power plant probabilistic risk assessment,” Reliability Engineering & System Safety, 157, 87-100 (2017).
[26] D.W. HUDSON and M. MODARRES. “Multiunit Accident Contributions to Quantitative Health Objectives: A Safety Goal Policy Analysis,” Nuclear Technology, 197(3), 227-247 (2017).
[27] Lowe Pickard, Garrick.
Indian Point probabilistic safety study. (Inc.). Prepared for Power Authority of the State of New York, Consolidated Edison Company of New York, Inc., 1982.
[28] Pickard Lowe and Garrick, Inc., Seabrook Station Probabilistic Safety Assessment - Section 13.3 Risk of Two Unit Station, Prepared for Public Service Company of New Hampshire, PLG-0300, 1983.
[29] W. BRINFIELD, A. MCCLYMONT, E. KRANTZ, J. TRAINER and G. KLOPP. “Approach to the evaluation of multi-unit loss of offsite power events,” International Topical Meeting on Probability, Reliability and Safety Assessment (PSA '89), Pittsburgh, PA, April 2-7 (1989).
[30] W.S. JUNG, J. YANG, and J. HA. “A New Method to Evaluate Alternate AC Power Source Effects in Multi-Unit Nuclear Power Plants,” Reliability Engineering & System Safety, 82(2), 165-172 (2003).
[31] K.N. FLEMING. “On the Issue of Integrated Risk–A PRA Practitioner’s Perspective,” Proc. of the ANS International Topical Meeting on Probabilistic Safety Analysis, San Francisco, CA (2005).
[32] M.D. MUHLHEIM and R.T. WOOD. “Design Strategies and Evaluation for Sharing Systems at Multi-Unit Plants Phase I,” ORNL/LTR/INERI-BRAZIL/06-01 (2007).
[33] T. HAKATA. “Seismic PRA Method for Multiple Nuclear Power Plants in a Site,” Reliability Engineering & System Safety, 92, pp. 883-894 (2007).
[34] S. ARNDT. “Methods and Strategies for Future Reactor Safety Goals,” PhD Thesis, Ohio State Univ. (2010).
[35] T. ZHOU, M. MODARRES and E.L. DROGUETT. “An improved multi-unit nuclear plant seismic probabilistic risk assessment approach,” Reliability Engineering & System Safety, 171, 34-47 (2018).
[36] A. LYUBARSKIY, I. KUZMINA and M. EL-SHANAWANY. “Notes on Potential Areas for Enhancement of the PSA Methodology based on Lessons Learned from the Fukushima Accident,” Proc. of the 2nd Probabilistic Safety Analysis/Human Factors Assessment Forum, Warrington, UK, September 8-9 (2011).
[37] N. SIU, D. MARKSBERRY, S. COOPER, K. COYNE and M. STUTZKE.
“PSA technology challenges revealed by the Great East Japan Earthquake,” In PSAM Topical Conference in Light of the Fukushima Dai-Ichi Accident, Tokyo, Japan, April 14-18 (2013).
[38] J.E. YANG. “Fukushima Dai-Ichi accident: lessons learned and future actions from the risk perspectives,” Nuclear Engineering and Technology, 46(1), 27-38 (2014).
[39] IAEA International Workshop on the Safety of Multi-Unit Nuclear Power Plant Sites against External Natural Hazards, Mumbai, India, October 17-19, 2012.
[40] CNSC International Workshop on Multi-unit PSA, Ottawa, Canada, November 17-20, 2014.
[41] IAEA International Seismic Safety Centre Working Area 8. “Technical Approach to Multi-Unit Probabilistic Safety Assessment,” SR 8.5, 2014. Not released yet (undergoing peer review).
[42] IAEA International Seismic Safety Centre. “External Hazard Considerations for Single and Multi-Unit Probabilistic Safety Assessment,” SR 8.4, 2014. Not released yet (undergoing peer review).
[43] Proceedings of The OECD/NEA Workshop on Seismic Risk, NEA/CSNI/R(99)28, Tokyo, Japan, August 10-12, 1999.
[44] Specialist Meeting on the Seismic Probabilistic Safety Assessment of Nuclear Facilities, NEA/CSNI/R(2007)14, Jeju Island, Republic of Korea, November 6-8, 2006.
[45] PSA of Natural External Hazards Including Earthquake, Workshop proceedings, NEA/CSNI/R(2014)9, Prague, Czech Republic, June 17-20, 2013.
[46] International Topical Meeting on Probabilistic Safety Assessment (PSA2013), Columbia, SC, USA, September 22-26, 2013. http://meetingsandconferences.com/psa2013/index.html
[47] The 12th Probabilistic Safety Assessment & Management conference (PSAM12), Honolulu, Hawaii, USA, June 22-27, 2014. http://psam12.org/
[48] International Topical Meeting on Probabilistic Safety Assessment (PSA2015), Sun Valley, ID, USA, April 26-30, 2015. http://www.psa2015.org/
[49] The 13th Probabilistic Safety Assessment & Management conference (PSAM13), Seoul, Korea, October 2-7, 2016.
http://iapsam.org/PSAM13/
[50] International Topical Meeting on Probabilistic Safety Assessment (PSA2017), Pittsburgh, PA, USA, September 24-28, 2017. http://psa.ans.org/
[51] The 14th Probabilistic Safety Assessment & Management conference (PSAM14), Los Angeles, CA, USA, September 16-21, 2018. http://www.psam14.org/
[52] Advanced Safety Assessment Methodologies: Extended PSA (ASAMPSA_E) Project: http://asampsa.eu/
[53] IAEA Division of Nuclear Installation Safety. “Methodology for Multiunit Probabilistic Safety Assessment, NSNI Project on Multiunit PSA, Phase I (Draft),” January (2018).
[54] M.D. MUHLHEIM, G.F. FLANAGAN and W.P. POORE III. “Initiating Events for Multi-Reactor Plant Sites,” No. ORNL/TM--2014/533, Oak Ridge National Laboratory (ORNL), Oak Ridge, TN, US (2014).
[55] M. DENNIS, M. MODARRES and A. MOSLEH. “Framework for Assessing Integrated Site Risk of Small Modular Reactors using Dynamic Probabilistic Risk Assessment Simulation,” ESREL2015, Zürich, Switzerland (2015).
[56] IAEA, “Development and Application of Level 1 Probabilistic Safety Assessment for Nuclear Power Plants,” No. SSG-3, 2010.
[57] K. EBISAWA, M. FUJITA, Y. IWABUCHI, and H. SUGINO. "Current issues on PRA regarding seismic and tsunami events at multi units and sites based on lessons learned from Tohoku earthquake/tsunami," Nuclear Engineering and Technology, 44, no. 5, 437-452 (2012).
[58] D. MANDELLI, C. PARISI, A. ALFONSI, D. MALJOVEC, S.ST. GERMAIN, R. BORING, S. EWING, C. SMITH and C. RABITI. “Dynamic PRA of a Multi-Unit Plant,” International Topical Meeting on Probabilistic Safety Assessment (PSA 2017), September 24-28, Pittsburgh, PA (2017).
[59] J.E. YANG. “Development of an integrated risk assessment framework for internal/external events and all power modes,” Nuclear Engineering and Technology, 44(5), 459-470 (2012).
[60] J. VECCHIARELLI, K. DINNIE and J. LUXAT. “Development of a Whole-Site PSA Methodology,” CANDU Owners Group, COG-13-9034 (2014).
[61] V.
HASSIJA, C.S. KUMAR and K. VELUSAMY. "Probabilistic safety assessment of multi-unit nuclear power plant sites–An integrated approach," J. of Loss Prevention in the Process Industries, 32, 52-62 (2014).
[62] T.D.L. DUY, D. VASSEUR and E. SERDET. “Probabilistic Safety Assessment of twin-unit nuclear sites: Methodological elements,” Reliability Engineering & System Safety, 145, 250-261 (2016).
[63] S. ZHANG, J. TONG and J. ZHAO. “An integrated modeling approach for event sequence development in multi-unit Probabilistic Risk Assessment,” Reliability Engineering & System Safety, 155, 147-159 (2016).
[64] T. ZHOU and M. MODARRES. “Parametric Estimation of Multi-Unit Dependencies,” International Topical Meeting on Probabilistic Safety Assessment (PSA 2017), September 24-28, Pittsburgh, PA (2017).
[65] T.D.L. DUY and D. VASSEUR. “A practical methodology for modeling and estimation of Common Cause Failure Parameters in multi-unit nuclear PSA model,” Reliability Engineering & System Safety, 170, 159-174 (2018).
[66] N. SIU, S. DENNIS, M. TOBIN, P. APPIGNANI, K. COYNE, G. YOUNG and S. RAIMIST. “Advanced Knowledge Engineering Tools to Support Risk-Informed Decision Making: Final Report (Public Version),” (2016).
[67] M.C. KIM. “Feasibility of shared use of alternative AC diesel generator under dual-unit station blackout,” Journal of Nuclear Science and Technology, 54(10), 1029-1035 (2017).
[68] S.ST. GERMAIN, R. BORING, G. BANASEANU, Y. AKL and H. CHATRI. “Multi-Unit Considerations for Human Reliability Analysis,” No. INL/CON-17-41526. Idaho National Lab, Idaho Falls, ID, US (2017).
[69] S. SAMADDAR, K. HIBINO and O. COMAN. "Technical approach for safety assessment of multi-unit NPP sites subject to external events," Proc. of the 12th International Probabilistic Safety Assessment and Management (PSAM) Conference, Honolulu, Hawaii (USA), June 22-27 (2014).
[70] P.C. BASU, M.K. RAVINDRA and Y. MIHARA.
“Component fragility for use in PSA of nuclear power plant,” Nuclear Engineering and Design, 323, 209-227 (2017).
[71] R.J. BUDNITZ, G.S. HARDY, D.L. MOORE and M.K. RAVINDRA. “Correlation of Seismic Performance in Similar SSCs (Structures, Systems, and Components),” Final Report Draft, Lawrence Berkeley National Laboratory, Berkeley, California (2015) (under NRC review).
[72] K. EBISAWA, T. TERAGAKI, S. NOMURA, H. ABE, M. SHIGEMORI and M. SHIMOMOTO. "Concept and methodology for evaluating core damage frequency considering failure correlation at multi units and sites and its application," Nuclear Engineering and Design, 288, 82-97 (2015).
[73] T. ZHOU, M. MODARRES and E.L. DROGUETT. “Issues in Dependency Modeling in Multi-Unit Seismic PRA,” International Topical Meeting on Probabilistic Safety Assessment (PSA 2017), September 24-28, Pittsburgh, PA (2017).
[74] US Nuclear Regulatory Commission. “Options for Proceeding with Future Level 3 Probabilistic Risk Assessment Activities,” SECY-11-0089. Washington, DC (2011). Available at: http://www.nrc.gov/reading-rm/doc-collections/commission/secys/2011/2011-0089scy.pdf.
[75] US Nuclear Regulatory Commission. “Technical Analysis Approach Plan for Level 3 PRA Project (Rev 0b, Working Draft),” Washington, DC (2013). Available at: https://www.nrc.gov/docs/ML1329/ML13296A064.pdf.
[76] S. KAPLAN and B. GARRICK, "On the quantitative definition of risk," Risk Analysis, no. 1, pp. 11-27, 1981.
[77] A. OMOTO, “Design Consideration on Severe Accident for Future LWR (IAEA-TECDOC-1020),” in Proceedings of a Technical Committee Meeting of the International Atomic Energy Agency, Vienna, 1996.
[78] K. DINNIE, “Considerations for Future Development of SAMG at Multi-Unit CANDU Sites,” in Proceedings of PSAM 11, Helsinki, June 2011.
[79] U.S. Nuclear Regulatory Commission, “Resolution of Generic Safety Issues: Task CH2: Design (NUREG-0933),” Washington, D.C., 1981.
[80] U.S.
Nuclear Regulatory Commission, Code of Federal Regulations, 10 CFR 50, Appendix A – “General Design Criteria for Nuclear Power Plants.”
[81] U.S. Nuclear Regulatory Commission, “Staff Requirements Memorandum-SECY-11-0089-Options for Proceeding with Future Level 3 Probabilistic Risk Assessment (PRA) Activities,” Washington, D.C., 2011.
[82] R. A. MESERVE, “The Evolution of Safety Goals and Their Connection to Safety Culture,” Talk Delivered at the Atomic Energy Society of Japan/American Nuclear Society Topical Meeting on Safety Goals and Safety Culture, Milwaukee, Wisconsin, June 18, 2001.
[83] M. MODARRES, M. LEONARD, K. WELTER, and J. POTTORF, “Options for Defining Large Release Frequency for Applications to the Level-2 PRA and Licensing of SMRs,” PSA 2011 - International Topical Meeting on Probabilistic Safety Assessment and Analysis, March 14-17, 2011.
[84] ASME/ANS RA-S-1.2-2014, “Severe Accident Progression and Radiological Release (Level 2) PRA Standard for Nuclear Power Plant Applications for Light Water Reactors (LWRs),” Jan., 2015. (Available for Trial Use)
[85] U.S. NRC, SECY-13-0029, “History of the Use and Consideration of the Large Release Frequency Metric by the U.S. Nuclear Regulatory Commission,” March 22, 2013.
[86] U.S. Nuclear Regulatory Commission, “An Approach for Estimating the Frequencies of Various Containment Failure Modes and Bypass Events,” NUREG/CR-6595, BNL-NUREG-52539, Washington, D.C., 2004.
[87] Nuclear Energy Agency, “Use and Development of Probabilistic Safety Assessment: An Overview of the Situation at the end of 2010,” NEA/CSNI/R(2012)11, January 3, 2013.
[88] Nuclear Energy Agency, “Probabilistic Risk Criteria and Safety Goals,” NEA/CSNI/R(2009)16, December 17, 2009.
[89] U.S. Nuclear Regulatory Commission, “Calculations in support of a potential definition of large release (NUREG/CR-6094),” Washington, D.C., 1994.
[90] U.S.
Nuclear Regulatory Commission, “Regulatory Guide 1.174 - An Approach for Using Probabilistic Risk Assessment in Risk-Informed Decisions on Plant-Specific Changes to the Licensing Basis,” Rev. 2, August 2009.
[91] P. TRUCCO, E. CAGNO, F. RUGGERI, and O. GRANDE, “A Bayesian Belief Network Modeling of Organizational Factors in Risk Analysis: A Case Study in Maritime Transportation,” Reliability Engineering and System Safety, no. 93, pp. 823-834, 2008.
[92] Z. MOHAGHEGH, M. MODARRES and A. CHRISTOU, “Physics-Based Common Cause Failure Modeling in Probabilistic Risk Analysis: A Mechanistic Approach,” in Proceedings of the ASME 2011 Power Conference, Denver, Colorado, 2011.
[93] B. A. GRAN and A. HELMINEN, “A Bayesian Belief Network for Reliability Assessment,” in SAFECOMP ‘01 Proceedings of the 20th International Conference on Computer Safety, Safety, Reliability, and Security, London, 2001.
[94] U.S. Nuclear Regulatory Commission, 10CFR 50.73 Licensee event report system, Search Website: https://lersearch.inl.gov/LERSearchCriteria.aspx.
[95] S. NATHAN and A. MOSLEH, "Treating data uncertainties in common-cause failure analysis," Nuclear Technology, 84.3: 265-281, 1989.
[96] North Anna Power Station, Letter to the U.S. Nuclear Regulatory Commission, Licensee event report 2011-003-00, 2011.
[97] Point Beach Nuclear Plant. Letter to the U.S. Nuclear Regulatory Commission, Licensee event report 2000-010-00, 2000.
[98] IAEA, “The Fukushima Daiichi Accident Report by the Director General,” Vienna, Austria (2015).
[99] T. ZHOU, E.L. DROGUETT, and M. MODARRES, "A Hybrid Probabilistic Physics of Failure Pattern Recognition Based Approach for Assessment of Multi-Unit Causal Dependencies," Proc. of the 2016 24th International Conference on Nuclear Engineering (ICONE24), Charlotte, North Carolina, June 26-30 (2016).
[100] W.S. JUNG, J. YANG, and J. HA. "An approach to estimate SBO risks in multi-unit nuclear power plants with a shared alternate AC power source," Proc.
of the 7th International Probabilistic Safety Assessment and Management (PSAM) Conference, Berlin, Germany, June 14-18 (2004).
[101] K. OH, S. JUNG, G. HEO, and S. JANG. "Technical issues of PSA for Korean multi-unit nuclear power plants," Safety and Reliability of Complex Engineered Systems: ESREL (2015).
[102] C.S. KUMAR, V. HASSIJA, K. VELUSAMY, and V. BALASUBRAMANIYAN, "Integrated risk assessment for multi-unit NPP sites—A comparison," Nuclear Engineering and Design, 293, 53-62 (2015).
[103] K. MURAMATSU, Q. LIU, and T. UCHIYAMA, “Effect of Correlations of Component Failures and Cross-connections of EDGs on seismically induced Core Damages of a Multi-Unit Site,” J. of Power & Energy Systems, 2(1), 122-132 (2008).
[104] D. STRAUB and A.D. KIUREGHIAN, "Improved seismic fragility modeling from empirical data," Structural Safety, 30, no. 4, 320-336 (2008).
[105] US Electric Power Research Institute, “Seismic Probabilistic Risk Assessment Implementation Guide,” 002000709 (2013).
[106] A. YAMAGUCHI, "Seismic fragility analysis of the heat transport system of LMFBR considering partial correlation of multiple failure modes," Transactions of the 11th International Conference on Structural Mechanics in Reactor Technology, Tokyo, Japan, Aug. 18-23 (1991).
[107] US Nuclear Regulatory Commission, “Seismic Safety Margins Research Program Phase I Final Report (NUREG/CR-2015),” Washington, D.C. (1981).
[108] US Nuclear Regulatory Commission, “Procedures for the External Event Core Damage Frequency Analysis for NUREG-1150 (NUREG/CR-4840),” Washington, D.C. (1990).
[109] US Electric Power Research Institute, “Seismic Fragility Application Guide,” 1002988 (2002).
[110] J.W. REED, M.W. MCCANN JR., J. IIHARA, and H.H. TAMJED, “Analytical Techniques for Performing Probabilistic Seismic Risk Assessment of Nuclear Power Plants,” Proc. of the 4th International Conference on Structural Safety and Reliability, Kobe, Japan, May 27-29 (1985).
[111] J. U.
KLUGEL, "On the treatment of dependency of seismically induced component failures in seismic PRA," Transactions of the 20th International Conference on Structural Mechanics in Reactor Technology (SMiRT 20), Helsinki, Finland, August 9-14 (2009).
[112] M. PELLISSETTI and U. KLAPP. "Integration of correlation models for seismic failures into fault tree based seismic PSA," Transactions of the 21st International Conference on Structural Mechanics in Reactor Technology (SMiRT 21), New Delhi, India, Nov 6-11 (2011).
[113] R Core Team, “R: A language and environment for statistical computing. R Foundation for Statistical Computing,” Vienna, Austria. URL https://www.R-project.org/ (2016).
[114] S. KAPLAN, V. M. BIER, and D. C. BLEY. "A note on families of fragility curves—is the composite curve equivalent to the mean curve?" Reliability Engineering & System Safety, 43, no. 3, 257-261 (1994).
[115] C code by Steven G. Johnson and R by Balasubramanian Narasimhan, “cubature: Adaptive multivariate integration over hypercubes,” R package version 1.1-2. https://CRAN.R-project.org/package=cubature (2013).
[116] R.B. NELSEN. “An introduction to copulas.” Springer Science & Business Media (2006).
[117] D.P. KROESE, T. TAIMRE and Z.I. BOTEV. “Handbook of Monte Carlo Methods.” Vol. 706. John Wiley & Sons (2013).
[118] A.C. REILLY, A. STAID, M. GAO, and S.D. GUIKEMA. "Tutorial: Parallel Computing of Simulation Models for Risk Analysis," Risk Analysis, 36, no. 10, 1844-1854 (2016).
[119] US Nuclear Regulatory Commission. “Systems Analysis Programs for Hands-on Integrated Reliability Evaluations (SAPHIRE) Version 8,” NUREG/CR-7039 (2011).
[120] R.P. KENNEDY. “Overview of Methods for Seismic PRA and Margin Analysis including Recent Innovations,” Proc. of the OECD/NEA Workshop on Seismic Risk, Tokyo, Japan (1999).
[121] N. LUCO and C.A. CORNELL.
“Seismic drift demands for two SMRF structures with brittle connections,” Structural Engineering World Wide 1998, Elsevier Science Ltd., Oxford, England, Paper T158-3 (1998).
[122] F. JALAYER. “Direct probabilistic seismic analysis: implementing non-linear dynamic assessments,” PhD dissertation, Stanford University (2003).
[123] J.W. BAKER. “Efficient analytical fragility function fitting using dynamic structural analysis,” Earthquake Spectra, 31, no. 1, 579-599 (2015).
[124] Y. WATANABE, T. OIKAWA and K. MURAMATSU. "Development of the DQFM method to consider the effect of correlation of component failures in seismic PSA of nuclear power plant," Reliability Engineering & System Safety, 79, no. 3, 265-279 (2003).
[125] T. UCHIYAMA. “Effect of Analytical Methodology for Assessment on Seismically Induced Core Damage,” Journal of Power and Energy Systems, 5(3), 279-294 (2011).
[126] K. KAWAGUCHI. “Efficiency of Analytical Methodologies in Uncertainty Analysis of Seismic Core Damage Frequency,” Journal of Power and Energy Systems, 6(3), 378-393 (2012).
[127] T. UCHIYAMA, K. KAWAGUCHI and T. WAKABAYASHI. “Effect of Simultaneous Consideration for Seismically Induced Events on Core Damage Frequency,” Journal of Power and Energy Systems, 5(3), 360-375 (2011).
[128] M.K. RAVINDRA and L.W. TIONG. "Comparison of methods for seismic risk quantification," In Transactions of the 10th International Conference on Structural Mechanics in Reactor Technology (1989).
[129] H. SHIBATA, D. LAPPA, R.J. BUDNITZ, and A.B. IBRAHIM. “Probabilistic safety assessment for seismic events (IAEA-TECDOC-724),” Vienna, Austria: International Atomic Energy Agency (1993).
[130] S. KHERICHA, R. BUELL, S. SANCAKTAR, M. GONZALEZ and F. FERRANTE. “Development of Simplified Probabilistic Risk Assessment Model for Seismic Initiating Event,” Eleventh International Probabilistic Safety Assessment and Management and European Safety and Reliability, Helsinki, Finland (2012).
[131] US Nuclear Regulatory Commission. “Risk Assessment Standardization Project (RASP): Handbook for risk assessment of operational events, Volume 2 – External Events.” (2008).
[132] N. JAYARAM and J.W. BAKER. "Efficient sampling techniques for seismic risk assessment of lifelines." In 10th International Conference on Structural Safety and Reliability (ICOSSAR09), (2009).
[133] Y. HUANG, A.S. WHITTAKER and N. LUCO. "A probabilistic seismic risk assessment procedure for nuclear power plants: (I) Methodology." Nuclear Engineering and Design, 241, no. 9, 3996-4003 (2011).
[134] Y. HUANG, A.S. WHITTAKER and N. LUCO. "A probabilistic seismic risk assessment procedure for nuclear power plants: (II) Application." Nuclear Engineering and Design, 241, no. 9, 3985-3995 (2011).
[135] B. STEPHEN, S.J. GALLOWAY, D. MCMILLAN, D.C. HILL and D.G. INFIELD. "A copula model of wind turbine performance." IEEE Transactions on Power Systems, 26, no. 2, 965-966 (2011).
[136] Z. XI, J. RONG, P. WANG and C. HU. "A copula-based sampling method for data-driven prognostics." Reliability Engineering & System Safety, 132, 72-82 (2014).
[137] K. GODA and J. REN. "Assessment of seismic loss dependence using copula." Risk Analysis, 30, no. 7, 1076-1091 (2010).
[138] L. CHEN, V.P. SINGH, S. GUO, Z. HAO and T. LI. "Flood coincidence risk analysis using multivariate copula functions." Journal of Hydrologic Engineering, 17, no. 6, 742-755 (2011).
[139] X.S. TANG, D.Q. LI, C.B. ZHOU and K.K. PHOON. "Copula-based approaches for evaluating slope reliability under incomplete probability information." Structural Safety, 52, 90-99 (2015).
[140] T. WANG, J.S. DYER and J.C. BUTLER. "Modeling Correlated Discrete Uncertainties in Event Trees with Copulas." Risk Analysis, (2015).
[141] W. HU, K.K. CHOI and H. CHO. "Reliability-based design optimization of wind turbine blades for fatigue life under dynamic wind load uncertainty." Structural and Multidisciplinary Optimization, 54, no. 4, 953-970 (2016).
[142] R.T. CLEMEN and T. REILLY. "Correlations and copulas for decision and risk analysis." Management Science, 45, no. 2, 208-224 (1999).
[143] E.L. DROGUETT and A. MOSLEH. "Integrated treatment of model and parameter uncertainties through a Bayesian approach." Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability, 227, no. 1, 41-54 (2013).
[144] W. YI and V.M. BIER. "An application of copulas to accident precursor analysis." Management Science, 44, no. 12-part-2, S257-S270 (1998).
[145] D.L. KELLY. "Using copulas to model dependence in simulation risk assessment." ASME 2007 International Mechanical Engineering Congress and Exposition, Seattle, Washington, USA, November 11–15 (2007).
[146] T.W. ANDERSON. “An Introduction to Multivariate Statistical Analysis.” Wiley (2004).
[147] B. SUDRET, C.V. MAI and K. KONAKLI. "Computing seismic fragility curves using non-parametric representations." Earthquake Eng. Structural Dynamics 00, 1-17 (2014).
[148] M. HOFERT, I. KOJADINOVIC, M. MAECHLER and J. YAN. “copula: Multivariate Dependence with Copulas. R package version 0.999-14,” (2015). URL http://CRAN.R-project.org/package=copula
[149] J. PROCHASKA, P. HALADA, M.M. PELLISSETTI and M. KUMAR. “Report 1: Guidance document on practices to model and implement SEISMIC hazards in extended PSA Volume 2 (implementation in Level 1 PSA),” ASAMPSA REPORTS, IRSN/PSN-RES-SAG 2017-0004 (2017).
[150] S.C. BHATT and R.C. WACHOWIAK. “ESBWR Certification Probabilistic Risk Assessment,” GE-Hitachi Nuclear Energy, NEDO-33201, Revision 6 (2010).
[151] J.W. BAKER. “Introducing Correlation among Fragility Functions for Multiple Components,” Proc. World Conference on Earthquake Engineering, Beijing, China (2008).
[152] US Nuclear Regulatory Commission. “Revised Livermore Seismic Hazard Estimates for 69 Nuclear Power Plant Sites East of the Rocky Mountains,” NUREG-1488 (1994).
[153] Letter from NextEra Energy to U.S. NRC. “NextEra Energy Seabrook, LLC Seismic Hazard and Screening Report (CEUS Sites) Response to NRC Request for Information Pursuant to 10 CFR 50.54(f) Regarding Recommendation 2.1 of the Near-Term Task Force Review of Insights From the Fukushima Dai-ichi Accident”, March 27, 2014, 10 CFR 50.54(f), Docket No. 50-443, SBK-L-14052.
[154] US Nuclear Regulatory Commission. “An Approach to the Quantification of Seismic Margins in Nuclear Power Plants,” NUREG/CR-4334 (1985).
[155] US Nuclear Regulatory Commission. “Handbook of Nuclear Power Plant Seismic Fragilities,” NUREG/CR-3558 (1985).
[156] Y.J. PARK, C.H. HOFMAYER and N.C. CHOKSHI. “Survey of seismic fragilities used in PRA studies of nuclear power plants,” Reliability Engineering & System Safety, 62(3), 185-195 (1998).
[157] U.S. NRC. “Evaluations of core melt frequency effects due to component aging and maintenance”, NUREG/CR-5510, Washington DC (1990).
[158] IAEA. “Safety Aspects of Nuclear Power Plant Ageing”, TECDOC-540, Vienna, Austria (1990).
[159] IAEA. “Ageing Management for Nuclear Power Plants”, Safety Guide, No. NS-G-2.12, Vienna, Austria (2009).
[160] Canadian Nuclear Safety Commission. “Incorporating Ageing Effects into PSA Applications”, ENCO FR-(14)-10 (2014).
[161] J. HOLY, M. NITOI, I. DINU and L. BURGAZZI, “Analysis of common cause failures coupling factors and mechanisms from aging point of view”, APSA Network Task 5 POS Task 4 (2010).
[162] US Nuclear Regulatory Commission. “Nuclear Plant Aging Research (NPAR) Program Plan,” Rev.2, NUREG/CR-1144, Washington, DC (1991).
[163] W.Q. MEEKER and Y. HONG. “Reliability meets big data: opportunities and challenges,” Quality Engineering, 26(1), 102-116 (2014).
[164] E. ZIO. "Some challenges and opportunities in reliability engineering," IEEE Transactions on Reliability, 65(4), 1769-1782 (2016).
[165] J.B. COBLE, P. RAMUHALLI, L.J. BOND, J.W. HINES and B.R. UPADHYAYA. “Prognostics and health management in nuclear power plants: a review of technologies and applications,” No. PNNL-21515. Pacific Northwest National Laboratory (PNNL), Richland, WA (2012).
[166] American Bureau of Shipping (ABS). “Guidance Notes on Equipment Condition Monitoring Techniques,” Houston, TX (2016).
[167] IAEA. “Implementation Strategies and Tools for Condition Based Maintenance at Nuclear Power Plants,” TECDOC-1551, Vienna, Austria (2007).
[168] W. WANG and A.H. CHRISTER. “Towards a general condition based maintenance model for a stochastic dynamic system,” Journal of the Operational Research Society, 51(2), 145-155 (2000).
[169] P. WANG, B.D. YOUN and C. HU. "A generic probabilistic framework for structural health prognostics and uncertainty management," Mechanical Systems and Signal Processing, 28, 622-637 (2012).
[170] A. JARDINE, D. LIN and D. BANJEVIC. "A review on machinery diagnostics and prognostics implementing condition-based maintenance," Mechanical Systems and Signal Processing, 20(7), 1483-1510 (2006).
[171] D. SIEGEL and J. LEE. "Reconfigurable informatics platform for rapid prognostic design and implementation: tools and case studies," Proceedings of the Machinery Failure Prevention Technology (MFPT) Conference, Cleveland, Ohio, May 13-17 (2013).
[172] D. VERSTRAETE, A. FERRADA, E.L. DROGUETT, V. MERUANE and M. MODARRES. “Deep Learning Enabled Fault Diagnosis Using Time-Frequency Image Analysis of Rolling Element Bearings,” Shock and Vibration, Article ID 5067651 (2017).
[173] A. SOYLEMEZOGLU, S. JAGANNATHAN and C. SAYGIN. "Mahalanobis-Taguchi system as a multi-sensor based decision making prognostics tool for centrifugal pump failures," IEEE Transactions on Reliability, 60(4), 864-878 (2011).
[174] A.S.S. VASAN, B. LONG and M. PECHT. "Diagnostics and prognostics method for analog electronic circuits," IEEE Transactions on Industrial Electronics, 60(11), 5277-5291 (2013).
[175] H. OH, M.H. AZARIAN and M.G. PECHT. "Estimation of fan bearing degradation using acoustic emission analysis and Mahalanobis distance," Proceedings of the Applied Systems Health Management Conference, Virginia Beach, VA, May 10-12 (2011).
[176] X. SI, W. WANG, C. HU and D. ZHOU. "Remaining useful life estimation–a review on the statistical data driven approaches," European Journal of Operational Research, 213(1), 1-14 (2011).
[177] J. SUN, H. ZUO, W. WANG and M.G. PECHT. “Prognostics uncertainty reduction by fusing on-line monitoring data based on a state-space-based degradation model,” Mechanical Systems and Signal Processing, 45(2), 396-407 (2014).
[178] X. SI, W. WANG, C. HU and D. ZHOU. "Estimating remaining useful life with three-source variability in degradation modeling," IEEE Transactions on Reliability, 63(1), 167-190 (2014).
[179] E.L. DROGUETT and A. MOSLEH. “Bayesian methodology for model uncertainty using model performance data,” Risk Analysis, 28(5), 1457-1476 (2008).
[180] E. RABIEI, E.L. DROGUETT and M. MODARRES. “A prognostics approach based on the evolution of damage precursors using dynamic Bayesian networks,” Advances in Mechanical Engineering, 8(9) (2016).
[181] M.S. ARULAMPALAM, S. MASKELL, N. GORDON and T. CLAPP. "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Transactions on Signal Processing, 50(2), 174-188 (2002).
[182] IAEA. “Maintenance, Surveillance and In-service Inspection in Nuclear Power Plants Safety Guide,” Safety Standards Series No. NS-G-2.6, Vienna, Austria (2002).
[183] B. LIU, S. WU, M. XIE and W. KUO. “A condition-based maintenance policy for degrading systems with age-and state-dependent operating cost,” European Journal of Operational Research, 263(3), 879-887 (2017).
[184] US NRC. “Common-Cause Failure Event Insights, Volume 3, Pumps,” NUREG/CR-6819 (2003).
[185] R.S. BEEBE. “Predictive maintenance of pumps using condition monitoring”, Elsevier (2004).
[186] D. WANG and P.W. TSE. “Prognostics of slurry pumps based on a moving-average wear degradation index and a general sequential Monte Carlo method,” Mechanical Systems and Signal Processing, 56, 213-229 (2015).
[187] M. LEAHY, D. MBA, P. COOPER, A. MONTGOMERY and D. OWEN. "Experimental investigation into the capabilities of acoustic emission for the detection of shaft-to-seal rubbing in large power generation turbines: a case study," Proceedings of the Institution of Mechanical Engineers, Part J: Journal of Engineering Tribology, 220(7), 607-615 (2006).
[188] D. MBA and R.B. RAO. “Development of Acoustic Emission Technology for Condition Monitoring and Diagnosis of Rotating Machines; Bearings, Pumps, Gearboxes, Engines and Rotating Structures,” The Shock and Vibration Digest, 38(1), 3-16 (2006).
[189] K. LIU, A. CHEHADE and C. SONG. “Optimize the signal quality of the composite health index via data fusion for degradation modeling and prognostic analysis,” IEEE Transactions on Automation Science and Engineering, 14(3), 1504-1514 (2017).
[190] A. CHEHADE, S. BONK and K. LIU. “Sensory-Based Failure Threshold Estimation for Remaining Useful Life Prediction,” IEEE Transactions on Reliability, 66(3), 939-949 (2017).
[191] S.M. WONG. “Common Cause Failure (CCF) Analysis and Generic CCF Data ~ US Experience,” IAEA Technical Review Meeting, November 06-08 (2013).
[192] IAEA. “Component Reliability Data for Use in Probabilistic Safety Assessment,” TECDOC-478, Vienna, Austria (1988).
[193] US Nuclear Regulatory Commission. "Probabilistic Safety Analysis Procedures Guide," NUREG/CR-2815, Washington, DC (1984).
[194] Naval Surface Warfare Center (NSWC) Carderock Division. “Handbook of Reliability Prediction Procedures for Mechanical Equipment,” Maryland (2011).
[195] R. PAN. “A Bayes approach to reliability prediction utilizing data from accelerated life tests and field failure observations,” Quality and Reliability Engineering International, 25(2), 229-240 (2009).
[196] STUK-Radiation and Nuclear Safety Authority. “Nuclear facility pump units,” Guide YVL5.7, Fourth, revised edition, ISSN 0783-2400, Helsinki (2009). https://www.stuklex.fi/en/ohje/YVL5-7
[197] JABSCO Datasheet. http://www.xylemflowcontrol.com/files/50840_43000_0820.pdf