ABSTRACT

Title of Document: TOWARDS A FORMAL AND SCALABLE APPROACH FOR QUANTIFYING SOFTWARE RELIABILITY AT EARLY DEVELOPMENT STAGES

Wende Kong, Doctor of Philosophy, 2009

Directed By: Dr. Carol S. Smidts, Visiting Associate Professor, Department of Mechanical Engineering

Problems which originate in early development stages can have a lasting influence on the reliability, safety, and cost of a software system. The requirements document, which is usually available at the requirements analysis stage, must be correct, unambiguous, and complete if the rest of the development effort is to succeed. The ability to identify faults in requirements and predict the reliability of a software system early in its development can help organizations make informed decisions about corrective actions and improve the system's quality in a cost-effective manner. A review of the literature reveals that existing approaches are unsuited to providing trustworthy reliability predictions, either because they ignore the requirements documents or because they detect faults in requirements in an informal and fairly sketchy way. This study explores the use of a preselected software reliability measurement for early software fault detection and reliability prediction. This measurement, originally a black-box testing technique, is broadly recognized for its ability to detect incomplete and ambiguous requirements, although no information was found in the literature about how to take advantage of this power. This study mathematically formalized the measurement to enhance its rigor, repeatability, and scalability, and further extended it into an effective requirements fault detection technique. An automation-oriented algorithm was developed for quantifying the impact of the detected requirements faults on software reliability. The feasibility and scalability of the proposed approach for early fault detection and reliability prediction were examined using two real applications. The results clearly confirmed its feasibility and usefulness, particularly when no failure data are available and other methods are not applicable. Scalability barriers were also identified in the approach. An empirical study was thus conducted to gain insight into the nature of these technical barriers. In an attempt to overcome them, a set of rules was proposed based on the observed patterns. Finally, a preliminary controlled experiment was conducted to evaluate the usability of the proposed rules. This study will enable software project stakeholders to effectively detect requirements faults and assess the quality of requirements early in development, and ultimately lead to improved software reliability if the identified faults are removed in time. Software project practitioners, regulators, and policy makers involved in the certification of software systems can benefit most from the techniques proposed in this study.

TOWARDS A FORMAL AND SCALABLE APPROACH FOR QUANTIFYING SOFTWARE RELIABILITY AT EARLY DEVELOPMENT STAGES

By Wende Kong

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2009

Advisory Committee: Visiting Associate Professor Carol Smidts, Chair; Professor Ali Mosleh; Assistant Professor Byeng Dong Youn; Professor Mohammad Modarres; Professor Paul Pietroski

© Copyright by Wende Kong, 2009

Acknowledgements

I would first and foremost like to express my deep gratitude to my adviser, Dr. Carol S. Smidts.
She patiently guided me through the dissertation process, never accepting less than my best efforts. Her wisdom, knowledge and commitment to the highest standards inspired and motivated me. I am also indebted to all my dissertation defense committee members, Dr. Ali Mosleh, Dr. Byeng Dong Youn, Dr. Mohammad Modarres, and Dr. Paul Pietroski, for their support and helpful comments. Their insightful guidance was indispensable to the accomplishment of this dissertation and has greatly improved the quality of this study. I would like to express my deep thanks to Ms. Ying Shi, Mr. Kenan Cehic, and all students who participated in the experiments. I gratefully acknowledge the US Nuclear Regulatory Commission for partly financing this study. Special thanks go to my lab-mates and friends, Mr. Jun Dai, Dr. Yuan Wei and Dr. Ming Li, for sharing innumerable days and nights in the lab and for their encouragement and support when I was in need. Finally, I heartily appreciate my family, who have given meaning to all the effort I have put in. This dissertation could not have been written without the unwavering love and support of my grandmother, Yazhen Ma; my parents, my father, Qinglu Kong, and my mother, Qiuer Chen; and my uncle, Qingrong Kong.

Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
1.1 Research Statement
1.2 Research Objective
1.3 Approach
1.4 Content
1.5 Summary of Contributions
Chapter 2: Background and Related Work
2.1 Definitions
2.2 Current Situation
2.3 Reliability Measurement vs. Development Phases
2.4 Brief Taxonomy of Software Reliability Measurement Models
2.5 Why Early Reliability Measurement is Necessary
2.5.1 Majority of Software Projects Failed to Achieve Schedule and Budget Goals
2.5.2 Requirements are the Root of Many Problems
2.5.3 Faults Cost Less when Detected and Fixed in Early Stages of Development
2.6 Virtues of Early Software Reliability Measurement
2.7 Previous Work on Early Reliability Measurement
2.8 Selecting Software Measurement for Early Reliability Assessment
Chapter 3: Formalization of Cause-Effect Graphing Analysis as a Software Reliability Measurement
3.1 What is Cause-Effect Graphing Analysis (CEGA)
3.2 Construction of CEG
3.3 CEGA as a Software Reliability Measurement
3.4 Advantages and Disadvantages of CEGA
3.5 Formal Definition of CEG
3.6 Example of CEG Construction
3.6.1 Identified Causes, Effects, Logical Relationships, and Constraints for the Sample SRS
3.6.2 Graphical Expression of CEG for the Sample SRS
3.6.3 Mathematical Expression of CEG for the Sample SRS
3.7 Summary
Chapter 4: Identification of Faults in Software Requirements Specifications Using CEGA
4.1 Definition, Contents, and Organization of SRSs
4.2 Characteristics of a "Good" SRS
4.3 Faults in SRS
4.4 V&V Techniques for SRS Faults Detection
4.5 CEGA-based Techniques for SRS Faults Detection
4.5.1 CEGA-based SRS Faults Taxonomy
4.5.2 Detecting SRS Faults by CEG Construction and Optional Ambiguities Review
4.5.3 Detecting More Implicit SRS Faults by CEG Validation
4.6 Summary
4.6.1 Advantages of our Methods
4.6.2 Limitations of our Methods
Chapter 5: Quantification of the Impact of Faults on Software Reliability
5.1 Basic Notations and Definitions
5.2 Fundamental Lemma and Overall Algorithm for Quantifying Software Reliability
5.3 Determination of Failure-relevant Inputs
5.3.1 Introduction of B-CEG
5.3.2 Rules for B-CEG Construction and A-CEG Revision
5.3.3 Introduction of Virtual Effect for Mating Missing or Extra Effects
5.3.4 Determination of an Effect's Output
5.3.5 Algorithm for Determining the Category of an Input
5.3.6 Examples of Identifying Failure-relevant Inputs
5.4 Calculation of the Occurrence Probability of Failure-relevant Inputs
5.4.1 Representation of a Boolean Expression Using BDD Techniques
5.4.2 Recursive Algorithm for Calculating the Occurrence Probability of a BDD's Top Node
5.4.3 Operational Profile (OP)
5.5 Summary
Chapter 6: Examination of the Applicability of the Proposed CEGA Techniques for Early-stage Software Reliability Prediction by Case Studies
6.1 Applications Used in the Case Studies
6.2 Procedure
6.3 Results and Findings
6.4 Summary
Chapter 7: Exploration of the Scalability of A-CEG Construction
7.1 Objectives
7.2 Methodology and Procedure
7.2.1 Step 1: Experiment Preparation
7.2.2 Step 2: Implementation of the Independent Study Project
7.2.3 Step 3: Postmortem Analysis and Improvement
7.3 Results and Discussion
7.3.1 Database of Rules and Indicators for A-CEG Elements Identification
7.3.2 A-CEG Construction Rules
7.3.3 Potential Influencing Factors of A-CEG Construction
7.3.4 Suggestions for Writing an SRS
7.4 Summary
Chapter 8: Validation of the Usability of the A-CEG Construction Rules
8.1 Definitions
8.2 Research Questions and Hypotheses
8.3 Variables
8.3.1 Independent Variables
8.3.2 Controlled Variables
8.3.3 Dependent Variables
8.4 Subjects
8.5 Experiment Materials
8.6 Procedure
8.6.1 Training and Preparation (Phase I)
8.6.2 Running the Experiment (Phase II)
8.7 Experiment Results and Discussion
8.7.1 Statistical Analysis
8.7.2 Summary of Statistical Testing
8.7.3 Qualitative Analysis
8.8 Threats to Validity
8.9 Summary
Chapter 9: Conclusion and Suggestions for Future Research
9.1 Principal Results of this Study and its Significance
9.2 Advantages
9.3 Limitations
9.4 Suggestions for Future Research
Appendix A: List of Words that Point to Potential Ambiguities (adapted from [70])
Appendix B: Sample Source Code for Calculating the Occurrence Probability of a BDD's Top Node
Appendix C: Results of Case Study A
Appendix D: Reporting Tables Used in Experiment D
Appendix E: Questionnaire Used in Experiment D (to assess subjects' background)
Appendix F: Postmortem Questionnaire Used in Experiment D (to assess usability of the A-CEG Construction Rules set)
Glossary
Bibliography

List of Tables

Table 2-1: Phase-based Applicability and Ranking Classification of 40 Software Reliability Measurements
Table 3-1: Mathematical Symbols of CEG Constraints
Table 3-2: Identified Causes for the Sample SRS
Table 3-3: Identified Effects for the Sample SRS
Table 3-4: Identified Constraints for the Sample SRS
Table 4-1: Ten Language Quality Characteristics of an SRS (Adapted from [67])
Table 4-2: Taxonomy of SRS Faults (Excerpted from [68])
Table 4-3: Categories of SRS Faults in Terms of CEG
Table 5-1: Faults vs. Actions that should be taken for A-CEG or B-CEG
Table 5-2: Effects' Outputs for Case 1
Table 5-3: Effects' Outputs for Case 2
Table 5-4: Effects' Outputs for Case 3
Table 5-5: Effects' Outputs for Case 4
Table 5-6: Effects' Outputs for Case 5
Table 5-7: Effects' Outputs for Case 6
Table 5-8: Effects' Outputs for Case 7
Table 6-1: Steps vs. Required Techniques/Tools
Table 6-2: Scalability of the Proposed Techniques
Table 6-3: A-CEGs and CE(%) for PACS and SXXX
Table 6-4: Number of Detected Faults vs. Efforts in Using SRS-related Measurements
Table 8-1: Results of the Preliminary Study for Determining the Threshold Relative Inter-textual Distance
Table 8-1: Basic Information on SRS Segments Used in Experiment D
Table 8-2: Data on SRS Segments Used in Experiment Phase II
Table 8-3: Assignments of SRS Segments
Table 8-4: Entire Design of Experiment D
Table 8-5: Experiment Data Used for Hypotheses Testing
Table 8-6: Descriptive Statistics for the Impact of A-CEG Construction Method on Effectiveness
Table 8-7: Statistical Testing Results for Hypothesis H1 (α = 0.05)
Table 8-8: Descriptive Statistics for the Impact of A-CEG Construction Method on Efficiency
Table 8-9: Statistical Testing Results for Hypothesis H2 (α = 0.05)
Table 8-10: Descriptive Statistics for the Impact of SRS' Writing Styles on Effectiveness
Table 8-11: Statistical Testing Results for Hypothesis H3 (α = 0.05)
Table 8-12: Descriptive Statistics for the Impact of SRS' Writing Style on Efficiency
Table 8-13: Statistical Testing Results for Hypothesis H4 (α = 0.05)
Table 8-14: Descriptive Statistics for the Impact of SRS' Application Type on Effectiveness
Table 8-15: Statistical Testing Results for Hypothesis H5 (α = 0.05)
Table 8-16: Descriptive Statistics for the Impact of SRS' Application Type on Efficiency
Table 8-17: Statistical Testing Results for Hypothesis H6 (α = 0.05)
Table 8-18: Summary of Statistical Tests
Table 8-19: Experiment Data for Qualitative Analysis
Table Appendix C-1: Definitions of Effects in PACS' A-CEG and B-CEG
Table Appendix C-2: PACS' OP

List of Figures

Figure 2-1: Relationship among the Error, Fault, and Failure
Figure 2-2: Software Reliability Measurement vs. Development Process
Figure 2-3: Brief Taxonomy of Software Reliability Measurement Models
Figure 2-4: Outcomes of Department of Defense Software Spending
Figure 2-5: Distribution of Faults in Software Projects
Figure 2-6: Distribution of Failure Causes of 8000+ Projects
Figure 2-7: Distribution of Effort to Fix Faults
Figure 2-8: Industry Standard Cost Ratio to Fix a Defect
Figure 2-9: Cost Ratio vs. Development Phases in Which Faults are Found
Figure 2-10: Summation Effect of Faults
Figure 2-11: Development Schedule with/without Early Fault Detection
Figure 3-1: Symbols of Basic CEG Logical Relationships
Figure 3-2: Symbols of CEG Constraints
Figure 3-3: Example of Identifying Causes and Effects in an SRS
Figure 3-4: Graphical Expression of CEG for the Sample SRS
Figure 3-5: Mathematical Expression of CEG for the Sample SRS
Figure 4-1: Prototype Outline of SRS (extracted from IEEE Std. 830-1998 [53])
Figure 4-2: Desired vs. Actually Documented Requirements Specifications
Figure 4-3: Requirements Fault Categorization Percentage Data
Figure 4-4: CEG Validation Algorithm
Figure 5-1: CEGA-based Software Reliability Prediction Algorithm
Figure 5-2: Software Testing Using Test Oracle
Figure 5-3: Identifying Failure-relevant Inputs Using B-CEG
Figure 5-4: Example of Adding Virtual Effects into A-CEG and B-CEG
Figure 5-5: Unified Process for Determining the Output of an Effect
Figure 5-6: Algorithm for Determining the Category of a Given Input X_k
Figure 5-7: Mathematical Expression of the Sample A-CEG
Figure 5-8: Revised A-CEG and B-CEG for Case 1
Figure 5-9: Revised A-CEG and B-CEG for Case 2
Figure 5-10: Revised A-CEG and B-CEG for Case 3
Figure 5-11: Revised A-CEG and B-CEG for Case 4
Figure 5-12: Revised A-CEG and B-CEG for Case 5
Figure 5-13: Revised A-CEG and B-CEG for Case 6
Figure 5-14: Revised A-CEG and B-CEG for Case 7
Figure 5-15: Generic Fault Tree Model for A-CEG
Figure 5-16: Example of BDD for the Boolean Function f(x, y, z)
Figure 5-17: Recursive Algorithm for Calculating the Occurrence Probability of a BDD's Top Node
Figure 6-1: Distribution of Detected Faults in Case Study A (for PACS)
Figure 6-2: Distribution of Detected Faults in Case Study B (for SXXX)
Figure 6-3: Distribution of Logical Relationships in PACS' A-CEG
Figure 6-4: Distribution of Logical Relationships in SXXX's A-CEG
Figure 6-5: Number of Detected Faults vs. Effort for PACS
Figure 6-6: Number of Detected Faults vs. Effort for SXXX
Figure 6-7: Distribution of Efforts in Case Study A (for PACS)
Figure 6-8: Distribution of Efforts in Case Study B (for SXXX)
Figure 7-1: Timeline for Implementing the Independent Study Project
Figure 7-2: Workflow for Extracting Rules and Indicators for Identification of A-CEG Elements
Figure 7-3: Process Used to Distill the A-CEG Construction Rules
Figure 7-4: Number of Rules Extracted from Selected SRSs
Figure 7-5: Number of Indicators Extracted from Selected SRSs
Figure 7-6: Suggested Workflow for Using the A-CEG Construction Rules
Figure 8-1: Confusion Matrix
Figure 8-2: Impact of A-CEG Construction Method on Effectiveness
Figure 8-3: Impact of A-CEG Construction Method on Efficiency
Figure 8-4: Impact of Writing Style on Effectiveness
Figure 8-5: Impact of Writing Style on Efficiency
Figure 8-6: Impact of Application Type on Effectiveness
Figure 8-7: Impact of Application Type on Efficiency
Figure Appendix C-1: Graphical Expression of PACS's A-CEG
Figure Appendix C-2: Mathematical Expression of PACS's A-CEG
Figure Appendix C-3: Graphical Expression of PACS's B-CEG
Figure Appendix C-4: Mathematical Expression of PACS's B-CEG

Chapter 1: Introduction

1.1 Research Statement

Initiating software reliability prediction earlier in the software development lifecycle is critical to the success of implementing high-quality software systems in today's fast-paced development environment, because early prediction of software reliability can help organizations make informed decisions about corrective actions in a cost-effective manner. Early estimations and predictions of software quality attributes are essential for control of software development and delivery of software products. In fact, the literature reveals that the use of early-stage software reliability models may well contribute to project success, as it enables risks and issues of concern to be detected and addressed at an early stage of the project. In particular, time spent early on making sure that requirements are correct has been observed to save much time and effort later. It has been shown many times that a bug found in the early stages of the product lifecycle is cheaper, in terms of money, effort and time, to fix than the same bug found later in the process. As programming and test techniques have improved, bugs have shifted closer to the front end of the process, to requirements and their specifications. Because they are first-in and last-out, faults originating in requirements are the costliest of all. These faults, such as wrong, incomplete, and inconsistent requirements, cause costly development cycles, delay time to market, and lower product quality. Inspection of a requirements document can detect faults at an early stage of development, improve software quality, and prevent unnecessary rework. However, existing methods for requirements fault detection are mostly informal and fairly sketchy. Therefore, any research aiming at a systematic derivation or even an automatic detection of requirements faults is of great practical importance. Studies describe what has been the industry reality for decades: the majority of software projects fail to achieve schedule and budget goals. As a result, there is an ever-increasing need in academia, industry, and government for an affordable early predictor for software projects. The purpose of such a predictor is to identify, at a very early stage, projects that are likely to be at high risk of failure.
This would enable the project stakeholders to take corrective actions before significant resources have been expended on the basis of problematic requirements. Most existing software reliability models are applicable only in the testing phase, when failure data are available. This is too late for affordably guiding corrective action to improve the quality of the software. Although some approaches have been proposed for early reliability prediction, common problems prevent these approaches from being practicable. These problems are: lack of generic applicability and scalability, over-dependence on industry-average data, such as fault content per function point, and/or neglect of product documents. In particular, Software Requirements Specification documents (SRS), the most significant documents usually available at the end of the requirements analysis phase, are neglected by these approaches due to difficulties in linking requirements-based measurement(s) to reliability. Therefore, these approaches are inevitably inadequate for providing trustworthy results. Clearly, a new approach is required to bridge the gap between requirements-based measurement(s) and reliability quantification. This probabilistic reliability prediction approach should also enable software professionals to identify problematic requirements and thereby reduce project risk.

1.2 Research Objective

The objective of the study is to develop an approach that would allow project stakeholders to determine, at a very early development stage, the problematic areas in the requirements and whether or not the project is at high risk of failure. The results of the study should be of value to the organization and project managers, as they can assess the risks of a project at an earlier stage and either mitigate the risks before it is too late to do so or cancel the project. More specifically, this study investigates how to detect problematic requirements specifications and how software reliability assessment can be achieved at the requirements analysis phase, when only limited information about the software project is available. To achieve this, the following critical questions need to be answered:

• Are there quantifiable features that can be extracted from information available in early stages and used to help predict software reliability?
• What should be measured for software reliability prediction at early development stages? What is the right data to collect and what is the right way to process the collected data?
• What are the limitations of the approach? Is the approach feasible and scalable?

1.3 Approach

This study deals with software reliability prediction on the basis of assessing the quality of the plain-text requirements specifications, which are usually available at the end of the requirements analysis phase, a very early stage of development. Our approach begins with selecting the right one out of 40 ranked software reliability measurements according to several predefined criteria, such as applicability at early stages, repeatability, and potential usability and scalability. These measurements had been ranked with respect to their ability to predict software reliability through an expert opinion elicitation process, and the ranking was partially validated in our previous research. After thoroughly analyzing its advantages, disadvantages, and other technical barriers as a software reliability measurement, we mathematically formalize the selected measurement to enhance its rigor, repeatability, and scalability. We further investigate its ability to detect problematic requirements specifications and develop a systematic method for requirements fault detection on the basis of the enhanced measurement.
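Chapter 3 defines the formal notation in full; as a rough illustration of the kind of machine-processable object the formalization produces, the sketch below models a cause-effect graph as sets of causes and effects together with Boolean relations and constraints. The class name, field layout, and the example "at least one" constraint are assumptions made for illustration only, not the dissertation's actual notation.

```python
# Illustrative sketch only: a minimal data structure for a formalized cause-effect
# graph (CEG), assuming it reduces to sets of causes and effects plus Boolean
# formulas and constraints. Names and the constraint encoding are hypothetical.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class CauseEffectGraph:
    causes: List[str]                                       # atomic input conditions from the SRS
    effects: Dict[str, Callable[[Dict[str, bool]], bool]]   # effect name -> Boolean function of causes
    constraints: List[Callable[[Dict[str, bool]], bool]] = field(default_factory=list)  # admissibility constraints on causes

    def evaluate(self, assignment: Dict[str, bool]) -> Dict[str, bool]:
        """Return the effect values implied by one cause assignment, if admissible."""
        if not all(c(assignment) for c in self.constraints):
            raise ValueError("assignment violates a cause constraint")
        return {name: f(assignment) for name, f in self.effects.items()}

# Toy usage: effect E1 fires when causes C1 AND C2 hold; C1 and C2 may not both be false.
ceg = CauseEffectGraph(
    causes=["C1", "C2"],
    effects={"E1": lambda a: a["C1"] and a["C2"]},
    constraints=[lambda a: a["C1"] or a["C2"]],
)
print(ceg.evaluate({"C1": True, "C2": True}))   # {'E1': True}
```

Because causes, effects, and relations are explicit data rather than prose, such a representation can be stored, compared, and processed automatically, which is what the scalability and automation arguments in later chapters rely on.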
We develop a unique automation-oriented algorithm to quantify the impact of the detected requirements faults on software reliability. The quantification algorithm is based on the use of the formalized measurement, Binary Decision Diagram (BDD) techniques, and a recursive algorithm developed in this study. Moreover, the quantification algorithm uses the detected faults themselves instead of the number of faults, because the former is believed to provide a more solid foundation for reliability quantification.
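Chapter 5 develops this algorithm in full. As a rough preview of the recursive idea only, the sketch below computes the occurrence probability of a Boolean expression over independent binary causes by Shannon decomposition, which is essentially what a traversal from a BDD's top node does; the function names, the dictionary-based operational profile, and the brute-force recursion are illustrative assumptions, not the dissertation's algorithm.

```python
# Minimal sketch (not the dissertation's algorithm): probability that a Boolean
# expression over independent binary causes evaluates to True, via recursive
# Shannon decomposition. `profile` maps each cause to its occurrence probability
# under the operational profile (OP).
from typing import Callable, Dict, List

def occurrence_probability(expr: Callable[[Dict[str, bool]], bool],
                           causes: List[str],
                           profile: Dict[str, float],
                           assignment: Dict[str, bool] = None) -> float:
    assignment = dict(assignment or {})
    remaining = [c for c in causes if c not in assignment]
    if not remaining:                      # all causes fixed: the expression is 0 or 1
        return 1.0 if expr(assignment) else 0.0
    c = remaining[0]                       # branch on the next cause, like a BDD node
    p = profile[c]
    hi = occurrence_probability(expr, causes, profile, {**assignment, c: True})
    lo = occurrence_probability(expr, causes, profile, {**assignment, c: False})
    return p * hi + (1.0 - p) * lo

# Toy usage: failure-relevant inputs are those satisfying (C1 AND C2) OR C3.
f = lambda a: (a["C1"] and a["C2"]) or a["C3"]
print(occurrence_probability(f, ["C1", "C2", "C3"], {"C1": 0.5, "C2": 0.4, "C3": 0.1}))
# 0.5*0.4 + 0.1 - 0.5*0.4*0.1 = 0.28
```

A BDD-based implementation caches shared sub-expressions so the computation does not enumerate all cause assignments as this toy version does; that is the role of the BDD techniques and the recursive algorithm referred to above.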
We also introduce artifacts and develop techniques to enable the automation of the quantification algorithm. We then apply our approach to two real applications, one smaller and the other larger, to examine the feasibility and scalability of the proposed techniques for detecting SRS faults and predicting reliability at the requirements analysis phase. After the feasibility is clearly confirmed, we focus on identifying the scalability bottlenecks in our approach. We employ an empirical-study approach to gain insight into the nature of the technical barriers to scalability, because quantitative research requires large sample sizes and such sampling is not feasible for this study. We collect and distill the patterns observed in the empirical study and develop rule-based methods to overcome these barriers. The influencing factors are identified and analyzed as well. Finally, a controlled experiment is conducted to evaluate the usability of the proposed techniques/methods addressing the technical barriers. Due to the lack of sufficient resources to reliably test the effects of all identified influencing factors, we statistically verify the impact of only two influencing factors on the use of the proposed techniques/methods, although the impact of other factors could be significant.

1.4 Content

The rest of this dissertation is organized as follows. In Chapter 2 we discuss the background and research related to this study, and provide readers with the details of how we selected a software reliability measurement for this study. Chapter 3 focuses on exploring the advantages and disadvantages of the selected measurement. Several attempts to enhance the measurement towards a scalable software reliability prediction technique are discussed. In particular, the mathematical expression of the measurement is defined in terms of well-understood mathematical entities, such as sets and Boolean formulas, whose semantics are formally defined and can be easily stored and processed by computers. At the end of that chapter, we illustrate the use of the enhanced measurement with a simple example. Chapter 4 introduces the concept of Software Requirements Specifications (SRS), the attributes of a "good" SRS, commonly seen SRS faults, and existing techniques for SRS fault detection. The remainder of that chapter discusses the disciplined methods we propose for systematically detecting faults in natural language SRSs. Chapter 5 describes the unique automation-oriented algorithm proposed for quantifying the impact of the detected SRS faults on software reliability. This algorithm is based on the formalized measurement and is applicable at the requirements analysis stage as well as at other development stages. Chapter 6 reports the procedure, results, and analysis of two case studies, Case Study A and Case Study B, which were conducted to evaluate the feasibility and scalability of the proposed CEGA techniques for quantification of software reliability at the requirements analysis stage. Chapter 7 presents the objectives, procedure, detailed findings and analysis pertinent to Empirical Study C, which focuses on gaining insight into the nature of the scalability barriers identified in our approach. A set of rules attempting to overcome the scalability barriers is also proposed and presented in that chapter. Chapter 8 provides the pertinent information about a small-scale controlled experiment, called Experiment D. This experiment aims at comparing and evaluating how well the rule set proposed in Chapter 7 performs in comparison to other methods, and at investigating whether the rule set succeeds in its goal of providing the same or improved benefits, at what cost, and under what circumstances it makes the most sense. Hypotheses about the impact of two factors (the SRS' writing style and the SRS' application type) on the effectiveness and efficiency of using the rule set are formulated and tested. In Chapter 9, we discuss restrictions and limitations on the use of our approach, conclude the study, and propose some suggestions for future research.

Please be aware that the theories and methodologies developed and presented in this dissertation are part of the University of Maryland Invention Disclosure No. IS-2007-114 (November 2007), titled "Cause Effect Graphing Analysis of Software Requirements Specifications for Early Software Reliability Prediction", Copyright © 2007 University of Maryland, all rights reserved.

1.5 Summary of Contributions

The significant contributions of this study are as follows:

1. Development of a method for the systematic detection of faults in natural language requirements. This method is based on an existing technique originally used for black-box testing. This study thoroughly discusses the technique to compensate for the obvious absence of review papers in this area, mathematically formalizes it, and enhances its rigor, repeatability, and scalability towards a scalable software reliability measurement applicable to the analysis of natural language requirements. The enhanced technique is further extended into a new method capable of systematically detecting faults in requirements. This method allows software project stakeholders to identify the problematic areas in the requirements at a very early development stage. Moreover, it overcomes the shortcomings of other techniques that fail to ensure complete coverage of functional requirements.

2. Development of a method for quantifying the impact of detected faults on software reliability. This is the first method of its kind in the literature. Starting from this method, software project stakeholders can determine at a very early development stage whether or not the project is at high risk of failure, while only limited information about the software project is available. They can assess the risks of a project and either mitigate the risks before it is too late to do so or cancel the project. This method can be easily adapted to computer processing and automation.
3. Feasibility and scalability assessment of the early-stage reliability prediction approach. This study also addresses the feasibility and scalability aspects of modeling natural language functional requirements based on the formalized measurement. The nature of the technical barriers to scalability is explored, and rule-based methods are developed to overcome these barriers. The impact of the writing style and application type (domain) of the requirements specifications on the effectiveness and efficiency of using the formalized measurement is statistically verified.

Chapter 2: Background and Related Work

2.1 Definitions

Within the software engineering community, there is much inconsistency and confusion over the use of the terms bug, error, defect, fault, failure, measure, metric, and measurement. Please be aware that this study follows the definitions of IEEE Std. 610.12-1990 [1] and IEEE Std. 1061-1998 [2] when using these terms. It is first necessary to define some of the terms used in this dissertation:

• Software reliability: defined as "the probability of failure-free software operation for a specified period of time in a specified environment" [3] (restated in the short formula block after these definitions). By this definition, software reliability is a strictly operational quality attribute. Although researchers have come up with models relating the two, software reliability is inherently not a function of time [4].
• Error: a human action that produces an incorrect result [1].
• Fault (also known as bug or defect): a flaw in a component or system that can cause the component or system to fail to perform its required function, e.g., an incorrect statement or data definition. A fault is a manifestation of an error in software. A fault, if encountered during execution, may cause a failure of the component or system [1].
• Failure: the inability of a system or component to perform its required functions within specified performance requirements [1].

The relationship among software error, fault, and failure is illustrated in Figure 2-1.

Figure 2-1: Relationship among the Error, Fault, and Failure

• Measure: a way to ascertain or appraise value by comparing it to a norm; to apply a metric [2].
• Metric: a quantitative measure of the degree to which a system, component, or process possesses a given attribute [1].
• Measurement: the act or process of assigning a number or category to an entity to describe an attribute of that entity; a figure, extent, or amount obtained by measuring [2].
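To fix notation, the operational definition of software reliability can be restated as follows. This is an illustrative formulation consistent with the input-domain view adopted later in this dissertation (Chapter 5), not a quotation of its formulas: writing D for the input domain, F ⊆ D for the failure-relevant inputs, and p(x) for the occurrence probability of input x under the operational profile,

```latex
R(t) = \Pr(\text{no failure in } [0, t] \text{ in the specified environment}),
\qquad
R = 1 - \sum_{x \in F} p(x),
```

where the second expression gives the per-demand reliability obtained by weighting failure-relevant inputs by their occurrence probabilities rather than by treating reliability as a function of time.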
The magnitude of costs involved in software development and maintenance magnifies the need for a scientific foundation to support programming standards and management decisions by measurement. Naturally, software reliability measurement has become essential to quality-assured software engineering [7]. Unfortunately, measuring and ensuring software reliability is no easy task. The high complexity of software is the major contributing factor of software reliability problems [3]. As hard as the problem is, promising progresses are still being made toward more reliable software. More standard components and better process are introduced in the software engineering field. However, until now, we still have no good way of measuring software reliability. Actually, reliability measurement in software is still in its infancy [3]. This is because: ? We do not have a good understanding of the nature of software. ? We cannot find a suitable way to measure software reliability, and most of the aspects related to software reliability. ? Software reliability cannot be directly measured, so other related factors are measured to estimate software reliability and compare it among products. However, even the most obvious product metrics such as software size have not uniform definition. 12 ? Even though researchers agree that development process, faults and failures found are all factors related to software reliability, no good quantitative methods have been developed to represent software reliability without excessive limitations. 2.3 Reliability Measurement vs. Development Phases A software project is made up of series of development phases. Broadly, most software projects are comprised of the following phases [8]: 1. Requirements analysis: This first step is also the most important, because it involves gathering information about what the customer needs and defining, in the clearest possible terms, the problem that the product is expected to solve. Analysis includes understanding the customer's business context and constraints, the functions the product must perform, the performance levels it must adhere to, and the external systems it must be compatible with. Techniques used to obtain this understanding include customer interviews, use cases, and "shopping lists" of software features. The results of the analysis are typically captured in a formal Software Requirements Specification document (SRS), which serves as input to the next step. Proper requirements and specifications are critical for having a successful project. Removing faults at this phase can reduce the cost as much as faults found in the Design phase. 2. Design: This step consists of defining the hardware and software architecture, specifying performance and security parameters, designing data storage containers and constraints, choosing the Integrated Development Environment (IDE) and programming language, and indicating strategies to deal with issues 13 such as exception handling, resource management and interface connectivity. This is also the stage at which user interface design is addressed, including issues relating to navigation and accessibility. The output of this stage is one or more design specifications, which are used in the next stage of implementation. 3. Implementation: This step consists of actually constructing the product as per the design specification(s) developed in the previous step. 
Typically, this step is performed by a development team consisting of programmers, interface designers and other specialists, using tools such as compilers, debuggers, interpreters and media editors. The output of this step is one or more product components, built according to a pre-defined coding standard and debugged, tested and integrated to satisfy the system architecture requirements.

4. Testing: In this stage, both individual components and the integrated whole are methodically verified to ensure that they are fault-free and fully meet the requirements outlined in the first step. An independent quality assurance team defines "test cases" to evaluate whether the product fully or partially satisfies the requirements outlined in the first step. Three types of testing typically take place: unit testing of individual code modules; system testing of the integrated product; and acceptance testing, formally conducted by or on behalf of the customer. Faults, if found, are logged and fed back to the implementation team to enable correction. This is also the stage at which product documentation, such as a user manual, is prepared, reviewed and published.

5. Operation: This step occurs once the product has been tested and certified as fit for use, and involves preparing the system or product for installation and use at the customer site. Delivery may take place via the Internet or physical media, and the deliverable is typically tagged with a formal revision number to facilitate updates at a later date.

6. Maintenance: This step occurs after installation, and involves making modifications to the system or an individual component to alter attributes or improve performance. These modifications arise either from change requests initiated by the customer or from faults uncovered during live use of the system. Typically, every change made to the product during the maintenance cycle is recorded and a new product release is issued to enable the customer to gain the benefit of the update.

Measurement of both the product and the development process has long been recognized as a critical activity for successful software development [4]. Good measurement practices and data enable realistic project planning, timely monitoring of project progress and status, identification of project risks, and effective process improvement. Appropriate measurements and indicators of software artifacts such as requirements, designs, and source code can be analyzed to diagnose problems and identify solutions during project execution and to reduce faults, rework (effort, resources, etc.), and cycle time. These practices enable organizations to achieve higher quality products and reflect more mature processes, as delineated by the Capability Maturity Model Integration (CMMI®) [9]. The relationship between the software development process and reliability measurement is depicted in Figure 2-2.

Figure 2-2: Software Reliability Measurement vs. Development Process

2.4 Brief Taxonomy of Software Reliability Measurement Models

The current practice of software reliability measurement includes two types of activity: reliability estimation and reliability prediction [3]:

• Reliability estimation: This activity determines current software reliability by applying statistical inference techniques to failure data obtained during system test or system operation (a toy illustration follows this list). It is a measure of the reliability achieved from the past up to the current point. Its main purpose is to assess the current reliability and to determine whether a reliability model is a good fit in retrospect.

• Reliability prediction: This activity determines future software reliability based on available software metrics and measures. Depending on the software development stage, prediction involves different techniques:

  o When failure data are available (e.g., the software is in the system test or operation stage), estimation techniques can be used to parameterize and verify software reliability models, which can then perform future reliability prediction.

  o When failure data are not available (e.g., the software is in the design stage), the metrics obtained from the software development process and the characteristics of the resulting product(s) can be used to determine the reliability of the software upon testing or delivery. This is usually called "early prediction".
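As a toy illustration of the estimation activity only (not of any particular model discussed in this dissertation), the sketch below fits a constant failure rate to observed inter-failure times by maximum likelihood and evaluates the implied probability of failure-free operation over a mission time. The exponential assumption, the data, and all names are illustrative.

```python
# Toy illustration of reliability *estimation*: fit a constant failure rate to
# observed inter-failure times (maximum-likelihood estimate for an exponential
# model) and compute the implied probability of failure-free operation over t.
import math
from typing import List

def estimate_failure_rate(interfailure_times: List[float]) -> float:
    """MLE of the failure rate: observed failures divided by total exposure time."""
    return len(interfailure_times) / sum(interfailure_times)

def reliability(t: float, failure_rate: float) -> float:
    """R(t) = exp(-lambda * t) under a constant failure rate."""
    return math.exp(-failure_rate * t)

# Example: inter-failure times (hours) recorded during system test.
times = [12.0, 30.0, 45.0, 18.0, 60.0]
lam = estimate_failure_rate(times)          # about 0.03 failures per hour
print(f"estimated failure rate: {lam:.4f} per hour")
print(f"estimated R(24h): {reliability(24.0, lam):.3f}")
```

Early prediction, by contrast, must work without such failure data, which is why it relies on process metrics and product characteristics such as the requirements specification itself.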
Reliability estimation: This activity determines current software reliability by applying statistical inference techniques to failure data obtained during system test or during system operation. This is a measure regarding the achieved reliability from the past until the current point. Its main purpose is to assess the current reliability, and determine whether a reliability model is a good fit in retrospect. ? Reliability prediction: This activity determines future software reliability based on available software metrics and measures. Depending on the software development stage, prediction involves different techniques: 16 o When failure data are available (e.g., software is in system test or operation stage), the estimation techniques can be used to parameterize and verify software reliability models, which can perform future reliability prediction. o When failure data are not available (e.g., software is in the design stage), the metrics obtained from the software development process and the characteristics of the resulting product(s) can be used to determine reliability of the software upon testing or delivery. This is usually called ?early prediction?. Fenton [10] classified software metrics into three main categories: product, process, and project metrics: ? Product metrics are those that describe characteristics of the software development life cycle processes outputs such as requirements specifications documents, design diagrams, source code, and executable programs. Examples of classical product oriented metrics are McCabe?s Cyclomatic Complexity, Line of Code (LOC), and Mean Time To Failure (MTTF) [11][12]. ? Process metrics quantify attributes of the development process and of the development environment. Research has demonstrated that a relationship exists between the development process and the ability to complete projects on time and within the desired quality objectives [13]. Higher reliability can be achieved by using better development process, risk management process, 17 configuration management process, etc. Therefore, process metrics, such as the SEI Software CMM level, were also used to estimate, monitor and improve the reliability and quality of software. ? Project metrics are those that describe the available resources characteristics, for instance, the number of developers and their skills. These metric are rarely used in the field of software reliability measurement. Figure 2-3 shows a brief taxonomy of software reliability models. Most of the existing software reliability models fall in the estimation category. Figure 2-3: Brief Taxonomy of Software Reliability Measurement Models This study focuses on developing product-based methods for early software reliability measurement. 2.5 Why Early Reliability Measurement is Necessary Studies support the following claims: ? The majority of software projects fail to achieve schedule and budget goals. ? The majority of faults have their root cause in poorly defined requirements. 18 ? The cost of fixing a software fault is lowest in the requirements phase. 2.5.1 Majority of Software Projects Failed to Achieve Schedule and Budget Goal Studies describe what has been the industry reality for decades: the majority of software projects failed to achieve schedule and budget goals. A summary of 1995 Department of Defense (DoD) software spending [14] is shown in Figure 2-4. As indicated, of the $35.7 billion spent by the DoD for software development, only 2 percent of the software was able to be used as delivered. 
The vast majority, 75 percent, of the software was either never used or was cancelled prior to delivery. The remaining 23 percent of the software was used following modification. Figure 2-4: Outcomes of Department of Defense Software Spending A similar study conducted by the Standish Group [15] on non-DoD software projects in 1994 produced very similar results. In over 8,000 projects conducted by 350 companies, 28% of projects are failures, 46% are challenged, and only 26 percent of the projects were considered successful. Poor software quality is a primary factor behind many failures, and often results in massive rework of application scope, design and code [15]. Such rework extends Software delivered, but not successfully used 46% Software paid for, but not delivered 29% Software used, but extensively reworked or abandoned 20%Software used after changed 3% Software used as delivered 2% 19 release cycles and consumes significant additional budget. Aside from the time and money spent for application rework and increased help desk support, business reputation and market position can also be compromised. To reduce software failures, it is imperative that we better understand the quality initiatives behind the products being developed for today?s global economy. 2.5.2 Requirements are the Root of Many Problems There is strong evidence that early stages of the system development life cycle are especially prone to faults. Confusion, misunderstanding, and frustration relative to requirements are major risks to the success of any software project. Inspection statistics for NASA shuttle software showed that the density of major faults found during requirements inspections was seven times higher than during code inspections [16]. In a study of a US Air Force project by Sheldon [17], faults were classified by source. It was found that requirements faults comprised 41% of the faults discovered, while logic design faults made up only 28% of the total fault count. Other case studies back this result as well. For example, a study by James Martin [18] reported that over half of all project faults could be traced to faults made during the requirements stage as indicated in Figure 2-5 (adapted from [18]). Further, the study stated that approximately 50 percent of requirements faults were the result of poorly written, ambiguous, unclear and incorrect requirements. The other 50 percent of requirements faults could be attributed to incompleteness of specification (i.e. requirements that were simply omitted.) 20 Figure 2-5: Distribution of Faults in Software Projects Other statistics demonstrated similar problems: ? 70-85 percent of application rework was related to faults in requirements [18] ? 44% of projects were cancelled due to problems with requirements [18] ? 54% of initial project requirements were actually realized [15] ? 45% of realized requirements ended up actually being used [15] A survey of the Standish Group [15] also found that of the eight main reasons given for project failures, five were requirements related, as presented in Figure 2-6 (adapted from [15]). These were ?incomplete requirements?, ?lack of user involvement?, ?unrealistic user expectations?, ?requirements keep changing?, and ?system no longer needed?. 
Figure 2-6: Distribution of Failure Causes of 8,000+ Projects (incomplete requirements: 17%; lack of user involvement: 16%; inadequate resources: 13%; unrealistic user expectations: 12%; lack of management support: 12%; requirements keep changing: 11%; inadequate planning: 10%; system no longer needed: 9%)
More recently, an analysis of the data gathered by the Software Engineering Institute (SEI) on 451 Capability Maturity Model (CMM) Level 1 CMM-Based Assessments for Internal Process Improvement conducted from 1997 through August 2001 indicated that requirements continued to be a problem [19]. Getting the requirements right is probably the single most important thing that can be done to achieve customer satisfaction.
2.5.3 Faults Cost Less when Detected and Fixed in Early Stages of Development
The importance of requirements is further emphasized by Figure 2-7 (adapted from [20]), which depicts the distribution of the effort needed to fix faults [20]. It can be clearly seen that the bulk of the effort (82%) is attributed to fixing requirements faults.
Figure 2-7: Distribution of Effort to Fix Faults (requirements: 82%; design: 13%; code: 1%; other: 4%)
As accepted by the majority of practitioners, the cost of fixing a software fault is lowest in the requirements phase. As the project moves into subsequent phases of software development, the cost of fixing a fault rises dramatically, since more deliverables are affected by the correction of each fault, such as a design document or source code. The earlier a fault is detected, the less damage it can do to the system, because there are very few deliverables to correct. According to industry data, the cost of detecting and removing a fault introduced during the earlier phases of the software development life cycle increases almost exponentially as the project progresses through the development life cycle (see Figure 2-8, excerpted from [21]).
Figure 2-8: Industry Standard Cost Ratio to Fix a Defect
McConnell [22] estimated that "a requirements fault that is left undetected until construction or maintenance will cost 50 to 200 times as much to fix as it would have cost to fix at requirements time." Other studies, furthermore, show that requirements faults are between 10 and 100 times more costly to fix during later phases of the software life cycle than during the requirements phase itself. Let us assign a unit cost of one ("1X") to the effort required to detect and repair a fault during the requirements stage. The same fault, if not found until integration testing or production, will cost hundreds or even thousands of times more (see Figure 2-9, adapted from [13]).
Figure 2-9: Cost Ratio vs. Development Phases in Which Faults are Found (requirements: 1X; design: 3-6X; coding: 10X; unit testing: 15-40X; acceptance testing: 30-70X; production: 40-1000X)
The reason for this large difference is that many of these faults are not detected until well after they have been made. This delay in fault discovery means that the cost to repair includes both the cost to correct the offending fault and the cost to correct the subsequent work built on that fault during later phases. These investments include the cost of redesigning and replacing code, the cost of rewriting documentation, and the cost of reworking or replacing software in the field. Indeed, the key issue is scrap and rework. If a fault was introduced while coding, one can simply fix the code and re-compile.
However, if a fault has its roots in poor requirements and is not discovered until integration testing, then one must re-do the requirements, re-do the design, re-do the code, re-do the tests, re-do the user documentation, and re-do the training materials. It is all this "re-do" work that sends projects over budget and over schedule. This claim is supported by many studies. For instance, in a study performed at Raytheon, Dion [23] reported that approximately 40% of the total project budget was spent on rework. Other studies [24] indicate that for the majority of companies today, rework contributes between 30% and 40% of total project costs. Because of their large number, and the multiplying effect, finding and fixing requirements faults consumes between 70% and 85% of total project rework costs.
Faults are introduced in various stages of the development process, as shown in Figure 2-10 (excerpted from [25]). This figure shows that faults which originate in early stages can have a lasting influence on the quality of a system: they are the earliest to invade the system and the last to leave, if not fixed. This is called the "fault summation effect" [25], which explains why requirements faults are usually more expensive to detect and fix than faults introduced in later development phases.
Figure 2-10: Summation Effect of Faults
The role of software has shifted from simply generating financial or other mathematical data to monitoring and controlling equipment that directly affects human life and safety. Software's increasing role creates both a requirement for being able to trust it more than before and a need for more people to know how much they can trust their software products. As a result, methods used to achieve, predict, and assess the safety and reliability of software are strongly needed in academia, industry, and government. This is all the more true as many legal issues related to software liability are evolving [26].
Different parts of the software-related industry and society face different challenges. For engineers and managers involved in the development of software systems, there is a strong need for early indicators, such as reliability, so that actions can be taken early to reduce cost and prevent disasters. For regulators and policy makers involved in the certification of software systems, practical methods and tools are needed to quantitatively assess the quality of software products, including requirements specifications, design documents, delivered source code, and user manuals [26]. Clearly, current software engineering suffers from problematic requirements specifications. Mature, well-defined, and quantitative assessment methods for the reliability of software products are not generally applicable until later life cycle phases. Most engineering methods remain qualitative and depend heavily on engineering judgment during the requirements phase. Therefore, the need to develop better software requirements engineering techniques is urgent [16].
2.6 Virtues of Early Software Reliability Measurement
First, the advantage of early software reliability measurement is simple economics. Requirements faults are a major source of project failures and the most expensive ones to fix (the cost ratios of Figure 2-9 make this concrete, as the short sketch below illustrates).
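The cost ratios in Figure 2-9 can be turned into a simple back-of-the-envelope calculation. The following minimal Python sketch is purely illustrative: the mid-range multipliers and the 2 person-hour base effort are editorial assumptions, not data from the cited studies.

```python
# Illustrative sketch: escalate the cost of fixing a requirements fault by the
# phase cost ratios of Figure 2-9. The multipliers are mid-range values read
# off the figure and are assumptions chosen for illustration only.

COST_MULTIPLIER = {            # relative to a fix made in the requirements phase
    "requirements": 1,
    "design": 5,               # figure shows 3-6X
    "coding": 10,
    "unit testing": 30,        # figure shows 15-40X
    "acceptance testing": 50,  # figure shows 30-70X
    "production": 100,         # figure shows 40-1000X; 100X is a conservative pick
}

def relative_fix_cost(base_cost_hours: float, phase_found: str) -> float:
    """Escalate the requirements-phase fix effort by the phase cost ratio."""
    return base_cost_hours * COST_MULTIPLIER[phase_found]

if __name__ == "__main__":
    base = 2.0  # hypothetical: 2 person-hours to fix the fault at requirements time
    for phase in COST_MULTIPLIER:
        print(f"{phase:>20}: {relative_fix_cost(base, phase):8.1f} person-hours")
```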
Therefore, detection and removal of requirement faults in the early stage of the life cycle will significantly improve the quality of the product in a cost-effective manner. With the cost of some systems exceeding tens or even hundreds of millions of dollars and with development duration of more than 12 to 18 months, early reliability measurement can significantly contribute to the success or early rational cancellation of the project [27]. 27 Secondly, early software reliability measurement provides a solid foundation to perform meaningful tradeoff studies at project start. If software reliability measurement is performed early in the software life cycle, it is possible to determine what improvement, if any, can be made to the software methods, techniques, or organizational structure. Thirdly, with recent strong emphasis on speed of development, the decisions made on the basis of early reliability estimation can have the greatest impact on schedules of software projects [27]. It was observed that early defect detection could significantly shorten the schedule, as shown in Figure 2-11 (Excerpted from [28]). This is because the future rework is minimized if requirements faults are detected and removed during early stages of software development [28]. Figure 2-11: Development Schedule with/without Early Fault Detection 2.7 Previous Work on Early Reliability Measurement Early software reliability measurement has attracted great interest from software practitioners and researchers since the early 1990?s. However, quantifying software 28 reliability in an early stage has been a difficult research subject that many researchers have attempted to solve with limited success [4]. Traditional software-reliability prediction methods such as reliability growth models base estimates on observing failures (and fixing faults) in validation testing, during which operational patterns represent the product?s actual field use. Unfortunately, in early developmental stages of software, failure data is not available to determine the reliability of software. Therefore, although many techniques and models have been developed, only a few can be applied in early development stages, e.g. design phase, before an executable version of the software system is available. This is because only those methods/models that can provide a reasonable estimation without the need of any actual failure data are applicable in early development stages. The pioneering early-stage reliability measurement models proposed in the early 1990?s include: Gaffney and Davis? phase-based model [29], Agresti and Evanco?s Ada software defects model [30], and the US Air Force?s Rome Lab model [31]. The basic philosophy of these early-phase models is to obtain as much information as possible. This type of approach is referred to as the ?white box? approach, which requires detailed information usually not available in most cases. For instance, the US Air Force Rome Lab model consists of nine factors that are used to predict the fault density of the software application. There are parameters in this estimation model that have tradeoff capability (maximum/minimum predicted values). The analyst can determine where some changes can be made in the software engineering process or product to achieve improved fault-density estimation. However, this tradeoff is valuable only if the analyst has knowledge of the software development process. 
Smidts et al. (1997) [32][33] proposed an architecture-based software reliability model to predict software reliability based on a systematic identification of software process failure modes and their likelihoods. A direct consequence of the approach and its supporting data collection efforts is the identification of weak areas in the software development process. The authors believed that the key characteristics of the approach are applicable to other software development life cycles and phases. However, it is unclear how difficult the implementation of the approach would be, and how accurate the predictions would be. Yin et al. (2000) [34] addressed early-stage, system-level software reliability modeling for large-scale software products using a hierarchical description and Petri net mechanisms. The Petri net modeling techniques were proposed for handling the dependencies among software modules. This approach requires only a minimum amount of information, which is most likely to be available in early development stages. However, creating a Petri net model for software modules can be fairly complex, especially for large-scale programs. Zhao (2003) [35] addressed software reliability modeling issues in the early stage of development for a fault-tolerant software management system. Based on Stochastic Reward Nets, a hierarchical model of the fault-tolerant software management system was put forward, and an approach based on transient system performance analysis was adopted. Tripathi and Mall (2005) [36] developed a model based on Reliability Block Diagrams (RBD) for representing real-world problems and an algorithm for analyzing these models in the early phases of software development. Their simulation results show that reliability prediction of subsystems is a good quality indicator and that coupling can be correlated with system reliability, which can be used for system design assessment. By assuming the same failure rate for two similar projects, Hu (2006) [37] suggested "reusing" failure data from previous releases or similar projects with ANN models to improve early reliability prediction for the current project or release. Better prediction performance was observed in the early phases of testing compared with the original ANN model without failure data reuse. Mei (2007) [38] investigated an approach that uses past fault-related data with a wavelet network model to improve reliability predictions in the early testing phase. The wavelet-network-based model captures the input-output (I/O) relationships between the software system and the corresponding faults to improve the accuracy of the reliability prediction. Numerical examples were presented using both actual and simulated datasets, and the analysis shows that the proposed approach works effectively in the early phase of software testing. More recently, Cheung et al. (2008) [39] presented a framework for predicting the reliability of software components during the architectural design phase by exploiting architectural models and associated analysis techniques, stochastic modeling approaches, and information sources available early in the development life cycle. The authors acknowledged that the scalability of their reliability prediction techniques at the system level remains a challenge and that further investigation is needed. Our previous research [40] (in press) proposed an approach that estimates the fault content based on data collected by Software Productivity Research Inc.
[41] 31 that links the SEI CMM level to the number of faults per function points. The probability of success per demand is obtained using Musa's exponential model. However, the value of a critical parameter (called fault exposure ratio) in Musa's model was found outdated and incorrect by orders of magnitude in particular for safety critical applications [40]. Common problems with these existing approaches are: lack of generic applicability and scalability, over-dependence on industry-average data, such as faults content per function point, and/or ignorance of product documents generated at early development phase. In particular, Software Requirements Specifications documents (SRS), the most significant documents usually available at the end of requirements phase, are neglected by these approaches due to difficulties in linking requirements- based measurement(s) to reliability. Therefore, they are inevitably unsuited to provide trustworthy results. Apparently, a new approach is required to bridge the gap between requirements- based measurement(s) and reliability quantification. This probabilistic-reliability prediction approach should also enable software professionals to identify problematic requirements to reduce the risks of software projects. 2.8 Selecting Software Measurement for Early Reliability Assessment There exist more than 200 software measurements [42]. To predict software reliability at the end of the requirements stage with limited information about a system at hand, appropriate measurement(s) need to be selected before methods/models can be developed to bridge the gap between the measurement and the reliability prediction. 32 The desired measurement should possess the following characteristics: ? applicable by the end of the requirements phase; ? involving the use of formal logic and abstract modeling to state the requirements in a clear, precise, and unambiguous format to facilitate the communication among project stakeholders, including domain experts, manager, end users, and developers; ? capable of identifying requirements faults in a systematic way; ? easy to use for all project stakeholders, not just for those with special mathematical training; ? scalable for large/complex applications. Based on a list of 78 measurements identified in a study conducted by Lawrence Livermore National Laboratory [43], the University of Maryland [12][7] reduced it to 30 (later extended it to 40) and systematically ranked these measurements with respect to their ability at predicting software reliability through expert opinion elicitation process. These measurements were classified into three categories: high- ranked, medium-ranked and low-ranked. This ranking was partially validated through two experiments [40] [44]. Table 2-1 presents the phase-based applicability and ranking classification of these measurements. In our previous research [40], we found that among the measurements listed in Table 2-1, cause-effect graphing analysis (also called cause-effect graphing) was the most promising candidate and was thereby selected in this study, even though it was ranked as ?medium? in the earlier study [12]. The two primary disadvantages keeping it from being widely used in the field of software reliability prediction were [26]: 33 1) no specific process was defined for the measurement; 2) no method was developed to link this measurement to software reliability. 
This study addresses these two primary issues along with others, such as usability and scalability, to enable software project stakeholders to effectively detect requirements faults and predict software reliability at the requirements analysis stage. Table 2-1: Phase-based Applicability and Ranking Classification of 40 Software Reliability Measurements1 Index Measure Applicable Development Phase(s) Ranking Class Requirement Design Implementation Testing 1 Bugs per line of code (Gaffney) Low 2 Cause-effect graphing Medium 3 Class coupling Medium 4 Class hierarchy nesting level Medium 5 Code defect density High 6 Cohesion Low 7 Completeness Low 8 Coverage factor High 9 Cumulative failure profile High 10 Cyclomatic complexity Medium 11 Data flow complexity Medium 12 Design defect density High 13 Error distribution High 14 Failure rate High 15 Fault density High 16 Fault-days number High 1 Table legend: = applicable ; = not applicable. 34 Index Measure Applicable Development Phase(s) Ranking Class Requirement Design Implementation Testing 17 Feature point analysis Low 18 Full function point Low 19 Function point analysis Low 20 Functional test coverage Medium 21 Graph-theoretic static architecture complexity Low 22 Lack of cohesion in methods (LCOM) Medium 23 Man hours per major defect detected Medium 24 Mean time to failure High 25 Minimal unit test case determination Medium 26 Modular test coverage Medium 27 Mutation score Medium 28 Mutation testing (error seeding) Low 29 Number of children (NOC) Medium 30 Number of class methods Medium 31 Number of faults remaining (error seeding) Medium 32 Number of key classes Medium 33 Requirements compliance Low 34 Requirements specification change requests Medium 35 Requirements traceability Medium 36 Reviews, inspections and walkthroughs Medium 37 Software capability maturity model Medium 38 System design complexity Medium 39 Test coverage Medium 40 Weighted method per class WMC) Medium 35 Chapter 3: Formalization of Cause-Effect Graphing Analysis as a Software Reliability Measurement 3.1 What is Cause-Effect Graphing Analysis (CEGA) The Cause-Effect Graphing Analysis technique was originally proposed by Elmendorf [45] to design the necessary and sufficient set of test cases that cover 100 percent of the functional requirements by the use of a mathematically rigorous algorithm. The Cause-Effect Graphing Analysis (CEGA) is the process of transforming specifications into a graphical representation, called a cause effect graph. CEGA has a proven beneficial side effect which is to point out incompleteness and ambiguities in specifications as a result of developing cause effect graphs [45] [46][47][48]. A Cause Effect Graph (CEG) is a visual and formal language into which a natural language specification is translated. More precisely, a CEG is a Boolean graph describing the semantic content of a written functional specification as logical relationships between causes (inputs or stimuli) and effects (outputs). It consists of causes, effects, and graphical notations expressing logical relationships and constraints among causes and effects. The logical operators include ?IDENTITY?, ?AND?, ?OR?, and ?NOT?. The basic notation for the CEG logical relationships is shown in Figure 3-1. In most systems, certain combinations of causes are impossible because of syntactic or environmental considerations. To account for these, the notation in Figure 3-2 is used. 
The EXCLUSIVE constraint states that it must always be true that at most one of c1, c2, …, and ck can be "1". The INCLUSIVE constraint states that at least one of c1, c2, …, and ck must always be "1" (c1, c2, …, and ck cannot be "0" simultaneously). The ONE-AND-ONLY-ONE constraint states that one and only one of c1, c2, …, and ck must be "1". The REQUIRE constraint states that for c1 to be "1", c2 must be "1" (i.e., it is impossible for c1 to be "1" and c2 to be "0"). In addition, there is frequently a need for constraints among effects. The MASK constraint in Figure 3-2 states that if effect e1 is "1", effect e2 is forced to "0".
Figure 3-1: Symbols of Basic CEG Logical Relationships
Figure 3-2: Symbols of CEG Constraints
In general, the following steps are taken to derive test cases using CEGA [49]:
1) A cause-effect graph is developed on the basis of the requirements specification.
2) The graph is then converted to a decision table (also called a "limited-entry decision table").
3) Finally, the decision table is converted to test cases by applying certain rules.
This is why CEGA is usually called decision table testing when it is used for test case design. This study does not discuss how to create test cases using CEGA. Instead, the interested reader is referred to [45][48][50] for further information.
3.2 Construction of CEG
In general, the following process is used to construct a CEG for a Software Requirements Specification document (SRS) [45]:
1) Divide the SRS into multiple workable pieces if necessary.
2) Study the SRS to identify causes and effects.
3) Assign a unique name to every cause and effect.
4) Identify all of the expressed and implied logical relationships and constraints among causes and effects.
Several tools that support CEG drawing are commercially available [48]. BenderRBT [51], for instance, allows project teams to quickly create cause-effect graphs, complete with node relationships and constraints, through its add-ons for Microsoft Office Visio. When the cause-effect graph is completed, users can invoke BenderRBT to design test cases based on the requirements depicted in the graph. Myers [45, pp. 65-88] provided general CEG construction guidelines that are widely used in industry. However, no specific rules were found in the literature on how to identify the elements of a CEG from an SRS, including the causes, effects, logical relationships, and constraints. We will revisit the topic of CEG construction in Chapter 7 and present our attempt to address this issue.
3.3 CEGA as a Software Reliability Measurement
CEGA is also recognized as a software reliability measurement. According to [42], CEGA "aids in identifying requirements that are incomplete and ambiguous", "explores the inputs and expected outputs of a program and identifies the ambiguities", and "once these ambiguities are eliminated, the specifications are considered complete and consistent". The measure of CEGA is defined as:
CE(%) = 100 × (1 − A_existing / A_total)    (Eq. 3-1)
where
CE(%): the cause-effect measure
A_existing: the number of ambiguities in a program remaining to be eliminated
A_total: the total number of ambiguities identified
The value of the cause-effect measure is scaled between 0 and 100 percent. A score near 100 is considered better than a score near 0. A value near zero indicates a strong need to trace the suspected ambiguities and make any necessary change(s) in the requirements specifications.
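To make Eq. 3-1 concrete, the following minimal Python sketch computes CE(%) from ambiguity counts. The function name and the example counts are editorial assumptions for illustration, not part of the cited standard.

```python
def cause_effect_measure(a_existing: int, a_total: int) -> float:
    """Cause-effect measure CE(%) per Eq. 3-1: 100 * (1 - A_existing / A_total)."""
    if a_total <= 0:
        raise ValueError("A_total must be a positive number of identified ambiguities")
    if not 0 <= a_existing <= a_total:
        raise ValueError("A_existing must lie between 0 and A_total")
    return 100.0 * (1.0 - a_existing / a_total)

# Hypothetical example: 12 ambiguities identified in the SRS, 3 still unresolved.
print(cause_effect_measure(a_existing=3, a_total=12))  # -> 75.0
```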
As changes are made to the specifications, the incremental measure values can be plotted to show if improvements are being made and how rapidly. 39 Be aware that the value of the cause-effect measure, CE(%), is subjectively determined. In fact, there is no standard definition for requirements ambiguity. According to Le [52 p. 13], software requirements ambiguities fall into categories of indeterminacy (vagueness and generality), linguistic (lexical, syntactic, and semantic) ambiguity, and software engineering (requirement domain, application domain, system domain, and development) ambiguity. However, there appears to be no single comprehensive definition of ambiguity in the software engineering literature [52 p. 19]. Each of the following definitions highlights only some aspects of ambiguity and omits others: ? IEEE?s definition [53]: ?An SRS is unambiguous if, and only if, every requirement stated therein has only one interpretation?. ? Davis? definition [54]: ?Imagine a sentence that is extracted from an SRS, given to ten people who are asked for an interpretation. If there is more than one interpretation, then that sentence is probably ambiguous.? ? Schneider, Maritin and Tsai?s definition [55]: ?An important term, phrase, or sentence essential to an understanding of system behavior has either been left undefined or defined in a way that can cause confusion and misunderstanding. Note that these are not merely language ambiguities such as an uncertain pronoun reference, but ambiguities about the actual system and its behavior.? ? Gause and Weinberg?s definition [56]: ?Ambiguity has two sources, missing information and communication errors. Missing information has various reasons. For instance, humans make errors in observation and recall, tend to leave out self-evident and other facts, and generalize incorrectly. A 40 communication error that occurs between the author and the reader is due to general problems in the writing.? ? Kamsties?s definition [57]: ?A requirement is ambiguous if it has multiple interpretations despite the reader?s knowledge of the context. It does not matter whether the author unintentionally introduced the ambiguity, but knows what was meant, or she intentionally introduced the ambiguity to include all possible interpretations. The context is important to be taken into account, because a requirements document cannot be expected to be self- contained in a way that an arbitrary na?ve reader could understand it.? The definitions together form a complete overview of the current understanding of ambiguity in Software Engineering [52]. Unsurprisingly, the repeatability of the cause-effect measure, CE(%), is not guaranteed. The subjective or non-subjective factors, such as personal attributes and knowledge, would to some extent affect the inspector?s judgment for what is a SRS ambiguity and what is not. Therefore, it is not appropriate to use the cause-effect measure for quantitatively assessing the reliability of a software system. 3.4 Advantages and Disadvantages of CEGA CEGA is a proven versatile technique for test case design and requirements specification validation. There are several distinct advantages and disadvantages of using CEGA. The general benefits of CEGA when compared to other testing techniques are [45][46][47][48][58][59]: 41 It is a rigorous method for transforming a natural language specification into a formal language specification. 
The formal characteristics of CEGA guarantee a complete functional coverage not easily found in the state of the practice ?ad hoc manner? testing. The test cases generated can be used during all subsequent levels of testing from unit testing to system testing. CEGA begins the process of integration testing. The code modules eventually must integrate with each other. If the requirements that describe these modules cannot integrate, then the code modules cannot be expected to integrate. The cause-effect graph shows the integration of the causes and effects. The starting point for CEGA is the requirements document. The requirements can describe real time systems, events, data driven systems, state transition diagrams, object oriented systems, graphical user interface standards, etc. Any type of logic can be modeled using a CEG. CEGA can also serve as an advance over other informal, ad-hoc specification of program function and combinatorial testing of interfaces. CEGA provides consideration of constraints that application of other testing techniques do not provide. CEGA also has the ability to detect defects that cancel each other out, and the ability to detect defects hidden by other things going right. When compared to other validation techniques, the benefits of CEGA stem from the fact that it is semi-formally based on a graphics form of propositional logic which 42 gives the user some degree of confidence. This means that CEGA yields additional benefits, among them: CEGA is helpful for creating unambiguous, concise specifications during requirements phase. CEGs graphically display relationships and constraints between application inputs and outputs. They provide detailed analysis information in a variety of easy-to-read formats. The analyst may get visual clues about missing or incorrect relationships. The project team can analyze every aspect of the requirements in CEGs to identify precedence problems in relations, logical faults, missing functionality and improperly used aliases. CEGs help to uncover ambiguities and incompleteness in the specification during verification and validation (V&V)2. Development of the CEG from the specification allows a thorough inspection of the specification. Any omissions, inaccuracies, or inconsistencies are likely to be detected. In developing cause- effect graphs, project teams evaluate the requirements for completeness, consistency, sufficient level of detail and lack of ambiguity, often finding defects that otherwise would not be found until integration testing. Business analysts and project stakeholders collaboratively can review the natural language test cases generated by CEGA, enabling them to identify and correct any requirement faults earlier in the development cycle. CEGA is easy to use. The only requirement for using and understanding CEG is knowledge of Boolean logical operators. 2 Verification and Validation (V&V) is the process of checking that a product, service, or system meets specifications and that it fulfills its intended purpose [28]. These are critical components of a quality management system such as ISO 9000. 43 CEGA is more methodical and therefore more uniform, repeatable, and reliable. CEGA requires only functional requirements specifications, which is most likely to be available in the early stages of the software development. Therefore, CEGA can be used early in the development process in conjunction with review procedures such as Desk Checking and Walkthroughs [60]. 
CEGA facilitates early involvement of customers to ensure the application meets their needs. The client?s ability to state the right mission goals and needs is essential to attain a requirements specification that is complete, correct and consistent, which in turn is a prerequisite for the right system to be ordered and to enable cost-effective design, verification and validation. Many aspects of the cause-effect graphing can be automated. For instance, conversion of the graph to a decision table is an algorithmic process, which could be automated by a computer program. This trait implies that CEGA as a testing technique would be scalable for large-scale application. CEGA improves team communications and reduces risks, rework, and frustration. Requirements definition with visual specifications promotes positive communication. As a visualized representation of requirements, CEG can ease the build of a common understanding between the domain experts, end users, managers, analysts, developers, and test personnel of project needs and commitments. Each project role gains the same understanding of the expected behavior of the software before it is developed, thereby reducing the risk of rework occurring throughout the software development life cycle. 44 CEGA closes the ?language gap? between business and IT and enables the ?Big Picture? view of the business by facilitating a more structured dialogue between the two teams. Business stakeholders are often domain experts, speaking the ?domain language,? but lack understanding of technical terminology. IT stakeholders are well versed with the technical terminology, but often lack expertise and understanding of the problem domain. A CEG synchronizes the two teams by facilitating an accurate and complete ?knowledge transfer? from analysts and business stakeholders to the technical team. CEGs provide both nontechnical and technical audiences with a clear and concise understanding of expected system behavior. Their input can be captured and used to iteratively improve and refine the requirements. CEGA eases compliance with standards and regulations. Validation of requirements using CEGA and testing software using the test cases developed from CEGA satisfy the definition of V&V (validation and verification) in the Capability Maturity Model? IntegrationSM (CMMISM) [9]. While there are a number of advantages to using CEGA, there are some disadvantages as well. Researchers and practitioners [26][48][49][50][58][59][61] have observed some common difficulties when using CEGA: No specific rules are rigorously defined for identifying causes, effects, logical relationships, and constraints, although the general procedure is known. As a result, CEGA has to be performed by domain specialists, people who have expert knowledge of the problems under study. A domain specialist may be a 45 problem owner, an end user of the required system, a sponsor, an outside specialist, a manager, etc. Complexity of the graph generation task. The main drawback is probably the up-front cost of deriving causes, effects, and constraints from a given informal specification, even if these up-front costs are small compared to the potential major downstream savings because they might avoid unnecessary rework and operational problems. ? Identifying causes and effects would be very tedious. ? Identifying the true logical relationship between the causes and the constraints requires domain knowledge. ? 
The process of actually drawing the graphs is a very time consuming process even with the help of commercial CEG drawing tools. ? Graphical depiction could be overwhelming. In particular, developing a CEG can become very complicated when a system has a large number of causes and effects. To keep the complexity under control, intermediate nodes are added to represent logical combinations of several causes. However, an appropriate choice of intermediate nodes is frequently not obvious. The possible complexity of CEGs makes it apparent that tool support is necessary for these time-consuming tasks. Difficulty of updating CEGs when the specification changes or when the creator realizes that some information has been overlooked. Any changes that occur in the specification must be translated into corresponding changes in the graph. If new cause(s) are added into CEG, much of its internal structure may 46 have to be redesigned. A simpler intermediate representation can ease the difficulty [61]. Difficulty of verifying the correctness of requirements specifications. The starting point for CEGA is the requirements document. CEGA is valid only if the natural language specification satisfies the customer?s intentions. Thus, if the specification is incorrect one will end up with a set of incorrect test cases. Therefore, one must validate the specification itself before applying CEGA to test case design. However, as the complexity and scope of the modeled behavior increases, the graphs become eventually intractable. Perceived inability of CEGs to model situations involving time delays and numerically intensive applications. Though some ambiguities, incompleteness and difficulties exist in CEGA, the concepts of CEG used in specifying the functional behavior of a system make it attractive from a usability perspective. The CEGA technique is an advance over informal, ad-hoc specifications of systems. It is systematic even though subjective in the first stage of construction of the CEG, and therefore relatively uniform, repeatable, and reliable. It is based on a graphical form of propositional logic, which gives the user some degree of confidence in the specification power of the graph. As a result, CEGA has been widely recognized as a testing technique [60][62]. However, CEGA is not extensively used as a software reliability measurement, even though it potentially provides a very systematic and pretty thorough way of 47 checking the end functionality required by the user. In addition to difficulties mentioned previously, some possible reasons for this lack of interest are: Repeatability issue. For a measurement to be useful it must be repeatable. When software measurement definitions are incomplete or unspecific, it is easy to collect invalid or incomparable measurement(s) from different data collectors. Thus, the primary issue is not only whether a definition for a measurement is theoretically correct, but also specific enough, such that everyone understands what is to be measured and what the measured values represent. Until then, the values cannot be collected consistently and other people, different from the collectors, can interpret the results correctly and apply them to reach valid conclusions. Our experience with [12] [40] has shown that no standard definition exists that ensures repeatability of the CEGA measurement. To correct this, this study begins by reviewing the definitions of CEGA to define more precise and rigorous measurement rules. 
The cause-effect measure, CE(%), is too undependable to be used as an indicator of software reliability. CE(%) is the ratio of the number of removed ambiguities, (A_total − A_existing), to the total number of ambiguities identified in the SRS, A_total. The major difficulty in counting ambiguities is that, for any specification, there are always some readers who understand it differently from others. According to our experience [26][40], the values of A_existing and A_total depend subjectively on the person exercising CEGA. Other factors, such as the level of granularity to which an SRS is broken up and the writing style of the SRS, can also have a significant influence on the value of CE(%).
These limitations have inevitably kept CEGA from being widely adopted in the field of software reliability engineering. In fact, the CEGA measurement was ranked as "medium" among 40 software reliability measurements by experts with respect to its ability to predict software reliability [12] (also see Table 2-1). Even worse, CEGA has been removed from IEEE Std. 982.1-2005 [11], the latest edition of IEEE Std. 982.1-1988 [42], "IEEE Standard Dictionary of Measures to Produce Reliable Software". The justification for deleting CEGA included that "CEGA is ambiguous", "difficult to interpret", and has "low usage" [11]. To enhance CEGA as a software reliability measurement, this study addresses these limitations by:
- Formalizing the CEG mathematically. These formal definitions are necessary to ensure that CEGA is meaningful, true, and of known accuracy, because without rigorously specified definitions and measurement rules one runs the risk of collecting unrelated, meaningless data. Furthermore, compared to the graphical form of a CEG, which is more intuitive and easier to understand, the mathematical form of a CEG is far easier to store, represent, and manipulate by computer, can be updated easily in response to the frequent requirements-change requests seen in practice, and thus scales better. This rigorous form of the CEG can also serve as an alternative representation of the graphical CEG.
- Investigating rules to ease the task of CEG construction.
- Providing a systematic yet intuitive procedure for applying the proposed CEGA to identify SRS faults. This further enables a consistent measurement process for CEGA.
- Developing methods for quantifying the impact of identified SRS faults on software reliability. Software faults have failure footprints of different sizes. The impact of a fault on reliability depends on the system structure, the way in which the system is used, and the location of the fault. Using faults identified in the products, instead of an aggregated number of faults estimated from empirical data such as A_existing, is believed to provide a more solid foundation for reliability quantification [36].
3.5 Formal Definition of CEG
Definition 3-1: CEG
Any CEG can be represented by a 4-dimensional tuple in which each dimension is a set. Namely,
CEG ≜ ⟨SC, SE, SB, SCN⟩,
where
SC = {ci | i = 1, 2, …, p}: a set of distinct causes, where p is the number of distinct causes.
SE = {ej | j = 1, 2, …, q}: a set of effects, where q is the number of effects.
SB = {Bj : SC → ej | j = 1, 2, …, q}: a set of Boolean functions that map SC to the effects without applying any constraints. The number of Boolean functions is equal to the number of effects.
SCN = {cnk | k = 1, 2, …, r}: a set of constraints imposed among causes and/or effects, where r is the number of constraints.
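The 4-tuple of Definition 3-1 maps directly onto simple data structures. The Python sketch below is an editorial illustration only; the class and field names (Constraint, CEG, and so on) are assumptions rather than part of the formal development. Effects' Boolean functions are represented as callables over the cause states, and constraint handling (including the NA state introduced in Definition 3-3 below) can be layered on top.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A cause-state assignment: cause identifier -> 0 (disabled) or 1 (enabled).
CauseStates = Dict[str, int]

@dataclass
class Constraint:
    """One element of SCN: kind is EXCLUSIVE, INCLUSIVE, ONE, REQUIRE, or MASK."""
    kind: str
    nodes: List[str]  # identifiers of the constrained causes (or effects, for MASK)

@dataclass
class CEG:
    """The 4-tuple <SC, SE, SB, SCN> of Definition 3-1."""
    causes: List[str]                                            # SC
    effects: List[str]                                           # SE
    boolean_functions: Dict[str, Callable[[CauseStates], int]]   # SB: effect -> Bj
    constraints: List[Constraint] = field(default_factory=list)  # SCN

# Hypothetical two-cause fragment: effect "e1" is triggered when both causes hold.
fragment = CEG(
    causes=["c1", "c2"],
    effects=["e1"],
    boolean_functions={"e1": lambda s: int(s["c1"] and s["c2"])},
    constraints=[Constraint(kind="REQUIRE", nodes=["c2", "c1"])],
)
print(fragment.boolean_functions["e1"]({"c1": 1, "c2": 1}))  # -> 1
```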
Definition 3-2: Cause
A cause in a CEG is a primitive input event, typically invoked by a user or external system(s). A primitive input event is an event that cannot be logically expressed by other events. All causes in a CEG are distinct; redundant causes are not allowed. A cause has exactly two mutually exclusive states: enabled (represented by "1") or disabled (represented by "0"). Namely,
c = { 1 if the cause is enabled;
      0 otherwise,
where c is a cause in a CEG.
Definition 3-3: Effect
An effect in a CEG is a system action or output, either observable or non-observable. In contrast with a cause, an effect must be logically expressed by causes using a Boolean function. Moreover, an effect has exactly three mutually exclusive states: "present"/"triggered" (represented by "1"), "absent"/"non-triggered" (represented by "0"), or "prohibited"/"not allowed" due to constraint(s) (represented by "NA", which is short for "Not Allowed"). Namely,
e = { NA if any constraint in the CEG is applicable;
      1 if e is triggered (determined by its Boolean function);
      0 if e is non-triggered (determined by its Boolean function),
where e is an effect in a CEG.
Definition 3-4: Constraint
A constraint in a CEG is a limitation among causes or effects due to syntactic, environmental, or other considerations. There are five types of constraints used in a CEG. The mathematical symbols and explanations for these constraints are summarized in Table 3-1.
Table 3-1: Mathematical Symbols of CEG Constraints
Constraint Name | Mathematical Symbol | Explanation
EXCLUSIVE | EXCLUSIVE(c1, c2, …, ck) | At most one of the causes c1, c2, …, ck can be enabled. This constraint allows the simultaneous absence of all of these causes.
INCLUSIVE | INCLUSIVE(c1, c2, …, ck) | At least one of the causes c1, c2, …, ck must be enabled. In contrast with the EXCLUSIVE constraint, this constraint does NOT allow the simultaneous absence of all of these causes.
ONE-AND-ONLY-ONE | ONE(c1, c2, …, ck) | One and only one of the causes c1, c2, …, ck can be enabled. This constraint does NOT allow the simultaneous absence of all of these causes.
REQUIRE | REQUIRE(c1, c2) | Cause c1 cannot be enabled until cause c2 has been enabled.
MASK | MASK(e1, e2) | The observation of effect e2 is masked (disguised) by effect e1.
The following lemmas are helpful when determining whether a constraint is applicable to an effect or a given input (see Section 5.3.4). A violation of any of these lemmas caused by a constraint indicates that the constraint is applicable to the effect and/or the given input.
Lemma 3-1: EXCLUSIVE Constraint
For a set of causes confined by an EXCLUSIVE constraint, an enabled cause implies that all other causes are disabled. Namely,
If EXCLUSIVE(c1, c2, …, ck), then ci = 1 ⇒ cj = 0, for all j ≠ i, i ≤ k, j ≤ k.
Lemma 3-2: INCLUSIVE Constraint
For a set of causes confined by an INCLUSIVE constraint, all causes cannot be disabled simultaneously. Namely,
If INCLUSIVE(c1, c2, …, ck), then the state combination c1 = c2 = … = ck = 0 is not allowed.
Lemma 3-3: ONE-AND-ONLY-ONE Constraint
For a set of causes confined by a ONE-AND-ONLY-ONE constraint, both the probability of any two different causes being enabled simultaneously and the probability of all causes being disabled simultaneously are equal to 0. Namely,
If ONE(c1, c2, …, ck), then Pr(ci = 1 ∧ cj = 1) = 0 for all i ≠ j, i ≤ k, j ≤ k, and Pr(c1 = c2 = … = ck = 0) = 0.
Lemma 3-4: REQUIRE Constraint
The implication of the constraint "cause c1 requires cause c2" is two-fold. First, c1 cannot be enabled if c2 has not been enabled yet.
Second, c2 must have been enabled if c1 is enabled. Namely,
If REQUIRE(c1, c2), then c2 = 0 ⇒ c1 = 0, and c1 = 1 ⇒ c2 = 1.
Lemma 3-5: Transitive Law for the REQUIRE Constraint
The transitive law holds for all REQUIRE constraints. Namely,
REQUIRE(c1, c2) ∧ REQUIRE(c2, c3) ⇒ REQUIRE(c1, c3).
Lemma 3-6: MASK Constraint
The implication of the constraint "effect e1 masks effect e2" is two-fold. First, e2 cannot be triggered if e1 has been triggered already. Second, e1 is not triggered if e2 is triggered. Namely,
If MASK(e1, e2), then e1 = 1 ⇒ e2 = 0, and e2 = 1 ⇒ e1 = 0.
3.6 Example of CEG Construction
In this section, we illustrate a sample CEG for a system called LOCAT [63]. LOCAT was designed as a simple real-time projectile tracking system for the Army's all-weather Doppler radar system. For demonstration purposes, only Section 2.1 of LOCAT's SRS was used to construct the sample CEG, as shown in Figure 3-3.
Figure 3-3: Example of Identifying Causes and Effects in an SRS
2 Functional Requirements
2.1 Function Interface
2.1.1 Introduction
Function Interface asks the user for an option. The options include: calculation of projection range, calculation of projection speed, calculation of trajectory, and quitting LOCAT. Then the corresponding function is executed.
2.1.2 Inputs
Input is an alphanumeric character specified by the user through the keyboard.
2.1.3 Processing
The function displays the message:
1. Calculate Projection Range
2. Calculate Projection Speed
3. Calculate Trajectory
4. Quit
Make your choice
The user provides the choice. If the choice is "1" the function Range will be initiated, if the choice is "2" the function Speed will be initiated, if the choice is "3" the function Trajectory will be initiated, and if the choice is "4" the function quits. For all other options the function "Error" will be initiated.
2.1.4 Outputs
Interface messages.
3.6.1 Identified Causes, Effects, Logical Relationships, and Constraints for the Sample SRS
After screening and analyzing the sample SRS, six causes and six effects were identified, as shown in Table 3-2 and Table 3-3, respectively. The first cause in Table 3-2 is inferred from the context of the sample SRS and is not shown in Figure 3-3. The identified constraints and their explanations are summarized in Table 3-4.
Table 3-2: Identified Causes for the Sample SRS
Cause Index | Explanation | Assigned Identifier
1 | The user runs LOCAT. | c1
2 | The user's choice is "1". | c2-1
3 | The user's choice is "2". | c2-2
4 | The user's choice is "3". | c2-3
5 | The user's choice is "4". | c2-4
6 | The user's choice is anything other than "1", "2", "3", or "4". | c2-5
Table 3-3: Identified Effects for the Sample SRS
Effect Index | Explanation | Assigned Identifier
1 | The interface message is displayed on the screen to indicate that Function Interface is initiated. | e1
2 | Function Range is initiated. | e2-1
3 | Function Speed is initiated. | e2-2
4 | Function Trajectory is initiated. | e2-3
5 | Function Interface quits. | e2-4
6 | Function Error is initiated. | e2-5
Table 3-4: Identified Constraints for the Sample SRS
Index | Constraint | Explanation
1 | c2-1 requires c1. | The user cannot choose option "1" until LOCAT is initiated.
2 | c2-2 requires c1. | The user cannot choose option "2" until LOCAT is initiated.
3 | c2-3 requires c1. | The user cannot choose option "3" until LOCAT is initiated.
4 | c2-4 requires c1. | The user cannot choose option "4" until LOCAT is initiated.
5 | c2-5 requires c1. | The user cannot choose any other option until LOCAT is initiated.
6 | c2-1, c2-2, c2-3, c2-4, and c2-5 are mutually exclusive. | The user can only choose one option at a time.
3.6.2 Graphical Expression of the CEG for the Sample SRS
The figure below shows the CEG constructed for the sample SRS.
Figure 3-4: Graphical Expression of the CEG for the Sample SRS
3.6.3 Mathematical Expression of the CEG for the Sample SRS
In contrast with the graphical expression, the following is the mathematical expression of the CEG for the sample SRS:
CEG_LOCAT = ⟨SC_LOCAT, SE_LOCAT, SB_LOCAT, SCN_LOCAT⟩,
where
SC_LOCAT = {c1; c2-1; c2-2; c2-3; c2-4; c2-5}
SE_LOCAT = {e1; e2-1; e2-2; e2-3; e2-4; e2-5}
SB_LOCAT = {e1 = c1; e2-1 = c1 ∧ c2-1; e2-2 = c1 ∧ c2-2; e2-3 = c1 ∧ c2-3; e2-4 = c1 ∧ c2-4; e2-5 = c1 ∧ c2-5}
SCN_LOCAT = {REQUIRE(c2-1, c1); REQUIRE(c2-2, c1); REQUIRE(c2-3, c1); REQUIRE(c2-4, c1); REQUIRE(c2-5, c1); EXCLUSIVE(c2-1, c2-2, c2-3, c2-4, c2-5)}
Figure 3-5: Mathematical Expression of the CEG for the Sample SRS
3.7 Summary
This chapter explored the advantages and disadvantages of CEGA and discussed several attempts to enhance CEGA as a scalable software reliability measurement. In particular, the mathematical expression of CEGs is defined in terms of well-understood mathematical entities, such as sets and Boolean formulas, whose semantics are formally defined and which can be easily stored and processed by computers. Though informal, unscalable, and not required by our approach, the graphical expression of CEGs helps project stakeholders find, illustrate, and analyze the software functional requirements, and it eases communication among the different project roles. It is desirable to develop a tool that allows convenient conversion between these two CEG formats.
Chapter 4: Identification of Faults in Software Requirements Specifications Using CEGA
The starting point for CEGA is the Software Requirements Specification (SRS). The SRS is the first definitive representation of the capability that the provider is to deliver to the user or acquirer. The SRS becomes the basis for all of a project's subsequent management, engineering, and assurance activities. As such, it is a strong source of potential risks that could adversely impact the project's resources, schedules, and products. Because of the criticality of the SRS, it is important to prevent or correct shortcomings in both the form and content of the SRS document before it is established as a project baseline. Since most software faults can be traced to faulty functional requirements, it is obvious that the major opportunity for improving the quality of software systems lies in improving the quality of the SRS. It would also benefit the entire project team if there were one clear, detailed, and testable set of requirements to work from. Existing SRS quality improvement methods force all SRS analyzers to rely on nonsystematic techniques to search for a wide variety of SRS defects. CEGA is broadly recognized for its ability to detect incomplete and ambiguous requirements. However, no specific rules were found in the literature on how to use the power of CEGA for SRS faults detection. The aim of the work described in this chapter is to develop CEGA-based techniques for detecting faults in natural language SRSs.
4.1 Definition, Contents, and Organization of SRSs
The Software Requirements Specification (SRS) is defined as "a specification for a particular software product, program, or set of programs that performs certain functions in a specific environment" [7].
It is an outcome of the requirement analysis process. A well-designed, well-written SRS accomplishes four major goals: 1) It provides feedback to the customer. 2) It decomposes the problem into component parts 3) It serves as an input to the design specification. 4) It serves as a product validation check. Usually, SRS is assumed to be a document, although it can be a database or spreadsheet that contains the requirements, or information stored in a commercial requirements management tool. It typically consists of descriptions for functional and non-functional requirements of the future system: ? Functional Requirement: a functional requirement is a requirement defining functions of the system under development ? Non-functional requirement: a non-functional requirement is a requirement characterizing a system property such as expected performance, robustness, usability, maintainability, etc. Non-functional requirements capture business goals/objectives and product quality attributes. Software requirements are usually expressed in the form of either formal language or natural language. Despite the remarkable advancements in the design of user- acceptable formal languages, the vast majority of SRSs for software projects are still 60 written in plain English (or in other natural languages) due to its flexibility, expressiveness, communicability, and ease of change. There are several standards proposed for organizing the contents of SRS written in natural language: NASA-STD-2100-91 [64], MIL-STD-498 Section 5.3 [65], ISO/IEC 12207 Section 5.3.2 [66], and IEEE Std. 830-1998 [53], etc. Among them, IEEE Std. 830-1998 [53] is most widely adopted in industry. Several sample SRS outlines are presented in this standard. These sample templates are not standard and are provided to help the user in organizing the requirements specification document and to help him in improving the readability of the document. Figure 4-1 shows a prototype SRS outline recommended by IEEE Std. 830-1998 [53]. 61 Figure 4-1: Prototype Outline of SRS (extracted from IEEE Std. 830-1998 [53]) Table of Contents 1 Introduction 1.1 Purpose 1.2 Scope 1.3 Definitions, Acronyms, and Abbreviations 1.4 References 1.5 Overview 2 General Description 2.1 Product Perspective 2.2 Product Functions 2.3 User Characteristics 2.4 General Constraints 2.5 Assumptions and Dependencies 3 Specific Requirements 3.1 Functional Requirements 3.1.1 Functional Requirement 1 3.1.1.1 Introduction 3.1.1.2 Inputs 3.1.1.3 Processing 3.1.1.4 Outputs 3.1.2 Functional Requirement 2 ?. 3.2 External Interface Requirements 3.2.1 User Interfaces 3.2.2 Hardware Interfaces 3.2.3 Software Interfaces 3.2.4 Communication Interfaces 3.3 Performance Requirements 3.4 Design Constraints 3.4.1 Standards Compliance 3.4.2 Hardware Limitations ?. 3.5 Attributes 3.5.1 Security 3.5.2 Maintainability ?. 3.6 Other Requirements 3.6.1 Data Base 3.6.2 Operations 3.6.3 Site Adaptation ?. 62 4.2 Characteristics of a ?Good? SRS There is no standard definition for what is a ?good? SRS. Table 4-1 shows the fundamental characteristics of a ?good? SRS proposed by Hammer [67]. Table 4-1: Ten Language Quality Characteristics of an SRS (Adapted from [67]) Quality Characteristic Explanation Complete SRS defines precisely all the go-live situations that will be encountered and the system's capability to successfully address them. Consistent SRS capability functions and performance levels are compatible, and the required quality features (security, reliability, etc.) 
do not negate those capability functions. For example, the only electric hedge trimmer that is safe is one that is stored in a box and not connected to any electrical cords or outlets.
Accurate: SRS precisely defines the system's capability in a real-world environment, as well as how it interfaces and interacts with that environment. This aspect of requirements is a significant problem area for many SRSs.
Modifiable: The logical, hierarchical structure of the SRS should facilitate any necessary modifications (grouping related issues together and separating them from unrelated issues makes the SRS easier to modify).
Ranked: Individual requirements of an SRS are hierarchically arranged according to stability, security, perceived ease or difficulty of implementation, or another parameter that helps in the design of that and subsequent documents.
Testable: An SRS must be stated in such a manner that unambiguous assessment criteria (pass/fail or some quantitative measure) can be derived from the SRS itself.
Traceable: Each requirement in an SRS must be uniquely identified to a source (use case, government requirement, industry standard, etc.).
Unambiguous: The SRS must contain requirements statements that can be interpreted in one way only. This is another area that creates significant problems for SRS development because of the use of natural language.
Valid: A valid SRS is one that all parties and project participants can understand, analyze, accept, or approve. This is one of the main reasons SRSs are written using natural language.
Verifiable: A verifiable SRS is consistent from one level of abstraction to another.
Most attributes of a specification are subjective, and a conclusive assessment of quality requires a technical review by domain experts. Using indicators of strength and weakness provides some evidence that preferred attributes are or are not present.
4.3 Faults in SRS
In practice, a perfect SRS without any faults is not easy to achieve. In particular, SRSs written in a natural language are frequently wordy and unstructured, making them vulnerable to ambiguity, incompleteness, or self-contradiction. Figure 4-2 depicts a generic relationship between desired and actually documented specifications, which is commonly seen in practice.
Figure 4-2: Desired vs. Actually Documented Requirements Specifications (regions: correctly documented, incorrectly documented, omissions, surprises)
An SRS fault is a fault that originates in the requirements phase (e.g., an omitted requirement or an incomplete requirements description). Typical SRS faults found in practice are:
- Noise: the presence of text that carries no information relevant to any feature of the problem.
- Silence: a feature that is not covered by any text.
- Over-specification: text that describes a feature of the solution rather than the problem.
- Contradiction: text that defines a single feature in a number of incompatible ways.
- Ambiguity: text that can be interpreted in at least two different ways.
- Forward reference: text that refers to a feature yet to be defined.
- Wishful thinking: text that defines a feature that cannot possibly be validated.
- Jigsaw puzzles: e.g., distributing requirements across a document and then cross-referencing.
- Inconsistent terminology: inventing and then changing terminology.
A more thorough taxonomy of SRS faults was defined by Hays [68], as presented in Table 4-2.
Table 4-2: Taxonomy of SRS Faults (Excerpted from [68])
- Incompleteness. Incomplete Functional Decomposition: failure to adequately decompose a more abstract specification. Incomplete Functional Description: failure to fully describe all requirements of a function.
- Omitted/Missing. Omitted Functional Requirement: failure to specify one or more of the next lower levels of abstraction of a higher-level specification. Missing External Constants: a missing value or variable in a requirement. Missing Description of Initial System State: failure to specify the initial system state, when that state is not equal to 0.
- Incorrect. Incorrect External Constants: specification of an incorrect value or variable in a requirement. Incorrect Input or Output Descriptions: failure to fully describe system input or output. Incorrect Description of Initial System State: failure to specify the initial system state, when that state is not equal to 0. Incorrect Assignment of Resources: over- or under-stating the computing resources assigned to a specification.
- Ambiguous. Improper Translation: failure to carry a detailed requirement through the decomposition process, resulting in ambiguity in the specification. Lack of Clarity: text that is difficult to understand or lacks clarity and is therefore ambiguous.
- Infeasible (no sub-faults): a requirement that is infeasible or impossible to achieve given other system factors, e.g., processing speed or available memory.
- Inconsistent. Internal Conflicts: requirements that are pair-wise incompatible. External Conflicts: requirements of cooperating systems, or parent/embedded systems, that taken pair-wise are incompatible.
- Over-specification (no sub-faults): requirements or specification limits that are excessive for the operational need, causing additional system cost.
- Not Traceable (no sub-faults): a requirement that cannot be traced to previous or subsequent phases.
- Unachievable Item (no sub-faults): a functional description that cannot be true in the reasonable lifetime of the product.
- Non-Verifiable (no sub-faults): a requirement description that cannot be verified by any reasonable testing method.
- Misplaced (no sub-faults): information that appears in the wrong section of the requirements document.
- Intentional Deviation (no sub-faults): a requirement that is specified at a higher level but intentionally deviated from at a lower level.
- Redundant or Duplicate (no sub-faults): a requirement that was already specified elsewhere in the specification.
The study of Hays [68] reported empirical categorization percentage data, as illustrated in Figure 4-3. The top three categories accounted for almost 80% of the requirements faults evaluated. These three fault categories and their percentages were: Incompleteness (21%), Omitted/Missing (33%), and Incorrect (24%). Most of these faults were related to functional requirements.
Figure 4-3: Requirements Fault Categorization Percentage Data (Omitted/Missing 33%, Incorrect 24%, Incompleteness 21%, Over-specification 6%, Ambiguous 6%, Inconsistent 5%, Infeasible 2%, Not Traceable 1%, Others 2%)
4.4 V&V Techniques for SRS Faults Detection
The quality of an SRS can be improved, and costs and risks controlled, by performing Verification & Validation (V&V) early in the development process. According to IEEE Std. 1012-2004 [69], software Verification and Validation (V&V) is the process of ensuring that software being developed or changed will satisfy functional and other requirements (validation) and that each step in the process of building the software yields the right products (verification). The main V&V techniques for SRS faults detection are:
- Inspection: An SRS inspection involves a team of people, led by a leader, that formally reviews the SRS. The SRS is presented in front of the inspection team, and the defects detected during the inspection are communicated to the next level so that they can be addressed. The objective of an SRS inspection is to detect and identify defects. An SRS inspection is a rigorous peer examination that identifies nonconformance with respect to specifications and standards, uses metrics to monitor progress, ignores stylistic issues, and does not discuss solutions.
- Walkthroughs: Walkthroughs can be considered similar to inspections without the formal preparation (of any presentation or documentation). During the walkthrough meeting, the presenter/author introduces the material to all the participants in order to make them familiar with it. Even though walkthroughs can help in finding potential defects, they are mainly used for knowledge sharing or communication purposes. An SRS walkthrough should attempt to identify defects and consider possible solutions; in contrast with other forms of review, its secondary objectives are to educate and to resolve stylistic problems.
- Technical reviews: The objective of an SRS technical review is to evaluate the SRS and provide management with evidence that the SRS has been produced according to the project standards and procedures, and that changes have been properly implemented and affect only those system areas identified by the change specification.
- Buddy checks: This is the simplest type of review activity used to find defects in an SRS. In a buddy check, one person goes through the SRS prepared by another person in order to find defects that the author could not find previously.
Software inspection is one of the best practices for detecting and removing defects early in the software development process. In a software inspection, a review is first performed individually by several reviewers, who analyze all or part of the specification and search for defects; a meeting of the reviewers and author(s) is then held to collect the defects. Usually, reviewers use Ad Hoc, Checklist-Based Reading (CBR), or Perspective-Based Reading (PBR) methods to uncover defects. These methods largely leave reviewers relying on nonsystematic techniques to search for a wide variety of SRS defects.
4.5 CEGA-based Techniques for SRS Faults Detection
A problem arising in the existing techniques mentioned previously is how to systematically cover requirements (especially functional requirements), or what actions have to be taken to ensure that a review completely and adequately covers all the requirements called for by users and producers. According to the analysis in Chapter 3, CEGA might be a good answer to this question. CEGA is broadly recognized for its ability to "aid in identifying requirements that are incomplete and ambiguous" [42]. However, no specific rules were found in the literature on how to use the power of CEGA for SRS faults detection. To address this issue, we propose a CEGA-based approach for practitioners to detect faults in an SRS. Our approach consists of a two-step process. The initial step includes CEG construction and an optional ambiguity review, which is performed by someone who is not a domain expert. This step takes place after the SRS reaches its first draft.
In this step, the SRS analyst is not reading the requirements for content, but only to identify 69 ambiguities in the logic and structure of the wording. This review finds all of the generic ambiguities such as unclear references. Since the initial reviewer is not a domain expert they cannot read into the specification facts that are not explicitly there. Once the issues identified in the initial ambiguity review have been addressed, the requirement is then reviewed for content (i.e., correctness and completeness) by domain experts using the CEG validation algorithm and the related rules. 4.5.1 CEGA-based SRS Faults Taxonomy CEG is a model to capture the functional requirements specified in an SRS. In other words, a CEG should ?faithfully? (to the best knowledge of its constructor(s)) represent the functional requirements stated in an SRS, no matter whether they contain faults or not. When the functional requirements are translated into a CEG, the faults contained in these requirements should be ?mapped? into the CEG as well. These faults fall into one or more of the following fault categories in terms of the CEG: 1) Missing Effect; 2) Extra Effect; 3) Missing Constraint; 4) Extra Constraint; 5) Wrong Boolean Function, including: i. Wrong-Boolean-Function Case 1: Missing cause(s) in a Boolean function; ii. Wrong-Boolean-Function Case 2: Extra cause(s) in a Boolean function; iii. Wrong-Boolean-Function Case 3: Wrong Boolean operator(s) in a Boolean function. 70 The description of the faults categories is summarized in Table 4-3. Table 4-3: Categories of SRS Faults in Terms of CEG Fault Category Description Missing-Effect Omission of an effect in CEG. Extra-Effect Introduction of an effect that is not desired in CEG. Missing-Constraint Omission of a constraint in CEG. Extra-Constraint Introduction of a constraint that is not desired in CEG. Wrong-Boolean-Function Case 1: Missing-cause Omission of at least a cause in the expression of a Boolean function. Wrong-Boolean-Function Case 2: Extra-cause Introduction of at least a cause into the expression of a Boolean function. Wrong-Boolean-Function Case 3: Wrong-Boolean- operator Incorrect use of at least a Boolean logic operator in the expression of a Boolean function. 4.5.2 Detecting SRS Faults by CEG Construction and Optional Ambiguities Review CEGA consists of a manual step of transforming an SRS into a CEG, a more concise and structured representation. The transformation process itself is a form of inspection. For example, the CEG tends to force awareness of the ?Else? conditions that weren?t explicitly articulated in the structured English. We will revisit the topic on detecting SRS faults during CEG construction in Chapter 7. According to our experience, the following SRS faults are usually found when constructing CEG: ? Ambiguities: functional requirements which are difficult to understand or lack clarity, such as ambiguous statements caused by implicit connectors or 71 precedence of relation, ambiguous boundary, ambiguous scope of negation, and ambiguous reference. ? Redundancies: requirements that were already specified elsewhere in the specification, such as unnecessary aliases. ? Inconsistencies: pair-wise incompatible functional requirements. ? Incompleteness: failure to fully describe all requirements of a function. Optionally, a technique called Ambiguity Review can be applied to eliminate potential ambiguities in an SRS prior to the review of requirements for content by the domain experts. 
An Ambiguity Review is a test of an SRS to insure that requirements are written in a clear, concise and unambiguous manner. The intent of the Ambiguity Review is to provide the domain experts with a better quality set of requirements to work from, so they can better identify missing requirements, and improve the content (completeness and accuracy) of all requirements. After the ambiguities are identified, it is the responsibility of the requirements author to correct the ambiguities, and then have the domain experts review the requirements for content. BenderRBT? Inc. [70] developed an Ambiguity Review technique. The key of this technique is to define a review checklist of 15 ambiguity problems commonly found in an SRS. Many ambiguities referred to in the Ambiguity Review Checklist items can be identified by looking for key words and phrases in the requirements. The list of words pointing to potential ambiguities is given in Appendix A (adapted from [70]). Ambiguity Review improves the quality of requirements so that the domain experts have a better quality document to work from, and help them make whatever 72 changes are needed to the requirements content, so that requirements are not missed. It should be noted that CEG construction and ambiguities review can usually detect simple linguistic faults. Other methods/techniques such as CEG validation are needed for detecting implicit faults in an SRS. 4.5.3 Detecting More Implicit SRS Faults by CEG Validation Validating a CEG consists of checking for the existence of the types of CEG faults mentioned previously in Section 4.5.1. Domain knowledge is needed to perform CEG validation. The suggested procedure for CEG validation is shown in Figure 4-4. The detailed rules for identifying each type of faults are listed in the following, which were also summarized in our previous research [40] (to be printed). 4.5.3.1. Rules for Identifying Missing Effect(s) in CEG: The knowledge required to identify missing effects is hard to define since some missing effects are obvious while others are obscure. Generally, the mastery of the operation mechanism of the system is required to find an obscure missing effect. There is no way to give a concrete process or rule for identifying missing effects. 4.5.3.2. Rules for Identifying Extra Effect(s) in CEG To identify extra effects, the inspector should be capable of understanding the physical meaning of the effect and determining whether the effect is necessary or not. An unnecessary effect is an extra effect. 73 Figure 4-4: CEG Validation Algorithm 74 4.5.3.3. Rules for Identifying Missing Constraint(s) in CEG To identify missing constraints, the inspector should be capable of understanding the physical meaning of all causes and effects and determining whether any constraint is required to confine these causes/effects or not. The process for identifying missing constraints is: 1) Arrange all causes in a time sequence. 2) If two cause events occur in a sequential manner, the ?REQUIRE? constraint should have been applied to them. If not, it is a missing constraint. 3) For those causes that occur simultaneously, examine whether ?EXCLUSIVE?, ?INCLUSIVE?, or ?ONE-AND-ONLY-ONE? constraints might have been missed. 4) Arrange all effects in a time sequence. 5) For those effects that can occur simultaneously, examine whether there is any risk for their co-existence. If so, the ?MASK? constraint should have been applied to them. If not, it is a missing constraint. 4.5.3.4. 
Rules for Identifying Extra Constraint(s) in CEG
To identify extra constraints, the inspector should be capable of understanding the physical meaning of all causes or effects in a constraint and determining whether the constraint is necessary or not. The process for identifying extra constraints is:
1) Arrange all causes in a time sequence.
2) If two cause events do not occur in a sequential manner, the "REQUIRE" constraint should not be applied to them. If it is applied, it is an extra constraint.
3) If two or more events do not occur simultaneously, the "EXCLUSIVE", "INCLUSIVE", or "ONE-AND-ONLY-ONE" constraints should not be applied to them. If one is applied, it is an extra constraint.
4) Examine the "MASK" constraints one by one and determine whether each is necessary. If not, it is an extra constraint.
4.5.3.5. Rules for Identifying Wrong Boolean Function(s) in CEG
To identify a wrong Boolean function, the inspector should be capable of understanding the physical meaning of all causes and effects. In addition, the inspector should have mastered the operation mechanism of the system in order to determine what logical relationships should be applied to the causes. The process for identifying wrong Boolean functions is:
1) Consider one Boolean function at a time.
2) Check the causes in the Boolean function one by one and determine whether each cause is necessary. An unnecessary cause is an extra cause in the Boolean function.
3) Consider the remaining causes in the CEG. If any of them should have been involved in the Boolean function, it is a missing cause.
4) Consider other possible causes not included in the CEG. If any of them should have been involved in the Boolean function, it is a missing cause.
5) Check all Boolean operators in the Boolean function to identify incorrect ones.
4.6 Summary
The SRS is a model of what the user wants. A consistent, complete, precise, and understandable SRS is the basic premise for the product lifecycle activities, such as analysis, design, coding, testing, use, and maintenance. A software program might be unreliable if it is an implementation of an imperfect SRS. In particular, ambiguous requirements will not yield a satisfactory final product and will likely lead to cost overruns, extended schedules, and missed deliverable deadlines.
In recent years, many semiformal and formal languages such as UML [71], Z-Notation [72], and B-Method [73] have been developed in an attempt to reduce ambiguity, inconsistency, and incorrectness in requirements descriptions. A drawback to these languages, however, is that they are difficult for non-experts to understand, which limits their practical application. Natural language, despite its inherent ambiguity, continues to be the most common way to express software requirements, because natural language SRSs can be shared easily among the various people involved in the software development process and used in several product development phases.
Empirical studies, such as [74], indicate that overall SRS inspection performance can be improved when individual reviewers use systematic procedures to address a small set of specific issues. This contrasts with the usual practice, in which reviewers have neither a systematic procedure nor clearly defined responsibilities. The disciplined methods proposed in this chapter can be used for the systematic analysis of natural language SRSs and the detection of SRS faults.
77 4.6.1 Advantages of our Methods Existing techniques used for natural language SRS fault detection fail to ensure a complete and adequate coverage of all functional requirements specified in an SRS. Our approach distinguishes itself from other SRS fault detection methods by its CEGA-based attribute, which is rigid, systematic, and with 100 percent coverage of functional requirements. Using our CEGA-based methods, faults residing in an SRS can be detected not only by CEG construction and an optional ambiguity review, but also by systematically validating the constructed CEG. Compared with commonly used SRS reading techniques, such as ad hoc, and checklist-based reading techniques, our approach provides a more systematic and clearer path for inspectors to follow. This is because CEG is uniform, repeatable, and reliable (when CEG is expressed in mathematical form), and gives a better way for people to communicate (when CEG is expressed in graphical form). Realistically, one cannot expect to identify types of SRS faults that he or she never ever has thought about or come across. The contribution of our approach (in particular, the CEGA-based SRS faults taxonomy) lies in providing a systematic way to explore this implicitly existing knowledge by using heuristics and in increasing the requirements engineer?s awareness of the problematic areas in an SRS. 4.6.2 Limitations of our Methods Similar to other reading methods, the effectiveness of our approach highly depends on the inspector?s knowledge of the system. The more he/she knows the system, the higher the probability that he/she finds fault(s) in an SRS. Any relevant 78 resources, such as the user specification document, an end-user, an analyst and so on, help the inspector improve his/her understanding of the system and identify fault(s) in a CEG. Training is also helpful. Besides, CEG construction, ambiguity review, and CEG validation are carried out by human reviewers who read SRSs, look for faults, and document the results. The clerical activities are boring, time consuming, and often ineffective. It is desirable to develop an automated tool which ? allows requirements engineers to perform an initial parsing of requirements by automatically detecting potential linguistic defects that can cause ambiguity problems at later stages of software product development. ? extracts structured information and metrics for detecting linguistic inaccuracies and defects ? provides support for the consistency and completeness analysis of the requirements. 79 Chapter 5: Quantification of the Impact of Faults on Software Reliability Knowing that software is sufficiently reliable is necessary before we can make intelligent decisions about its use. This is clear for safety-critical and mission-critical systems, where we need to be sure that software failures will not incur unacceptable loss of human life. It is less clear, but also important, in more mundane applications where it must be decided whether the trade-off between new functionality and possible loss of reliability is cost-effective. Quantification of software reliability can help organizations make informative decisions about corrective actions, about their ability to stay on target, and reach goals. This chapter describes techniques proposed for quantifying software reliability on the basis of CEGA. 
Because the value of the cause-effect measure CE(%) is subjectively determined, and because using the faults identified in the product (rather than an aggregated fault count) was believed to provide a more solid foundation for reliability quantification [40], our reliability quantification method is based on the faults identified in the SRS during the CEGA measurement rather than on the value of CE(%) obtained from that measurement.
5.1 Basic Notations and Definitions
The following notations are used throughout the remainder of this dissertation:
A-CEG = Actually-implemented Cause-Effect Graph, constructed from the SRS.
B-CEG = Benchmark Cause-Effect Graph, constructed by removing all identified faults from the A-CEG.
C_A = the cause set of A-CEG; E_A = the effect set of A-CEG; F_A = the Boolean function set of A-CEG; CON_A = the constraint set of A-CEG.
C_B = the cause set of B-CEG; E_B = the effect set of B-CEG; F_B = the Boolean function set of B-CEG; CON_B = the constraint set of B-CEG.
e_j^A = the jth effect in A-CEG, e_j^A ∈ E_A, j = 1, 2, ..., m.
m = the number of distinct effects in E_A (which is also the number of distinct effects in E_B).
e_j^B = the peer effect in B-CEG corresponding to e_j^A.
n = the number of distinct causes in C_A ∪ C_B.
f_j^A = the Boolean function in F_A corresponding to e_j^A.
f_j^B = the Boolean function in F_B corresponding to e_j^B. (In mathematics, a Boolean function is a function of the form f: B^k -> B, where B = {0, 1} is a Boolean domain and k is a nonnegative integer called the arity of the function.)
X = a state combination of all distinct causes in C_A ∪ C_B, X = (c_1, c_2, ..., c_n), where c_i = 1 if the ith cause is enabled and 0 otherwise, i = 1, 2, ..., n.
X^k = the kth state combination of all distinct causes in C_A ∪ C_B, k = 1, 2, ..., 2^n, X^k = (c_1^k, ..., c_n^k), where c_i^k = 1 if the ith cause is enabled and 0 otherwise.
Definition 5-1: Input. An input in this study refers to a combination of the states of all causes in either the A-CEG or the B-CEG. Because any cause can take only the two values 0 and 1, there are 2^n inputs for a given pair of A-CEG and B-CEG, where n is the total number of distinct causes in the input space C_A ∪ C_B. Any input falls into one of two mutually exclusive categories: failure-relevant and failure-irrelevant.
Definition 5-2: Failure-relevant input. A failure-relevant input is an input for which there exists at least one effect in the A-CEG whose outcome differs from that of its counterpart in the B-CEG.
Definition 5-3: Failure-irrelevant input. In contrast, a failure-irrelevant input is an input for which the outputs of all effects in the A-CEG are identical to those of their counterparts in the B-CEG.
Definition 5-4: Failure of A-CEG. In software testing, a software system is said to "fail" for a given input if one of the system's actual outputs is in disagreement with expectation. Similarly, a system represented by an A-CEG fails if one or more effects in the A-CEG behave differently from expectation for a given input.
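To make this notation concrete, the four sets can be held in a small data structure. The following is a minimal sketch, with plain Python dictionaries and Boolean functions stored as callables; the representation is an assumption of this illustration, not the tooling used in this study.

    # Minimal sketch of a CEG as four sets: causes C, effects E,
    # Boolean functions F, and constraints CON.
    a_ceg = {
        "C": ["c1", "c2", "c3"],                         # cause set
        "E": ["e1", "e2"],                               # effect set
        "F": {                                           # Boolean function set
            "e1": lambda s: s["c1"],
            "e2": lambda s: s["c1"] | (s["c2"] & s["c3"]),
        },
        "CON": [("REQUIRE", "c1", "c2")],                # constraint set
    }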
5.2 Fundamental Lemma and Overall Algorithm for Quantifying Software Reliability
Lemma 5-1 (Fundamental Lemma): Given an A-CEG, the failure probability of the software system is equal to the occurrence probability of the failure-relevant inputs. Namely,
Probability(system fails) = Probability(failure-relevant inputs).   (Eq. 5-1)
Proof: From the definitions of failure-relevant and failure-irrelevant inputs, we may infer that
{failure-relevant inputs} ∪ {failure-irrelevant inputs} = (universal set), and
{failure-relevant inputs} ∩ {failure-irrelevant inputs} = (empty set).
According to the Law of Total Probability [75, p. 159],
Probability(system fails) = Probability(system fails | failure-relevant inputs) x Probability(failure-relevant inputs) + Probability(system fails | failure-irrelevant inputs) x Probability(failure-irrelevant inputs).   (Eq. 5-2)
Moreover, the definitions of failure-relevant and failure-irrelevant inputs also imply that
Probability(system fails | failure-relevant inputs) = 1, and
Probability(system fails | failure-irrelevant inputs) = 0.
Therefore, Eq. 5-2 reduces to
Probability(system fails) = 1 x Probability(failure-relevant inputs) + 0 x Probability(failure-irrelevant inputs) = Probability(failure-relevant inputs),
which completes the proof.
Lemma 5-1 indicates that quantifying a system's failure probability is equivalent to performing the following two sub-tasks:
1) Determining the failure-relevant inputs.
2) Calculating the occurrence probability of all failure-relevant inputs.
Determining the failure-relevant inputs can be achieved by examining all possible inputs, one at a time, and comparing the actual outputs of the effects to the expected outputs for each input. However, generating expected outputs for all inputs is non-trivial. Since there are as many as 2^n inputs (n being the number of causes), using a human expert to create all expected outputs is not only difficult, time-consuming, and non-scalable, but also very error-prone. To aid this task, two concepts, the B-CEG and the virtual effect, are introduced in this study; they are the two pillars of the proposed reliability quantification algorithm. Details about these two concepts are given in Section 5.3.1 and Section 5.3.3, respectively.
The task of calculating the occurrence probability of all failure-relevant inputs is accomplished by employing BDD techniques to represent the Boolean logic of all relevant inputs and applying a recursive algorithm to calculate the probability of the BDD's top node, as further discussed in Section 5.4. The overall algorithm for predicting software reliability is shown in Figure 5-1.
Figure 5-1: CEGA-based Software Reliability Prediction Algorithm
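Before turning to the determination of failure-relevant inputs, a toy calculation may help illustrate Lemma 5-1. Assuming independent causes, a hypothetical operational profile, and a hypothetical set of failure-relevant inputs, the failure probability is simply the summed occurrence probability of those inputs:

    from itertools import product

    # Hypothetical occurrence probabilities of the causes (operational profile)
    p_cause = {"c1": 0.3, "c2": 0.6, "c3": 0.5}
    # Hypothetical result of the failure-relevant input determination
    failure_relevant = {(1, 1, 0), (1, 1, 1)}

    def input_probability(state):
        # Occurrence probability of one cause-state combination,
        # assuming the causes are statistically independent.
        prob = 1.0
        for cause, value in zip(("c1", "c2", "c3"), state):
            prob *= p_cause[cause] if value else 1.0 - p_cause[cause]
        return prob

    p_fail = sum(input_probability(s)
                 for s in product((0, 1), repeat=3)
                 if s in failure_relevant)
    print(p_fail)   # 0.09 + 0.09 = 0.18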
5.3 Determination of Failure-relevant Inputs
5.3.1 Introduction of B-CEG
The idea of using the B-CEG for failure-relevant input identification was inspired by software testing automation. In software testing, the mechanism used to generate expected results is called an oracle. An oracle is any program, process, or data that provides the test designer with the expected result of a test [62]; it provides the ability to automatically determine whether tests have passed or failed. Typical oracles are [76]:
- Manual verification of results (human "eye-ball" oracle)
- A separate program implementing the same algorithm
- A simulator of the software system that produces parallel results
- A debugged hardware simulator that emulates hardware and software operations
- An earlier version of the software
- A check of specific values for known responses
The use of a test oracle in software testing is depicted in Figure 5-2. The test oracle is usually costly and difficult to create.
Figure 5-2: Software Testing Using a Test Oracle
Similarly, we introduce an artifact, called the Benchmark Cause-Effect Graph (B-CEG), to facilitate the process of distinguishing failure-relevant from failure-irrelevant inputs. A B-CEG is a "faultless" CEG to the best knowledge of the SRS analyst(s). It is "closer" than the A-CEG to the O-CEG (Oracle Cause-Effect Graph), a "perfect" CEG representing the desired system (refer to Figure 4-2), which is very hard to obtain in practice, as pointed out in many studies. Analogous to a test oracle, the B-CEG enables the automated distinction of failure-relevant inputs from failure-irrelevant ones. The use of the B-CEG for identifying failure-relevant inputs is depicted in Figure 5-3.
Figure 5-3: Identifying Failure-relevant Inputs Using B-CEG
A B-CEG can be constructed either from scratch (the "addition approach") or by making a copy of the A-CEG and then rectifying all identified faults (the "subtraction approach"). In most cases, the subtraction approach is far more efficient than the addition approach, since there are usually only a few faults in the A-CEG. To ease the task of B-CEG construction, we developed a set of rules (described in Section 5.3.2) for the subtraction approach. By following these rules, one can easily construct a B-CEG provided that an A-CEG and the faults in the A-CEG are known.
5.3.2 Rules for B-CEG Construction and A-CEG Revision
Similar to an A-CEG, a B-CEG is defined by four sets: a cause set C_B, an effect set E_B, a Boolean function set F_B, and a constraint set CON_B. A B-CEG is determined if and only if all four of these sets are determined. To ease the task of B-CEG construction, we developed rules for determining these four sets, as described below.
5.3.2.1. Determination of C_B
The process for determining C_B is:
1) Put all causes in C_A into C_B.
2) If a cause does not appear in F_B, remove this cause from C_B.
3) If a cause does not appear in F_A but appears in F_B, add this cause into C_B.
5.3.2.2. Determination of E_B
The process for determining E_B is:
1) Put all effects in E_A into E_B.
2) If there are any detected "introduction-of-an-undesired-effect" faults in the A-CEG, then for each of these extra effects, update the corresponding Boolean function in F_B such that it always yields 0 for any given input. All of these extra effects are virtual effects in the B-CEG and are intentionally left in the B-CEG. As such, each extra effect in the A-CEG has its counterpart in the B-CEG.
3) If there are any detected "missing-an-effect" faults in the A-CEG, then for each of these missing effects:
i. add a new effect identifier into E_B;
ii. add an appropriate Boolean function into F_B;
iii. add an effect identifier (identical to its counterpart in the B-CEG) into E_A. This is a virtual effect in the A-CEG. As such, the missing effect in the A-CEG has its counterpart in the B-CEG.
5.3.2.3. Determination of F_B
The process for determining F_B is:
1) Put into F_B only those Boolean functions of F_A that do not correspond to any virtual effects in E_A.
2) If there are any detected "wrong-Boolean-function" faults in the A-CEG, correct these Boolean functions for F_B.
3) If there are any virtual effects in E_B, add a new Boolean function for each of these virtual effects into F_B. These Boolean functions should always yield 0 for any input.
5.3.2.4. Determination of CON_B
The process for determining CON_B is:
1) Put all constraints of CON_A into CON_B.
2) If there are any detected "introduction-of-an-undesired-constraint" faults in the A-CEG, remove these extra constraints from CON_B.
3) If there are any detected "missing-a-constraint" faults in the A-CEG, add the appropriate constraints into CON_B.
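The rules above amount to copying the A-CEG and rectifying each recorded fault. A rough sketch of this subtraction approach, building on the dictionary representation introduced earlier, is given below; the fault-record field names (type, effect, function, constraint, causes_added, causes_removed) are illustrative assumptions, not part of the formalization.

    import copy

    def build_b_ceg(a_ceg, faults):
        b_ceg = copy.deepcopy(a_ceg)
        for fault in faults:
            kind = fault["type"]
            if kind == "missing_effect":
                # add the omitted effect and its desired Boolean function to B-CEG
                b_ceg["E"].append(fault["effect"])
                b_ceg["F"][fault["effect"]] = fault["function"]
            elif kind == "extra_effect":
                # keep the effect, but as a virtual effect whose function always yields 0
                b_ceg["F"][fault["effect"]] = lambda state: 0
            elif kind == "wrong_boolean_function":
                b_ceg["F"][fault["effect"]] = fault["function"]
            elif kind == "missing_constraint":
                b_ceg["CON"].append(fault["constraint"])
            elif kind == "extra_constraint":
                b_ceg["CON"].remove(fault["constraint"])
            # cause-set adjustments of rule 5.3.2.1
            for c in fault.get("causes_added", []):
                if c not in b_ceg["C"]:
                    b_ceg["C"].append(c)
            for c in fault.get("causes_removed", []):
                if c in b_ceg["C"]:
                    b_ceg["C"].remove(c)
        return b_ceg

    # Example: rectify an omitted effect e3 whose desired Boolean function is c3
    b_ceg = build_b_ceg(a_ceg, [{"type": "missing_effect", "effect": "e3",
                                 "function": lambda s: s["c3"]}])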
In addition to the rules above, actions should also be taken for the A-CEG, in case an effect is missing, to enable the automated determination of failure-relevant inputs. These actions are summarized in Table 5-1.
Table 5-1: Faults vs. Actions that Should Be Taken for A-CEG or B-CEG
- Omission of an effect. B-CEG: add a new effect identifier into E_B and an appropriate Boolean function into F_B. A-CEG: add an effect identifier (identical to its counterpart in the B-CEG) into E_A; this is a virtual effect in the A-CEG; add a Boolean function into F_A that always yields 0 for any given input.
- Introduction of an undesired effect. B-CEG: update the corresponding Boolean function in F_B such that it always yields 0 for any given input; this extra effect is a virtual effect in the B-CEG and is purposely left there. A-CEG: none.
- Omission of a constraint. B-CEG: add the appropriate constraint into CON_B. A-CEG: none.
- Introduction of an undesired constraint. B-CEG: remove this extra constraint from CON_B. A-CEG: none.
- A wrong Boolean function missing a cause. B-CEG: correct this Boolean function's expression in F_B; if the cause is not in C_B, add a new cause identifier into C_B. A-CEG: none.
- A wrong Boolean function containing an extra cause. B-CEG: correct this Boolean function's expression in F_B; if no Boolean function in F_B contains this cause, remove the extra cause from C_B. A-CEG: none.
- A wrong Boolean function containing an incorrect logic operator. B-CEG: correct this Boolean function's expression in F_B. A-CEG: none.
5.3.3 Introduction of the Virtual Effect for Mating Missing or Extra Effects
The category of an input is determined by mating all effects in the A-CEG and B-CEG pair-wise and comparing their outputs pair-wise for the given input. However, there are two special cases in which an effect (of either the A-CEG or the B-CEG) cannot be mated:
Case 1: The A-CEG is missing an effect. To rectify this fault, the missing effect should be added into the B-CEG, because the B-CEG is the "faultless" version of the A-CEG. However, the newly added effect in the B-CEG then has no counterpart in the A-CEG.
Case 2: The A-CEG has an extra effect. To rectify this fault, the extra effect should be removed from the B-CEG. However, this extra effect has to be kept in the A-CEG because the A-CEG should truly represent the faulty SRS. Thus the extra effect in the A-CEG has no counterpart in the B-CEG after the rectification action has been taken.
To handle these two special cases, the concept of the "virtual effect" is introduced. A virtual effect is an artifact added into the A-CEG or the B-CEG so that each effect in the A-CEG and the B-CEG has a counterpart. A virtual effect in the A-CEG corresponds to a missing-effect fault; a virtual effect in the B-CEG corresponds to an extra-effect fault. The use of virtual effects plays a key role in unifying the process of determining failure-relevant inputs (see Section 5.3.4 for details).
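In the dictionary representation sketched earlier, the bookkeeping of Table 5-1 for an unmated effect reduces to a small hypothetical helper: whichever graph lacks the counterpart receives the identifier together with a Boolean function that always yields 0.

    def add_virtual_effect(ceg, effect):
        # A virtual effect can never be triggered, so its function is constantly 0.
        if effect not in ceg["E"]:
            ceg["E"].append(effect)
        ceg["F"][effect] = lambda state: 0

For an omission-of-an-effect fault the call would be made against the A-CEG; for an undesired extra effect it would be made against the B-CEG, where rule 5.3.2.2 already leaves the effect in place.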
An example of adding virtual effects into the A-CEG and B-CEG is illustrated in Figure 5-4. In this example, there are two assumed faults in the A-CEG: a missing effect (e2) and an extra effect (e3). According to Table 5-1, a virtual effect (e2) is added into E_A, and the corresponding Boolean function (e2 := 0) is added into F_A. Similarly, a virtual effect (e3) is added into E_B, and the corresponding Boolean function (e3 := 0) is added into F_B.
Figure 5-4: Example of Adding Virtual Effects into A-CEG and B-CEG. (a) Before adding virtual effects: C_A = C_B = {c1, c2, c3}, E_A = {e1, e3, e4}, E_B = {e1, e2, e4}, and CON_A = CON_B = (empty set). (b) After adding virtual effects: the virtual effect e2 (with Boolean function e2 := 0) appears in the A-CEG and the virtual effect e3 (with Boolean function e3 := 0) appears in the B-CEG, so that every effect has a counterpart.
5.3.4 Determination of an Effect's Output
When determining the response (output) of a regular effect (one that is neither missing nor extra), constraints have higher precedence than the corresponding Boolean function. If any constraint is applicable to the effect (in the case of a "MASK" constraint) or to the given input (in the case of a "REQUIRE", "INCLUSIVE", "EXCLUSIVE", or "ONE-AND-ONLY-ONE" constraint), the effect should yield "NA" (short for "Not Allowed"); otherwise, the effect's response is determined by its Boolean function, taking a value of either 0 or 1. The constraint lemmas (Lemma 3-1 to Lemma 3-6) are very helpful when determining whether a constraint is applicable to an effect or a given input: a violation of any of these lemmas caused by a constraint indicates the applicability of that constraint to the effect and/or the given input.
A virtual effect, by contrast, should never be triggered under any circumstance, because it is not a physical entity. Therefore, the Boolean function corresponding to a virtual effect always yields 0 for any given input, unless a constraint is applicable to the given input, in which case NA is assigned as the output of the virtual effect. In contrast with a non-virtual effect, which can take a value of 0, 1, or NA, a virtual effect can only take a value of either 0 or NA. Thus, we have the following two lemmas regarding the output of a virtual effect:
Lemma 5-2 (Missing Effect's Output): If e_j^A is a virtual effect in E_A, then f_j^A(X) = 0, and e_j^A(X) = NA if any constraint in CON_A is applicable, and 0 otherwise, where X is any given input.
Lemma 5-3 (Extra Effect's Output): If e_j^B is a virtual effect in E_B, then f_j^B(X) = 0, and e_j^B(X) = NA if any constraint in CON_B is applicable, and 0 otherwise, where X is any given input.
With the help of Lemma 5-2 and Lemma 5-3, the process of determining the output of an effect (of either the A-CEG or the B-CEG) for a given input can be unified, as depicted in Figure 5-5.
Figure 5-5: Unified Process for Determining the Output of an Effect
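The unified process of Figure 5-5 can be sketched as follows. The constraint check shown here is deliberately simplified and covers only REQUIRE constraints; the remaining constraint types (including MASK, which also needs the masker's output) would need analogous clauses and are left out of this sketch.

    NA = "NA"

    def constraint_applicable(con, effect, state):
        # Simplified check: REQUIRE(a, b) forbids inputs where a is enabled but b is not.
        if con[0] == "REQUIRE":
            return state[con[1]] == 1 and state[con[2]] == 0
        return False   # EXCLUSIVE, INCLUSIVE, ONE-AND-ONLY-ONE, MASK not modeled here

    def effect_output(ceg, effect, state, virtual_effects=frozenset()):
        if any(constraint_applicable(con, effect, state) for con in ceg["CON"]):
            return NA                      # a constraint forbids this effect/input combination
        if effect in virtual_effects:
            return 0                       # Lemmas 5-2 and 5-3: a virtual effect never fires
        return ceg["F"][effect](state)     # otherwise evaluate the effect's Boolean function

    # Usage with the A-CEG dictionary sketched in Section 5.1's illustration:
    print(effect_output(a_ceg, "e2", {"c1": 0, "c2": 1, "c3": 1}))   # -> 1
    print(effect_output(a_ceg, "e2", {"c1": 1, "c2": 0, "c3": 1}))   # -> "NA"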
5.3.5 Algorithm for Determining the Category of an Input
An input falls into the category of either failure-relevant or failure-irrelevant. To determine the category of an input, the outputs of all effects in the A-CEG and B-CEG are compared pair-wise. If one or more effect pairs yield different outputs, the input is failure-relevant; otherwise, it is failure-irrelevant. The detailed algorithm for determining the category of an input is shown in Figure 5-6.
Figure 5-6: Algorithm for Determining the Category of a Given Input X^k
Usually it does not matter in which sequence the effect pairs are chosen when examining the category of a given input; one convenient way is to select effect pairs in ascending or descending order of the effect identifiers' subscripts. However, if any "MASK" constraint is present in CON_A or CON_B, the output of the "masker" effect must be determined before the output of the "maskee" effect can be determined; otherwise, there is no way to correctly judge whether a "MASK" constraint is applicable to a "maskee" effect. For instance, in the case of MASK(e1, e2) (e1 masking e2), e1 is a masker effect and e2 is a maskee effect: the output of e1 should be determined before that of e2, so the selection precedence of the e1 pair is higher than that of the e2 pair.
It should be noted that the algorithm shown in Figure 5-6 is ready for automation, since many techniques and automation tools [77][78] are available for the evaluation of Boolean logic formulas. These techniques and tools have been shown to scale excellently when applied to VLSI (Very Large Scale Integration) design and test, where there are usually millions of variables (a variable in VLSI design and test is equivalent to a cause in this study). It is unlikely that an SRS will contain millions of causes, even for a very large-scale system such as Windows Vista. Therefore, we believe there should be no scalability issue in determining the failure-relevant inputs if tools are developed based on the algorithm shown in Figure 5-6 that take advantage of the existing tools for Boolean logic formula reduction.
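Building on the output-determination sketch above, the classification step of Figure 5-6 and the enumeration of all 2^n inputs might look as follows; this is again only a sketch under the simplifying assumptions stated earlier, not the tool developed in this study.

    from itertools import product

    def is_failure_relevant(a_ceg, b_ceg, state, a_virtual=frozenset(), b_virtual=frozenset()):
        # After virtual effects have been added, E_A and E_B contain the same
        # identifiers, so the effects can be compared pair-wise.  (MASK constraints
        # would additionally force masker effects to be evaluated before maskees.)
        for effect in a_ceg["E"]:
            if effect_output(a_ceg, effect, state, a_virtual) != \
               effect_output(b_ceg, effect, state, b_virtual):
                return True
        return False

    def failure_relevant_inputs(a_ceg, b_ceg, a_virtual=frozenset(), b_virtual=frozenset()):
        causes = sorted(set(a_ceg["C"]) | set(b_ceg["C"]))
        relevant = []
        for bits in product((0, 1), repeat=len(causes)):   # all 2**n cause-state combinations
            state = dict(zip(causes, bits))
            if is_failure_relevant(a_ceg, b_ceg, state, a_virtual, b_virtual):
                relevant.append(state)
        return relevant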
5.3.6 Examples of Identifying Failure-relevant Inputs
In this section, we use a sample A-CEG (shown in Figure 5-7) to illustrate how to identify failure-relevant inputs for the seven basic types of A-CEG faults (see Section 4.5.1 for the definitions of the A-CEG fault categories). For the sake of simplicity, and without loss of generality, assume that there is only one type of fault in each case.
Figure 5-7: Mathematical Expression of the Sample A-CEG
    CEG_A = (C_A, E_A, F_A, CON_A)
    C_A   = {c1, c2, c3}
    E_A   = {e1, e2}
    F_A   = {e1 := c1; e2 := c1 ∨ (c2 ∧ c3)}
    CON_A = {REQUIRE(c1, c2)}
5.3.6.1. Case 1: A-CEG is missing an effect
Assume that e3 is the missing effect and that its Boolean function is e3 := c3. According to Table 5-1, e3 and its Boolean function e3 := c3 are added into the B-CEG. In addition, a virtual effect with identifier e3 is added into the A-CEG, and e3 := 0 is assigned as its Boolean function. The revised A-CEG and B-CEG for this case are illustrated in Figure 5-8.
Figure 5-8: Revised A-CEG and B-CEG for Case 1 (identical to Figure 5-7, except that the B-CEG additionally contains e3 with e3 := c3, and the A-CEG contains the virtual effect e3 with e3 := 0)
By applying the unified effect-output determination process shown in Figure 5-5, we determined all effects' outputs for this case, as summarized in Table 5-2; outputs that disagree with their counterparts are marked with an asterisk (*). The failure-relevant inputs for this case are ¬c1 ∧ ¬c2 ∧ c3 (k = 2), ¬c1 ∧ c2 ∧ c3 (k = 4), and c1 ∧ c2 ∧ c3 (k = 8).
Table 5-2: Effects' Outputs for Case 1
    k   c1 c2 c3   e1(A) e1(B)   e2(A) e2(B)   e3(A) e3(B)
    1   0  0  0     0     0       0     0       0     0
    2   0  0  1     0     0       0     0       0     1 *
    3   0  1  0     0     0       0     0       0     0
    4   0  1  1     0     0       1     1       0     1 *
    5   1  0  0     NA    NA      NA    NA      NA    NA
    6   1  0  1     NA    NA      NA    NA      NA    NA
    7   1  1  0     1     1       1     1       0     0
    8   1  1  1     1     1       1     1       0     1 *
5.3.6.2. Case 2: A-CEG has an extra effect
Assume that e1 is the extra effect. According to Table 5-1, a virtual effect with identifier e1 is added into the B-CEG, and e1 := 0 is assigned as its Boolean function. The A-CEG and B-CEG for this case are illustrated in Figure 5-9.
Figure 5-9: Revised A-CEG and B-CEG for Case 2 (identical to Figure 5-7, except that in the B-CEG e1 is a virtual effect with Boolean function e1 := 0)
By employing the unified effect-output determination process shown in Figure 5-5, we determined all effects' outputs for this case, as summarized in Table 5-3; disagreeing outputs are marked with an asterisk. The failure-relevant inputs for this case are c1 ∧ c2 ∧ ¬c3 (k = 7) and c1 ∧ c2 ∧ c3 (k = 8).
Table 5-3: Effects' Outputs for Case 2
    k   c1 c2 c3   e1(A) e1(B)   e2(A) e2(B)
    1   0  0  0     0     0       0     0
    2   0  0  1     0     0       0     0
    3   0  1  0     0     0       0     0
    4   0  1  1     0     0       1     1
    5   1  0  0     NA    NA      NA    NA
    6   1  0  1     NA    NA      NA    NA
    7   1  1  0     1     0 *     1     1
    8   1  1  1     1     0 *     1     1
5.3.6.3. Case 3: A-CEG is missing a constraint
Assume that MASK(e1, e2) is the missing constraint. According to Table 5-1, MASK(e1, e2) is added into the B-CEG. The A-CEG and B-CEG for this case are illustrated in Figure 5-10.
Figure 5-10: Revised A-CEG and B-CEG for Case 3 (identical to Figure 5-7, except that CON_B additionally contains MASK(e1, e2))
By employing the unified effect-output determination process shown in Figure 5-5, we determined all effects' outputs for this case, as summarized in Table 5-4; disagreeing outputs are marked with an asterisk. The failure-relevant inputs for this case are c1 ∧ c2 ∧ ¬c3 (k = 7) and c1 ∧ c2 ∧ c3 (k = 8).
Table 5-4: Effects' Outputs for Case 3
    k   c1 c2 c3   e1(A) e1(B)   e2(A) e2(B)
    1   0  0  0     0     0       0     0
    2   0  0  1     0     0       0     0
    3   0  1  0     0     0       0     0
    4   0  1  1     0     0       1     1
    5   1  0  0     NA    NA      NA    NA
    6   1  0  1     NA    NA      NA    NA
    7   1  1  0     1     1       1     NA *
    8   1  1  1     1     1       1     NA *
5.3.6.4. Case 4: A-CEG has an extra constraint
Assume that REQUIRE(c1, c2) is the extra constraint. According to Table 5-1, REQUIRE(c1, c2) is removed from the B-CEG. The A-CEG and B-CEG for this case are illustrated in Figure 5-11.
Figure 5-11: Revised A-CEG and B-CEG for Case 4 (identical to Figure 5-7, except that CON_B is the empty set)
By employing the unified effect-output determination process shown in Figure 5-5, we determined all effects' outputs for this case, as summarized in Table 5-5; disagreeing outputs are marked with an asterisk. The failure-relevant inputs for this case are c1 ∧ ¬c2 ∧ ¬c3 (k = 5) and c1 ∧ ¬c2 ∧ c3 (k = 6).
Table 5-5: Effects' Outputs for Case 4
    k   c1 c2 c3   e1(A) e1(B)   e2(A) e2(B)
    1   0  0  0     0     0       0     0
    2   0  0  1     0     0       0     0
    3   0  1  0     0     0       0     0
    4   0  1  1     0     0       1     1
    5   1  0  0     NA    1 *     NA    1 *
    6   1  0  1     NA    1 *     NA    1 *
    7   1  1  0     1     1       1     1
    8   1  1  1     1     1       1     1
5.3.6.5. Case 5: A-CEG has a wrong Boolean function that is missing a cause
Assume that the desired Boolean function for effect e1 is e1 := c1 ∨ c4 rather than e1 := c1 (c4 is the missing cause). According to Table 5-1, the desired Boolean function for e1 is updated in the B-CEG. The A-CEG and B-CEG for this case are illustrated in Figure 5-12.
Figure 5-12: Revised A-CEG and B-CEG for Case 5 (identical to Figure 5-7, except that C_B = {c1, c2, c3, c4} and the Boolean function for e1 in F_B is e1 := c1 ∨ c4)
By employing the unified effect-output determination process shown in Figure 5-5, we determined all effects' outputs for this case, as summarized in Table 5-6; disagreeing outputs are marked with an asterisk. The failure-relevant inputs for this case are ¬c1 ∧ ¬c2 ∧ ¬c3 ∧ c4 (k = 2), ¬c1 ∧ ¬c2 ∧ c3 ∧ c4 (k = 4), ¬c1 ∧ c2 ∧ ¬c3 ∧ c4 (k = 6), and ¬c1 ∧ c2 ∧ c3 ∧ c4 (k = 8).
Table 5-6: Effects' Outputs for Case 5
    k    c1 c2 c3 c4   e1(A) e1(B)   e2(A) e2(B)
    1    0  0  0  0     0     0       0     0
    2    0  0  0  1     0     1 *     0     0
    3    0  0  1  0     0     0       0     0
    4    0  0  1  1     0     1 *     0     0
    5    0  1  0  0     0     0       0     0
    6    0  1  0  1     0     1 *     0     0
    7    0  1  1  0     0     0       1     1
    8    0  1  1  1     0     1 *     1     1
    9    1  0  0  0     NA    NA      NA    NA
    10   1  0  0  1     NA    NA      NA    NA
    11   1  0  1  0     NA    NA      NA    NA
    12   1  0  1  1     NA    NA      NA    NA
    13   1  1  0  0     1     1       1     1
    14   1  1  0  1     1     1       1     1
    15   1  1  1  0     1     1       1     1
    16   1  1  1  1     1     1       1     1
5.3.6.6. Case 6: A-CEG has a wrong Boolean function that contains an extra cause
Assume that the desired Boolean function for effect e2 is e2 := c2 ∧ c3 rather than e2 := c1 ∨ (c2 ∧ c3) (c1 is an extra cause). According to Table 5-1, the desired Boolean function for effect e2 is updated in the B-CEG. The revised A-CEG and B-CEG for this case are illustrated in Figure 5-13.
Figure 5-13: Revised A-CEG and B-CEG for Case 6 (identical to Figure 5-7, except that the Boolean function for e2 in F_B is e2 := c2 ∧ c3)
By employing the unified effect-output determination process shown in Figure 5-5, we determined all effects' outputs for this case, as summarized in Table 5-7; disagreeing outputs are marked with an asterisk. The only failure-relevant input for this case is c1 ∧ c2 ∧ ¬c3 (k = 7).
Table 5-7: Effects' Outputs for Case 6
    k   c1 c2 c3   e1(A) e1(B)   e2(A) e2(B)
    1   0  0  0     0     0       0     0
    2   0  0  1     0     0       0     0
    3   0  1  0     0     0       0     0
    4   0  1  1     0     0       1     1
    5   1  0  0     NA    NA      NA    NA
    6   1  0  1     NA    NA      NA    NA
    7   1  1  0     1     1       1     0 *
    8   1  1  1     1     1       1     1
5.3.6.7. Case 7: A-CEG has a wrong Boolean function that contains an incorrect logic operator
Assume that the desired Boolean function for effect e2 is e2 := c1 ∧ (c2 ∨ c3) rather than e2 := c1 ∨ (c2 ∧ c3). According to Table 5-1, the desired Boolean function for effect e2 is updated in the B-CEG. The A-CEG and B-CEG for this case are illustrated in Figure 5-14.
Figure 5-14: Revised A-CEG and B-CEG for Case 7 (identical to Figure 5-7, except that the Boolean function for e2 in F_B is e2 := c1 ∧ (c2 ∨ c3))
By employing the unified effect-output determination process shown in Figure 5-5, we determined all effects' outputs for this case, as summarized in Table 5-8; disagreeing outputs are marked with an asterisk. The only failure-relevant input for this case is ¬c1 ∧ c2 ∧ c3 (k = 4).
Table 5-8: Effects' Outputs for Case 7
    k   c1 c2 c3   e1(A) e1(B)   e2(A) e2(B)
    1   0  0  0     0     0       0     0
    2   0  0  1     0     0       0     0
    3   0  1  0     0     0       0     0
    4   0  1  1     0     0       1     0 *
    5   1  0  0     NA    NA      NA    NA
    6   1  0  1     NA    NA      NA    NA
    7   1  1  0     1     1       1     1
    8   1  1  1     1     1       1     1
5.4 Calculation of the Occurrence Probability of Failure-relevant Inputs
According to Lemma 5-1, the event "an A-CEG fails" is equivalent to the union of the identified failure-relevant inputs. Therefore, this event can be expressed in the form of a fault tree, as depicted in Figure 5-15.
Figure 5-15: Generic Fault Tree Model for A-CEG (the top event, e_j^A(X^k) differing from its counterpart e_j^B(X^k) for at least one effect pair, is the OR over all failure-relevant inputs X^k = (c_1^k, ..., c_n^k), each of which is the AND of its cause states c_1^k, ..., c_n^k)
Thus the task of calculating the occurrence probability of the event "an A-CEG fails" can be decomposed into two sub-tasks:
Sub-task 1: Constructing a fault tree representing the union of all failure-relevant inputs.
Sub-task 2: Calculating the occurrence probability of the top event of the fault tree.
Binary Decision Diagram (BDD) techniques are widely used for fault tree analysis.
With the help of BDD techniques, the task of calculating the occurrence probability of the event "an A-CEG fails" is achieved by accomplishing the following two sub-tasks:
Sub-task 1: Constructing a BDD for the fault tree representing the union of all failure-relevant inputs.
Sub-task 2: Calculating the probability of the BDD's top node.
5.4.1 Representation of a Boolean Expression Using BDD Techniques
The use of Binary Decision Diagrams as a representation of Boolean expressions is regarded as the most powerful approach for fault tree analysis [79][80]. The BDD method does not analyze the fault tree directly, but converts the tree to a binary decision diagram, which represents the Boolean equation of the top event [80].
A Binary Decision Diagram (BDD) is a data structure used to represent a Boolean function. A Boolean function can be represented as a rooted, directed, acyclic graph, which consists of decision nodes and two terminal nodes called the 0-terminal and the 1-terminal. Each decision node is labeled by a Boolean variable and has two child nodes, called the low child and the high child. The edge from a node to a low (high) child represents an assignment of the variable to 0 (1). Such a BDD is called "ordered" if different variables appear in the same order on all paths from the root. It is called "reduced" if the graph is reduced according to two rules [81]:
- Merge any isomorphic sub-graphs.
- Eliminate any node whose two children are isomorphic.
In popular usage, the term BDD almost always refers to a Reduced Ordered Binary Decision Diagram (ROBDD). The advantage of an ROBDD is that it is canonical (unique) for a particular function; this property makes it useful in functional equivalence checking and other operations such as functional technology mapping. Figure 5-16 shows an example of a BDD for the Boolean function f = (x ∧ y) ∨ z.
Figure 5-16: Example of a BDD for the Boolean Function f = (x ∧ y) ∨ z
BDD-based algorithms offer advantages in terms of accuracy and efficiency [80]:
- Efficient manipulation of logic.
- Straightforward treatment of incoherent logic.
- Exact quantification: no need to use rare-event-type approximations.
- Graphical representation of Boolean expressions.
Several BDD generation software packages are freely available, such as CUDD (Colorado University), CAL (UC Berkeley), and BuDDy (IT-University of Copenhagen). This study uses the BuDDy package [82] to convert the Boolean expression of the union of all failure-relevant inputs into a BDD. BuDDy is a powerful library for Boolean expression manipulation; it provides a C++ interface and an efficient implementation of the BDD data structure. Although one can work with BuDDy without understanding what BDDs are, it is worth understanding the BDD structure and its usage when developing large applications that rely heavily on Boolean expressions. An example of using the BuDDy package to construct a BDD for the Boolean logic of "an A-CEG fails" is given in Appendix B.
5.4.2 Recursive Algorithm for Calculating the Occurrence Probability of a BDD's Top Node
After the BDD that represents the Boolean logic of the union of failure-relevant inputs is constructed, the recursive algorithm shown in Figure 5-17 can be used to calculate the probability of the BDD's top node. This recursive algorithm is derived from [80][26]. The operational profile (see Section 5.4.3) is required to perform the calculation, and only the operational profile for the causes that appear in the failure-relevant inputs is needed. We developed a tool to calculate the occurrence probability of a BDD's top node; this tool is based on the BuDDy package [82] and the recursive algorithm shown in Figure 5-17. Sample source code using this tool is given in Appendix B.
Figure 5-17: Recursive Algorithm for Calculating the Occurrence Probability of a BDD's Top Node
    Bdd_Prob_Cal(X)
    /* X  = ite(xi, T, F)
       T  = "True" branch of node xi
       F  = "False" branch of node xi
       PT = probability that the "True" branch reaches terminal node "1"
       PF = probability that the "False" branch reaches terminal node "1" */
    {
        /* Consider the "True" branch */
        If T is terminal node "1"
            PT = 1.0
        else if T is terminal node "0"
            PT = 0.0
        else
            /* Go deeper to find the probability of T by recursively calling this function */
            PT = Bdd_Prob_Cal(T)

        /* Consider the "False" branch */
        If F is terminal node "1"
            PF = 1.0
        else if F is terminal node "0"
            PF = 0.0
        else
            /* Go deeper to find the probability of F by recursively calling this function */
            PF = Bdd_Prob_Cal(F)

        Probability[X] = Probability[xi] x PT + (1 - Probability[xi]) x PF
        Return(Probability[X])
    }
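For readers who prefer an executable form, the recursion of Figure 5-17 can be sketched in Python over a hypothetical node structure. This is not the BuDDy API, and a production implementation would cache results for shared sub-nodes rather than recursing naively.

    class BddNode:
        def __init__(self, var, high, low):
            # high = "True" branch, low = "False" branch; terminals are the integers 0 and 1
            self.var, self.high, self.low = var, high, low

    def bdd_prob(node, p):
        # p maps each cause (BDD variable) to its occurrence probability from the OP
        if node in (0, 1):
            return float(node)                      # terminal nodes
        p_true = bdd_prob(node.high, p)             # probability of reaching terminal "1" via True branch
        p_false = bdd_prob(node.low, p)             # probability of reaching terminal "1" via False branch
        return p[node.var] * p_true + (1.0 - p[node.var]) * p_false

    # Example: the BDD of (c1 AND c2) with P(c1) = 0.3 and P(c2) = 0.6
    n_c2 = BddNode("c2", 1, 0)
    n_c1 = BddNode("c1", n_c2, 0)
    print(bdd_prob(n_c1, {"c1": 0.3, "c2": 0.6}))   # 0.3 * 0.6 = 0.18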
5.4.3 Operational Profile (OP)
The usage of the software is obviously a very important element in software reliability quantification; therefore, the expected usage must be taken into consideration when estimating or predicting software reliability [3]. Several techniques exist to specify this usage, including the Operational Profile (OP) and Markov usage models [83]; these are different approaches to modeling software usage for the same purpose. An operational profile (OP) is the estimated relative frequency of each "operation" that a system under test supports [84]. It associates a set of probabilities with the program input space and therefore describes the behavior of the system [84]. An OP is traditionally evaluated by enumerating field inputs and evaluating their occurrence frequencies.
Musa [84] pioneered a five-step approach to developing an OP. His approach is based on collecting information on customers and users, identifying the system modes, determining the functional profile, and recording the input states and their associated occurrence probabilities experienced in field operation. Musa's approach has been widely utilized and adapted in the literature to generate OPs [85]. For instance, Elbaum and Narla [86] refined Musa's approach by addressing heterogeneous user groups: they discovered that a single operational profile only "averages" the usage and "obscures" the real information about the operational probabilities, and they utilized clustering to identify groups of similar customers. Sandfoss [85] suggests that the estimation of occurrence probabilities could be based on numbers obtained from project documentation, engineering judgment, and system development experience. Gittens et al. [87] proposed an extended OP model which is
This assumption is not always true and their approaches are not always successful simply because some input data may not be available, especially for safety critical control systems [40]. Addressing to this issue, the UMD research team [40] extended these approaches and generated a systematic method to identify those environmental variables and estimate all the environmental inputs. Generating OP at earlier development phases is even more challenging. However, instead of discussing this open topic, this study assumes that the associated OP has already been collected before using the algorithm shown in Figure 5-1 to predict reliability for a software system. Moreover, it is assumed that OP is given in the form of a set of occurrence probabilities for all distinct causes that appear in the failure- relevant inputs. 5.5 Summary This chapter presents an automation-oriented algorithm for quantifying the impact of detected faults on software reliability applicable in requirements analysis stage. The proposed quantification algorithm is also applicable during later development phases, such as coding and testing phase, where the potential savings are less. 113 Chapter 6: Examination of the Applicability of the Proposed CEGA Techniques for Early-stage Software Reliability Prediction by Case Studies A rigorous definition of software measurement does not guarantee its applicability (in terms of feasibility and scalability) in practice. Feasibility is an indispensable attribute of a technique, which indicates its capability of being used or dealt with successfully. Scalability is a desirable attribute of a technique, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged. One serious drawback of past and current software engineering research is lack of scalability. Researchers have developed techniques that work only on small systems. Mathematical techniques have, for the most part, been used only on very limited properties and on unrealistically small problems. Most any analysis technique works on a toy problem. There is reason to believe that software development in the large is so different than the toy problems found in most research papers that many published techniques may not apply to real projects. We need to find a balance of formal and informal techniques that scale by considering problems of realistic size and complexity from the start. Given the complexity of the systems we are attempting to build, the only convincing argument that an approach will work in practice is to validate techniques on real systems. To evaluate the feasibility and scalability of the proposed CEGA techniques for quantification of software reliability at an early development stage, we conducted two 114 case studies. The first case study was carried out against a smaller application, whose SRS has 32 pages and 402 sentences4. The second case study was against a larger application, whose SRS has 289 pages and 3492 sentences. This chapter reports the procedure, results and analysis of these two case studies. 6.1 Applications Used Case Studies The smaller application used in the first case study is Personal Access Control System (PACS) [88]. PACS is a simplified version of an automated personnel entry access system used to provide privileged physical access to rooms/buildings. PACS system provides physical access to a restricted area to authorized users based on a personal ID card and personal identification number (PIN). 
In order to get access, the user swipes an ID card which contains user?s name and social security number (SSN) through a card reader. After using its database of user?s names and SSNs to validate user?s privileges, PACS system instructs the user to enter a four-digit PIN number. If the entered PIN matches a stored PIN, the system allows the user to enter the area through a gate. PACS guides the user of the system with messages written on a single-line display screen. A security officer monitors and controls the PACS using a console with another single-line display screen, an alarm, a reset button, and a gate override button. In its current form, requirements specification for PACS originated from a US government agency. The larger application used in the second case study is called ?SXXX?, because this application is copyrighted and its real name is not allowed to be exposed to the 4 Note that the numbers of sentences measured in this study were all rough estimates because we did not adopt any strict rules for sentence counting. 115 public. SXXX [40] is a safety-critical and real-time software system used in the nuclear domain. The SXXX System is based on the SXXX Processor Module. The SXXX Processor Module contains both discrete and high level analog input and output circuits. These circuits read input signals from the plant and send outputs that can be used to provide trips or actuations of safety system equipment, control a process, or provide alarms and indications. The transfer functions performed between the inputs and outputs are dependent on the software that is installed in the module. The SXXX system was installed in 1995 to partially upgrade an existing analog reactor protection system. 6.2 Procedure The two case studies are called Case Study A and Case Study B. Case Study A: Applying the proposed CEGA techniques to PACS. The primary purpose of this case study is to examine the feasibility of the proposed techniques. Case Study B: Applying the proposed CEGA techniques to SXXX. The primary purpose of this case study is to examine the scalability of the proposed techniques. We hired a graduate student to carry out these two case studies. Case Study A was conducted first. After the feasibility of the proposed techniques was confirmed in Case Study A, Case Study B was conducted to examine the scalability of the proposed techniques. Administrative measures were taken to ensure the quality of results, such as partially validating the results by the author. 116 The same steps were taken for both case studies. These steps are as follows: Step 1: Construct an A-CEG for the SRS and document any detected faults. The A-CEG is expressed in the mathematical format. Step 2: Detect faults in the SRS. Step 3: Quantify the impact of detected faults on software reliability. Step 4: Document all results. The techniques and tools required to perform these steps are summarized in Table 6-1. Table 6-1: Steps vs. Required Techniques/Tools? Steps Task Sub task Required techniques/tools Step 1 A-CEG construction Identifying causes ? The general CEG construction procedure and the general CEG construction guidelines (described in Section 3.2) ? Mathematical expression of CEG (described in Section 3.5) Identifying effects Identifying logical relationships Identifying constraints Step 2 Detection of SRS faults Detecting SRS ambiguities ? CEGA-based ambiguities review list (described in Section 4.5.2) Validating A- CEG ? 
CEG validating technique (described in Section 4.5.3) Step 3 Quantification of the impact of detected faults on software reliability A-CEG revision ? A-CEG revision and B-A-CEG Construction Rules (described in Section 5.3.2) B-CEG construction Identifying failure-relevant inputs ? Failure-relevant identification algorithm (described in Section 5.3) ? In this table, the ?General CEG construction procedure and the general CEG construction guidelines?, ?CEGA-based ambiguities review list?, and ?BDD techniques? were developed by others rather than the author. 117 Steps Task Sub task Required techniques/tools BDD construction ? BDD techniques for fault tree expression (described in Section 5.4.1) ? the BuDDy package [82] Calculating the occurrence probability of BDD?s top node ? The Fundamental Lemma (Lemma 5-1) ? Recursive algorithm (shown in Figure 5-17) ? BDD top node occurrence probability calculation tool. The sample source code of using this tool is illustrated in Appendix B. Step 5 Documentation of results (None) (None) 6.3 Results and Findings Both case studies (Case Study A and Case Study B) clearly confirmed the technical feasibility of the proposed techniques for software reliability prediction at an early development stage. Table 6-2 summarizes the tasks, the required techniques to perform the tasks, and the scalability of the techniques in these two experiments. Table 6-2: Scalability of the Proposed Techniques? Task Sub task Proposed techniques Comments on the scalability A-CEG construction Identifying causes ? General A-CEG construction guidelines ? Mathematical expression of CEG Scalability of using Mathematical expressions for A-CEG was confirmed. Scalability of identifying A- CEG elements (causes, effects, logical relationships, and constraints) using the general guidelines was not scalable. Identifying effects Identifying logical relationships Identifying constraints Detection of SRS faults Detecting SRS ? CEGA-based ambiguities review list Scalability was confirmed. Commercial tools are ? In this table, the ?General CEG construction procedure and the general CEG construction guidelines?, ?CEGA-based ambiguities review list?, and ?BDD techniques? were developed by others rather than the author. 118 Task Sub task Proposed techniques Comments on the scalability ambiguities available. Validating A-CEG ? CEG validating algorithm ? CEG validating rules Not scalable. Domain knowledge is required to perform this sub-task. Quantification of the impact of detected faults on software reliability A-CEG revision ? A-CEG revision and B-CEG Construction Rules Scalability was confirmed. No tools are required for these two sub-tasks B-CEG construction Identifying failure- relevant inputs ? Failure-relevant identification algorithm ? Generic A-CEG fault tree model Scalability was confirmed. Ready for automation. BDD construction ? BDD techniques Scalability was confirmed. Free tools are available. Calculating the occurrence probability of BDD?s top node ? Recursive BDD?s top node occurrence probability calculation algorithm Scalability was confirmed. Tools were developed. The size of the SRSs, the A-CEGs, and the cause-effect measures for these two experiments are summarized in Table 6-3. 
Table 6-3: A-CEGs and CE(%) for PACS and SXXX

Application | Pages | Sentences | Causes | Effects | Logical relationships | Constraints | CE (%)
PACS | 32 | 402 | 13 | 14 | 47 | 6 | 78.45
SXXX | 289 | 3492 | 255 | 506 | 1608 | 30 | 85.71
(Pages and Sentences give the size of the SRS; Causes through Constraints give the size of the A-CEG.)

The detailed results of Case Study A (for PACS) are given in Appendix C. For comparison purposes, the graphical expressions of the A-CEG and B-CEG are also included in Appendix C, although they are not required by the experiment. Refer to our technical report [40] for the detailed results of Case Study B (for SXXX). In this section, we only report the high-level findings that are most relevant to our study objectives.

Finding 6-1: There is no obvious pattern in the distribution of detected faults across the CEG fault categories, as shown in Figure 6-1 and Figure 6-2.

Figure 6-1: Distribution of Detected Faults in Case Study A (for PACS) — missing effect: 2; missing cause(s) in a Boolean function: 2; extra effect, missing constraint, extra constraint, extra cause(s), and wrong Boolean operator(s): 0

Figure 6-2: Distribution of Detected Faults in Case Study B (for SXXX) — missing effect: 1; missing cause(s) in a Boolean function: 2; extra cause(s) in a Boolean function: 2; wrong Boolean operator(s) in a Boolean function: 2; extra effect, missing constraint, and extra constraint: 0

Finding 6-2: There is an overwhelming presence of the IDENTITY logical relationship between causes and effects, as shown in Figure 6-3 and Figure 6-4. This is a rather interesting finding, since it might be useful in the future development of a tool.

Figure 6-3: Distribution of Logical Relationships in PACS' A-CEG — IDENTITY: 72%, OR: 15%, NOT: 9%, AND: 4%

Figure 6-4: Distribution of Logical Relationships in SXXX's A-CEG — IDENTITY: 65%, AND: 20%, NOT: 8%, OR: 7%

Finding 6-3: CEGA is effective in finding critical SRS faults (such as missing effects and wrong Boolean functions). Table 6-4 shows the numbers of (both critical and non-critical) faults detected in the SRSs of PACS and SXXX and the effort spent in detecting these faults. For information purposes, other SRS-based measurement results from our previous research [40][42] are included in Table 6-4 and illustrated in Figure 6-5 and Figure 6-6. These measurements include the Completeness measurement, the Defect Density measurement (using the Requirements Inspection technique), and the Requirements Traceability (RT) measurement. It should be noted that the human factor should be taken into account when interpreting the data in the table, because these four measurements were implemented by different persons.

Table 6-4: Number of Detected Faults vs. Effort in Using SRS-related Measurements

Measurement | PACS: detected faults | PACS: effort (staff-hr) | SXXX: detected faults | SXXX: effort (staff-hr) | Average staff-hr per fault
CEGA | 4 (3) | 30 | 7 (7) | 385.5 | 37.8
Requirements Completeness | (not implemented) | - | 29 (4) | 285.5 | 9.8
Defect Density (Requirements Inspection) | 9 (4) | 40 | 8 (5) | 450 | 28.8
Requirements Traceability (RT) | 2 (2) | 35 | 5 (3) | 417 | 64.6

Notes: The numbers in parentheses are the numbers of critical SRS faults. These measurements were implemented by different persons. Three inspectors and two moderators participated in PACS' SRS inspection; two inspectors and one moderator participated in SXXX's SRS inspection. RT cannot be implemented until the end of the coding phase.

Figure 6-5: Number of Detected Faults vs. Effort for PACS
Figure 6-6: Number of Detected Faults vs. Effort for SXXX
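For clarity, the average staff-hours-per-fault figures in the last row of Table 6-4 appear to be obtained by dividing the total effort by the total number of detected faults across the two applications:

    CEGA:                                  (30 + 385.5) / (4 + 7)  = 415.5 / 11 ≈ 37.8
    Requirements Completeness:             285.5 / 29              ≈ 9.8
    Defect Density (Requirements Inspection): (40 + 450) / (9 + 8) = 490 / 17   ≈ 28.8
    Requirements Traceability:             (35 + 417) / (2 + 5)    = 452 / 7    ≈ 64.6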
Finding 6-4: Implementation of CEGA is very time-consuming. A considerable amount of time was spent manually "parsing" the natural-language SRS documents to construct an A-CEG and to identify SRS faults. This pattern is more pronounced for a larger SRS (SXXX, in this case), as shown in Figure 6-7 and Figure 6-8. The time spent identifying failure-relevant inputs and documenting results, which accounts for about 29% of the total effort in Case Study A and 23% in Case Study B, could be significantly reduced if automation tools had been available. We believe it is a relatively easy task to develop automation tools that reduce the time spent in these two activities, based on the automation-oriented algorithm proposed in Chapter 5 (see Section 5.3). However, it is very challenging to reduce the time spent identifying SRS faults, since the effectiveness and efficiency of our CEGA-based SRS fault detection method is highly dependent on the ability of the person(s) using the method. Relevant resources, such as the user specification document and access to an end user, help the inspector improve his/her understanding of the system and identify faults in the A-CEG. Training is also very helpful. Up to this point, it is not clear why A-CEG construction is so time-consuming or how to reduce the time spent in this activity. We revisit this topic in Chapter 7.

Figure 6-7: Distribution of Efforts in Case Study A (for PACS) — constructing the A-CEG: 36%; identifying SRS faults: 27%; identifying failure-relevant inputs: 17%; documenting results: 12%; constructing the B-CEG: 5%; calculating the A-CEG's failure probability: 3%

Figure 6-8: Distribution of Efforts in Case Study B (for SXXX) — constructing the A-CEG: 42%; identifying SRS faults: 32%; identifying failure-relevant inputs: 15%; documenting results: 8%; constructing the B-CEG: 2%; calculating the A-CEG's failure probability: 1%

Finding 6-5: The process of manually parsing an SRS is error-prone. The effectiveness and efficiency of the CEGA measurement are highly dependent on the ability of the person exercising the measurement. It is highly recommended to assign an analyzer who knows the software very well to perform CEGA, mainly because it is not easy to identify the true logical relationships between the causes and the constraints. Without prior knowledge of the system, the defects found through CEGA may not be correct, and the final reliability prediction may not be very meaningful. A two-week-long training on the measurement and on the domain knowledge is suggested before an inspector carries out this measurement.

Finding 6-6: Metrics used early, such as the cause-effect measure, can aid in the detection and correction of requirements faults, which leads to the prevention of errors later in the life cycle. Finding these problematic areas in the requirements phase decreases the cost and prevents potential ripple effects from changes later in the development life cycle.
The benefits of finding and correcting problems in the requirements phase has been demonstrated in the CEGA measurement, making a strong argument for pursuing this approach and building in reliability starting at the requirements phase. Finding 6-7: The primitives (for example, the number of ambiguities in SRS) of the cause-effect measure are somewhat subjective. Repeatability of the cause-effect measure is not guaranteed. The domain knowledge and other subjective factors, to some extent, highly affect the inspector?s judgment. Therefore, it is not appropriate for quantitatively assessing the quality of the SRS. Finding 6-8: The reliability of this measurement usually under-estimates the reliability of the final source code since SRS faults may be detected and fixed during later development phases. In practice, many of the ambiguities are identified and avoided during later development activities, such as design, coding, and testing. 125 6.4 Summary The CEGA process is an old concept whose usage can be expended from merely a testing tool to a useful SRS validation and software reliability prediction tool. The feasibility and scalability of our approach for early software reliability prediction has been examined against two real applications. Although feedbacks from the use of the proposed techniques have been encouraging, there are a number of areas that require further investigation. According to the results of Case Study A and Case Study B, the feasibility of our approach is clearly confirmed and the scalability is the top issue that needs to be addressed. The scalability bottleneck of our approach lies in A-CEG construction and SRS faults detection, which account for more than 60% of the effort spent in implementing CEGA. The SRS faults detection requires domain knowledge. This part is less likely to be scalable. Although Case Study B showed that A-CEG construction using the general A-CEG construction guides was not scalable, we hypothesized that this was caused by the A-CEG contraction method (the general A-CEG construction guidelines). Two questions thus arise: ?Is A-CEG construction scalable?? and ?Are there other techniques that enable the scalability of A-CEG construction?? The answers to these questions are uncertain and can only become clear after extensive exploration of the nature of natural language SRSs. 126 Chapter 7: Exploration of the Scalability of A-CEG Construction The starting point for our approach is the SRSs, which are typically expressed in the form of natural language statements. However, the uses of natural language for specifying requirements, which are so important for human communication, represent an obstacle to automatic analysis of SRS. Especially, the results of Case Study B (see Chapter 6) showed that one of the scalability barriers in our approach lied in A-CEG construction. We wanted to answer the question of whether A-CEG construction is scalable or not and attempted to provide solution(s), if possible, to overcome the scalability barrier caused by A-CEG construction. Because quantitative research requires large sample sizes and such a sampling was not feasible for our study, a qualitative case-study approach was employed to understand the nature of SRSs and A-CEG construction. During the 2006 Fall Semester, we designed an empirical study (called Empirical Study C) and offered an independent study project titled ?Rule-based A-CEG Elements Extraction for Software Requirements Documents? 
to senior undergraduates at The University of Maryland pursuing Electrical Engineering or Computer Engineering related majors. The focus of this project was to provide insight about characteristics of SRSs, determine the feasibility of automatic A-CEG elements extraction, and lay the ground work for developing SRS-specific information retrieval and/or text mining tools for automatic A-CEG elements extraction. The project was 127 part of Empirical Study C and served for educational purposes in accordance with regulations of the university. This chapter reports the objectives, procedure, detailed findings and analysis pertinent to Empirical Study C. 7.1 Objectives The objectives of Empirical Study C were: 1) to study the characteristics of SRS, gain insight into the nature of SRS related to A-CEG construction, and obtain empirical information that leads to greater understanding of A-CEG construction. 2) to observe, collect, and distill the ?patterns? in A-CEG construction. 3) to identify factors impacting the A-CEG elements identification. 4) to explore method/rules to enhance the scalability of A-CEG construction. 5) to provide SRS writers with caveats to avoid some common problems found in SRSs which were specified in plain English. These problems not only add difficulties in identifying A-CEG elements, but also might lead to increased risks of unreliable software products. While many generic and successful information retrieval and text mining techniques/tools exist, we wanted to explore the possibility of an SRS-specific method, and lay the ground work for developing information retrieval and/or text mining tools for automatic A-CEG elements extraction. 7.2 Methodology and Procedure Empirical Study C had three steps: 128 Step 1: Experiment preparation. Step 2: Implementation of the independent study project. Step 3: Postmortem analysis and improvement. 7.2.1 Step 1: Experiment Preparation We selected nine publicly available SRSs from different sectors (government, military, industry, and academia) used for Step 2. All of these SRSs were written in English and followed the format recommended by IEEE standard IEEE Std. 830-1998 [53]. These SRSs were: SRS1: for MRC-II System [89], a system software as a replacement of the current MRC software providing a real-time robot control system for research in Computer-Integrated Surgery. SRS2: for DPUFSW [90], a major component of the data processing unit in an airplane control software. SRS3: for Long Range Advanced Scout Surveillance System [91], a system operable in both a stationary vehicle mounted configuration and in an autonomous dismounted configuration, which determines far target location coordinates, and provides a real-time target detection, recognition, and identification capability to the scout while permitting 24- hour adverse weather operations. SRS4: for Qheadache [92], a computerized game that displays an interface used to solve a specific headache (puzzle). SRS5: for The Graph Editor [93], an interactive application that allows the user to create, edit, layout, save, and print arbitrary graphs commonly used in 129 software engineering. It uses the GXL graph notation standard for storing graphs to files. SRS6: for ?Software Engineering Tool? [94], an application for aiding the employees in the process of developing software. 
SRS7: for the BTS (Bus Tracking System) [95], a system intended to assisting passengers with route planning, inform passengers of delayed busses, improve inter-bus transfers by informing bus drivers of connecting busses that are running behind schedule, help transit management produce accurate schedules, and help transit management allocate resources more efficiently. SRS8: for FloristExchange [96], an e-commerce project that deals with selling flowers online. The resulting website contains a catalog, a shopping cart, and other features that enable the webmaster to effectively manage. SRS9: for PICASSO [97], a major component of the Requirements Assistant System that provides an environment in which a group of developers can collaborate on the production of a set of software requirements. Due to the time limitation, the student only worked on two pre-selected segments (about 10 pages in total) for each SRS, although the entire SRS was provided to him. Each SRS segment contained a complete section for a functional module, including ?Input Section?, ?Processing Section?, and ?Output Section?. 130 7.2.2 Step 2: Implementation of the Independent Study Project This step lasted 16 weeks/3.5 months (not including two holiday breaks): 4 weeks for the training sessions, 10 weeks for the SRS analysis sessions, and 2 weeks for the finalization session. Figure 7-1 depicts the entire timeline for this step. Figure 7-1: Timeline for Implementing the Independent Study Project The focus of this step was to parse the selected SRSs segments and analyze how the A-CEG elements, including the causes, effects, logical relationships, and constraints, are found. The observation on the exploration process of the student in Case Study A and Case Study B (see Chapter 6) suggests that A-CEG elements are identified using the so-called ?pattern-matching? approach [98]. One of the Natural Language Processing (NLP) techniques closest to our research purposes Pattern- matching is the act of checking for the presence of the constituents of a given pattern. Pattern-matching is usually achieved on the basis of a set of pre-defined rules. A rule is a generalized statement that describes what is and what is not an A-CEG element in most or all cases. A ?pattern? can be represented by a sequence of indicators which would point us to A-CEG element(s). Indicators are signals to us that aid us identify A-CEG elements in a SRS. These indicators are often verbs, nouns, prepositions that seem to be associated with the rules. This study preferred the rule-based pattern- 131 matching approach for our research. Therefore, an attempt to establish a database of rules and indicators was also made in this step. Before entering the SRS analysis sessions, the selected student was trained for 4 weeks on IEEE Std. 830-1998 [53] and knowledge of A-CEG construction. Self- study materials were assigned and help sessions were also provided to the student. Finally, the student was given a quiz in which he was asked to identify A-CEG elements in a sample SRS. After the quiz, we discussed with the student the list of known A-CEG elements and what A-CEG elements he had actually found. We accounted for correctly identified A-CEG elements (true positives), missed A-CEG elements (false negatives), and wrongly identified A-CEG elements (false positives), and analyzed the cause of the false negatives and false positives, and tried to identify improvements9. 
All of these steps were taken to ensure that the student had mastered the technique on A-CEG construction. After we were convinced that the student was capable of performing the SRS analysis tasks, Empirical Study C was shifted to SRS analysis sessions. There were 10 such sessions, each of which lasted one week. Each session consisted of an individual analysis of SRS performed by the student (about 6 hours), and a following face-to-face discussion (about 2 hours) joined by the student and the author. The student first followed the workflow shown in Figure 7-2 to extract rules and indicators for identification of A-CEG elements. It should be pointed out that it is rather subjective for the student to determine whether the reasoning in identifying the 9 The student got 70% for Accuracy, 80% for Recall, and 84% for Precision in the quiz. See Chapter 8 for definitions of Accuracy, Recall, and Precision. 132 A-CEG elements is covered by existing rule(s) or not, since his judgment was primarily based on common sense and could be limited to his domain knowledge, experience, and other subjective factors. To ensure the quality of experimental results, we asked the student to record down information pertinent to his judgment, such as the detected CEG element(s), and the reasoning that led to the identification of the CEG element(s), the extracted rules and indicators, and so on. He was also required to document any issues that he had encountered during the individual session. He then illustrated the documented results and worked through the collected rules and indicators with the author during the discussion session. In addition, problems, difficulties, and other issues regarding performance of the analysis tasks were exposed and discussed as well. As such, we could make sure that the project was always on the right track. 133 Figure 7-2: Workflow for Extracting Rules and Indicators for identification of A- CEG Elements After completing all of the ten SRS analysis sessions, the student was asked to finalize the results (including the database containing the extracted rules and indicators) and submit a report for the independent study project. The database is expressed in the form of a Microsoft Excel? worksheet containing related information, such as the SRS segment, the page number and sentence number on the SRS segment, the detected CEG element(s), the reasoning that led to the 134 identification of the CEG element(s), the rule(s) description, and/or the extracted indicator(s), and so on. There were quite a few implied A-CEG elements (especially causes and logical relationships) in the parsed SRS segments. We were very cautious to make a rule when implication was used, since the process by which the implied A-CEG element was found was rather subjective. Note that there were two analysis sessions assigned for SRS4 (including a regular and a re-work session) because of the degree of difficulty and peculiarity of the text. The difficulties with analyzing SRS4 the first time can be attributed to the fact that it contained a large number of implied causes and the student had no idea how to handle implied A-CEG elements. 7.2.3 Step 3: Postmortem Analysis and Improvement In this step, the author further improved the rules and indicators obtained in Step 2 by making the rules more general and unified, and distilled a set of A-CEG Construction Rules. Moreover, the potential influencing factors for both A-CEG construction and the use of the A-CEG Construction Rules were identified and analyzed. 
Suggestions that would ease the task of A-CEG construction and help avoid some common problems in SRSs were also made during this step. The process used to distill the A-CEG Construction Rules is shown in Figure 7-3. It should be noted that we did not define any strict criteria for classifying rules/indicator categories. In addition, there might be other better ways to extract and express the rules for A-CEG construction based on the database obtained in Step 2. 135 Figure 7-3: Process Used to Distill the A-CEG Construction Rules Begin End IndicatorIs this a rule or an indicator ?Rule Take an item in the rules and indicators database Put this rule into the existing rule category Does the rule belong to any known rule category ? Have all items in the database been analyzed ? Yes No Create a new rule category and put this rule into the new rule category Put this indicator into the existing indicator category Does the indicator belong to any known indicator category ? Yes No Create a new indicator category and put this rule into the new rule category Count the number of times that the rule categories have been hit (hit count) Rank the rule categories according to the hit count (from high to low ) No Add the rule category into the A - CEG Construction Rules set Is the rule category hit by different SRSs ?Yes Rephrase rules and assign names for the rule categories Link indicators to the appropriate A - CEG Construction Rule . Yes Select a rule category according to the hit count ranking (from high to low ) No Mark the rule category as ?to be determined? (for future research) Yes Have all rule categories been analyzed ? Yes No 136 7.3 Results and Discussion 7.3.1 Database of Rules and Indicators for A-CEG Elements Identification Through Step 2, we acquired a fairly large database containing rules and indicators for identification of A-CEG elements. Below are the figures (Figure 7-4 and Figure 7-5) summarizing the numbers of rules and indicators we found for each SRS segment. Figure 7-4: Number of Rules Extracted from Selected SRSs Figure 7-5: Number of Indicators Extracted from Selected SRSs Figure 7-4 shows that in the first 6 SRS documents we developed a good amount of rules; after that the identification of rules dropped off. Figure 7-5 shows that the indicators stay fairly constant across all SRS?. These statistics suggest that we would 11 12 9 4 3 6 4 2 1 11 23 32 36 39 45 49 51 0 10 20 30 40 50 SRS1 SRS2 SRS3 SRS4 SRS5 SRS6 SRS7 SRS8 SRS9 Nu mb er of Ru les New Known 20 27 22 21 15 28 29 18 19 20 47 69 90 105 133 162 180 0 50 100 150 200 SRS1 SRS2 SRS3 SRS4 SRS5 SRS6 SRS7 SRS8 SRS9 Nu mb er of Ind ica tor s New Known 137 just continue adding indicators to our list and that on the other hand few new rules would be found for the database if we had analyzed more SRS segments. 7.3.2 A-CEG Construction Rules The following rules were developed and distilled on the basis of a database of rules and indicators obtained from Step 2. These rules often depend on prepositions, punctuation, and sentence structure. Additionally, very often they need to be used in conjunction with indicators to determine A-CEG elements. These indicators were loosely grouped to maximize hit potential. Rule 7-1: Action-word Rule (for identifying events) An action word in a requirements statement indicates at least an event (or events, if Rule 7-2 is applicable). 
Action words are those that are used to specify control of physical input/output exchanges in a program, or implementation of application actions, such as reading a screen, submitting a query to a database, opening or closing a file. An action word is usually a verb which indicates some activities performed. Typical action words include: call calculate check close create delete display grant halt initiate notify open prevent print prohibit prompt protect provide quit read remove require reset retrieve return save send set 138 share store transfer transmit use validate verify write Example: ?The function displays the message on the screen? is identified as an event because it contains ?displays? that is regarded as an action word. Rule 7-2: Atomic-event Rule An event must be a non-divisible activity. In other words, a complex activity, typically signified by an ?and? or an ?or?, must be decomposed into several atomic events. All events should be mutually exclusive in the sense that no one event is part of another. Example: ?If the user presses down the right mouse key and then press left mouse key, ?? contains two events ?the user presses down the right mouse key? and ?then (the user) press left mouse key?. Rule 7-3: Lowest-level-event Rule In a cause-effect graph, an event must be the lowest level of activity. If an event is reiterated by other more detailed events, the following steps are applied: 1. This event should not be identified as a complex (non-atomic) event. 2. This event should be replaced by the lowest-level events. 3. All lowest-level events should be identified as events. 4. The constraint EXCLUSIVE should be applied to these lowest-level (events). Example: ?The user provides the input from the keyboard. There are four options provided by the user: option 1, option 2, option 3, and option 4.? 139 In this case, ?the user provides the input from the keyboard? should not be identified as an event. Instead, ?the user provides option 1?, ?the user provides option 2?, ?the user provides option 3?, and ?the user provides option 4? are identified as four distinct events. Besides, the EXCLUSIVE constraint is applied among these four events. Rule 7-4: No-duplicate-event Rule In a cause-effect graph, every event is unique. In other words, duplicate events should be removed from a cause-effect graph. Rule 7-5: Logical-relationship-pattern Rule The logical relationships among events are identified by matching the sentence to one of the following four basic patterns: 1. IDENTITY pattern: IF event a THEN event b. 2. NOT pattern: IF NOT event a THEN event b. 3. AND pattern: IF event a1 AND event a2 THEN event b. 4. OR pattern: IF event a1 OR event a2 THEN event b. The functional requirements specifications can be casted into sentences of one of the above forms. Therefore, the problem of identifying logical relationships should be one of finding the keywords: IF, THEN, AND, OR, and NOT. Unfortunately we have to deal with the real world of specifications and specification writers, where clarity ranges from elusive, through esoteric, into incomprehensible. It takes intelligence to disentangle intentions that are hidden by ambiguities inherent in English and by poor English usage. 140 Here is a sample of phrases that have been or can be used (and abused) for the indicators we need. Be aware that this is not a list of recommended synonyms for specification writers. Several entries appear in more than one sub-list indicating a source of danger. 
Besides, there are other dangerous phrases, such as ?respectively,? ?similarly,? ?conversely,? ?and so forth,? and ?etc.? IF based on based upon because but if if and when only if only when provided that when when or if whenever THEN consequently implies that infers that means that shall should then will would AND all and as well as both but in conjunction with coincidental with consisting of comprising either or furthermore in addition to including jointly moreover mutually plus together with total with OR and and if then and/or 141 alternatively any of anyone of as well as but case contrast depending upon each either either . . . or except if conversely failing that furthermore in addition to nor not only . . . but although other than otherwise or or else on the other hand plus NOT but but not by contrast besides contrary conversely contrast except if excluding excepting fail failing less neither never no not other than Rule 7-6: Single-If Rule For a group of expressions in form of ?If Expression 1, then Expression 2; otherwise Expression 3?, the following steps are applied: 1. Expression 1 consists of at least an event. 2. Expression 2 consists of at least an event. 3. Expression 3 consists of at least an event. 4. Omission of Expression 2 or Expression 3 is regarded as a fault. Rule 7-7: Nesting-If Rule 142 For a group of expressions in form of ?If Expression 1-1, then Expression 2-1; If Expression 1-2, (then) Expression 2-2; If Expression 1-3, (then) Expression 2-3; ??, the following steps are applied: 1. Expression 1-1, Expression 1-2, Expression 1-3, ?, consists of at least an event, respectively. 2. Expression 2-1, Expression 2-2, Expression 2-3, ?, consists of at least an event, respectively. 3. Constraint EXCLUSIVE should be applied to event(s) in Expression 1-1 and event(s) in Expression 1-2, Expression 1-3, etc. 4. Omission of any of Expression 2-1, 2-2, 2-3, etc., is regarded as an SRS fault. Example: ?If the choice is ?1?, the function Range will be initiated. If the choice is ?2?, the function Speed will be initiated. If the choice is ?3?, the function Trajectory will be initiated. If the choice is ?4?, the function quits. For all other options the function ?Error? will be initiated.? In this case, there are ten events, including ?the choice is ?1??, ?the choice is ?2??, ?the choice is ?3??, ?the choice is ?4??, ?all other options?, ?the function Range will be initiated?, ?the function Speed will be initiated?, ?the function Trajectory will be initiated?, ?the function quits?, and ?the function ?Error? will be initiated?. Besides, the constraint EXCLUSIVE is applied to the first five events. Rule 7-8: Sequentially-triggered-event Rule For sequential events, e1, e2, ?, en, that are triggered by an event (either atomic or non-atomic), e, the following steps are applied: 143 1. The IDENTITY logical relationships should be applied to the triggering event and the triggered events in form of 1 2: , : , , : .ne e e e e e= = =? 2. A series of REQUIRE constraints should be applied to the triggered events in form of REQUIRE (en, en-1,), REQUIRE (en-1, en-2,), ?, REQUIRE (e2, e1,). 
Example: If event A leads to four events, event 1, event 2, event 3, and event 4 and these four events should occur one after another, the following logical relationships and constraints should be applied to them: Alternatively, the A-CEG snippet for this requirements segment can be graphically expressed as follows: Rule 7-9: External-Actor Rule (for identifying causes) event 1:= event A event 2:= event A event 3:= event A event 4:= event A REQUIRE (event 4, event 3), REQUIRE (event 3, event 2) REQUIRE (event 2, event 1) 144 An event performed by external entities is a cause. An external entity (called external actor) is a human or external system/function/application that communicates with the system, function, or application under discussion. Typical indicators for external actor(s) are: alarm card card-reader client CPU database file keyboard LED message microprocessor monitor RAM record screen supervisor timer user Example: ?If the input value is greater than 0 and in the format F10.4? contains two events ?the input value is greater than 0? and ?the input value is in the format F10.4?. These two events are identified as causes because their actor is the user who provides the software with the input values. Rule 7-10: Is-Are Rule (for identifying causes) Even without any action words, the status description of external entities signifies a cause(s). Example: ?If the input value is greater than 0 and in the format F10.4? contains two causes ?the input value is greater than 0? and ?the input value is in the format F10.4?. Rule 7-11: Internal-Actor Rule (for identifying effects) An event performed by the function under discussion (called internal actor) is an effect. Typical indicators for internal actor(s) are: 145 this algorithm this application this function this module this system Example: ?This function should initiate Function Interface after its execution? is identified as an effect. Rule 7-12: Default Actor Rule (for identifying causes) If not specified, an event by default is performed by the function under discussion and thus is identified as an effect. Example: ?the projection angle is validated?. In this case, there is an action word ?validate?. According to Rule 7-1, ?the projection angle is validated? is identified as an event. Besides, there is no explicit actor mentioned in the statement. According to Rule 7-12, this event is identified as an effect. The suggested workflow for using the A-CEG Construction Rules is shown in Figure 7-6. 146 Check the existence of duplicate events and remove them (refer to Rule 7-4) Begin End Select a sentence No Yes Is there any indicator, such as ?IF?, ?ELSE?, ?THEN?, ?AND?, ?OR?, and ?NOT?, in the sentence? (refer to Rule 7-5, 7-6, 7-7, and 7-8) Is there any action word in the sentence? (refer to Rule 7-1) No Yes Have all sentences been analyzed? No Document the potential event(s) Check the existence of high-level events and decompose them into a lowest level (refer to Rule 7-3 ) Check the existence of non-atomic events and decompose them into atomic events ( refer to Rule 7-2) Document the potential events and/or constraints. For each of the identified events, use Rule 7-9, 7-10, and 7-11 to determine if it is a cause or an effect. Document the results Figure 7-6: Suggested Workflow for Using the A-CEG Construction Rules In addition to the A-CEG Construction Rules presented previously, the following guidelines, which were also summarized in our previous research [40] (to be printed), are helpful for CEG construction: 147 ? 
To identify causes, read the specification carefully, underlining words or phrases that describe causes. Any distinct input condition or equivalence class 10 of input conditions should be considered causes. Only functional events in the specification are considered. Each cause is assigned to a unique number. None of the descriptive specifications are considered in identifying causes. ? Effects can be identified by reading the specification carefully and underlining words or phrases that describe effects. Only functional events in the specification are considered. To avoid redundancy, all the descriptive specifications are not considered in identifying effects. Each effect is assigned to a unique identifier. ? The logical relationship between causes and effects can be identified by analyzing the semantic content of the specification linking the causes with the effects. Those keywords like "not", "or", "and" etc. usually act as indicators of logical relationships. Other words having the logical meaning, such as "both", "neither" also need to be paid much attention to. The logical relationships are mainly found in functional specifications. However they could be found in some descriptive specifications. In order to identify all logical relationships between causes and effects, both functional and descriptive specifications need to be analyzed. 10 An equivalence class is a portion of a component's input or output domains for which the component's behavior is assumed to be the same from the component's specification [44]. 148 ? The external constraints among causes can be identified by checking for the occurrence of related causes specified in SRS. The external constraints among effects can be identified by checking for the occurrence of related effects specified in SRS. As with the logical relationships, the external constraints could be specified in both functional specifications and descriptive specifications. In order to identify all external constraints, both functional and descriptive specifications need to be analyzed. The database and the A-CEG Construction Rules set had showed promise of pointing and extracting the correct A-CEG elements when we tentatively applied them to several SRSs. However, there are multiple issues that need to be addressed: ? The database is a small start and only scratched the surface. As expected, the database is extremely limited in the way of having enough indicators to be adequate when covering SRSs that have not been analyzed yet. ? We need to establish a priority system or algorithm to determine which rules are used first and smartly combine the rules in order to get the most accurate result. ? For indicators in the database to be more effective, it would seem desirable to develop a domain-specific list of indicators and their synonyms to strengthen the accuracy of identifying A-CEG elements using our approach. This is especially true for action words used to identify events. For example, the following list of action words is commonly found in most privacy-policies- related functional requirements in the internet security domain: 149 advise aggregate allow collect comply customize disallow discipline disclose ensure improve keep limit notify opt-in opt-out prevent prohibit protect provide recognize remove report require retrieve sell send share store track transmit transfer use 7.3.3 Potential Influencing Factors of A-CEG Construction There were several major problems that were impeding efficient identification of A-CEG elements in this experiment: 1. 
Redundancy: This issue mostly stems from the overviews and general description section, in which author(s) of an SRS reiterate the algorithms/procedure for some functional requirements. While overviews and general description in an SRS are important for the understanding of the reader, they often act as ?trouble-makers? (leading to problems such as redundancy and inconsistence) rather than ?trouble-shooters?. Distinguishing between those and the real functional requirements can be difficult and may be a major obstacle precluding the A-CEG Construction Rules from becoming efficient. This could also be a potential obstacle for automatic extraction of A-CEG elements. 2. Ambiguities: It might sound surprising that CEGA itself is a victim of SRS ambiguities, even when CEGA has a proven powerful ability in 150 detecting SRS ambiguities. However, it is true that SRS ambiguities prevent the A-CEG Construction Rules from becoming efficient. They are also obstacles for automatic extraction of A-CEG elements. 3. Incompleteness/implication. This issue usually arises when functional requirements are implicitly stated. Often the authors of SRSs do not feel the need to explicitly express causes, while the effects are usually stated well. For instance, SRS4 that we had trouble with had a lot of implied causes. In this situation we often had to either imply a cause, or imply a relationship to a previously found cause. In addition, there were quite a few implied logical relationships in the parsed SRS segments. Besides, it was extremely hard to find an explicitly expressed logical relationship when the procedure of an algorithm/function was described in a chronological manner. This was especially true when the descriptions were based on dataflow, where one had to imply the relationship between a found cause(s) and effect(s). The logical relationship was just shown by sequential descriptions, which did not give us a true explicitly stated relationship. 4. Understandability of identified events. This issue is usually related to inappropriate words and/or phrases used for the statements of functional requirements. While we may be able to extract some of those that are not easily understood, we do so with risk of identifying incorrect A-CEG elements. Even when the right A-CEG elements are extracted, it would not have a lot of meaning to the reader of the graph, if he/she did not read the 151 entire SRS. This will significantly reduce the benefits from the CEGA approach. In addition, our observations in this experiment suggest that several variables might have important influence on performing A-CEG construction as well as the use of the A-CEG construction Rules: ? The SRS? writing style. The presentation style of the specifications is a critical factor in the ease of A-CEG construction. A specification written as a logical design document is not suitable for A-CEG construction. The measure of the SRS? writing style is defined in Chapter 8. ? The SRS? application type, which is defined in Chapter 8. ? Complexity of the system under study. ? The size of SRS (in terms of the number of sentences). ? The standard that the SRS follows to organize the content. ? Person-related factors, including: a. Capability of the person performing the task. b. Domain knowledge on the system under study. c. Prior industrial experience or other job related experiences in writing requirements. d. Educational background, such as majors of university degrees, and level of education (BS, MS, Ph.D.). 
152 Especially, it is important to understand the variations among the individuals that make them more or less effective in A-CEG construction and identifying the characteristics that make an individual particularly effective. 7.3.4 Suggestions for Writing an SRS Natural language's extensive vocabulary and commonly understood syntax facilitate communication and make it an inviting choice to express requirements. The informality of the language also makes it relatively easy to specify high-level general requirements when precise details are not yet known. However, because of differences among formal, colloquial, and popular definitions of words and phrases and the effort required to produce detailed information, these same attributes also contribute to documentation problems. The use of natural language to prescribe complex, dynamic systems has at least three common and severe problems: ambiguity, inaccuracy, and inconsistency. These problems not only add difficulties in identifying A-CEG elements, but also might lead to increased risks of unreliable software products. Fortunately, these defects can be prevented through a more disciplined and consistent approach to document design, formulation of specification statements, and selection of key words and phrases. Poorly structured specification statements result in confusing requirements that are prone to incorrect interpretations. Start rewriting the specification by getting rid of ambiguous terms, words, and phrases and expressing it all as a long list of ?IF . . . THEN. . .? statements. The main point of translating the specification into unambiguous English that uses ?IF?, ?THEN?, ?AND?, ?OR?, and ?NOT? is that this form is less likely to be misinterpreted. 153 The use of imprecise terms usually indicates that the specifications author was either lazy, incompetent, or did not have sufficient time to determine the exact requirements. Some writers seem to be afraid that their audience will be bored or will think them lazy if they use simple words and repeat themselves. When writing documents or software, being too fancy complicates things and make the resulting products hard to understand. Specifications could be further strengthened through a better selection of words and phrases. The precise meaning of many words and phrases depends entirely on the context in which they are used. Attention must be given to the role of each word and phrase when formulating specification statements. Words and phrases that are carelessly selected or carelessly placed produce statements that are ambiguous and imprecise. The simplest word that is appropriate to its intended purpose in the specification is the one to use. In particular, we have the following suggestions for writing an SRS: ? Use the correct imperative and use it consistently. For instance, the word "shall" prescribes, "will" describes, "must" and "must not" constrain, and "should" suggests. ? Avoid weak phrases such as "as a minimum," "be able to," "capable of," and "not limited to." These phrases are subject to different interpretations and also set the stage for future changes to the requirements. ? Do not use words or terms that give the reader an option regarding the extent to which the requirement is to be satisfied, such as "may," "if required," "as appropriate," or "if practical." 154 ? Avoid using immaterial words or phrases, such as ?independent of?, ?regardless?, ?irrespective?, ?irrelevant?, ?regardless?, ?but not if?, and ?whether or not?. ? 
Do not use generalities when numbers are required, for example, "large," "rapid," "many," "timely," "most," and "close." Avoid imprecise words that have relative meanings such as "easy," "normal," "adequate," or "effective." ? If a specification statement contains three or more punctuation marks, it probably needs to be restructured. 7.4 Summary In this chapter, we focus on exploring the scalability of A-CEG construction. We present an empirical experiment designed to understand how an undergraduate developed A-CEG elements and his exploration process. We briefly describe our goals and the experiment procedure, and give clues about data collection and analysis. We obtain a database of rules and indicators for identification of A-CEG elements and finally develop the A-CEG Construction Rules set on the basis of the database. We conclude with lessons learnt from this experiment and discuss the potential influencing factors on A-CEG construction. Caveats to avoid some common problems found in the practice of specifying SRSs were also discussed. Putting these suggestions into practice not only eases the task of identifying A-CEG elements, but also avoids some common problematic requirements and finally leads to reduced risks of unreliable software products. Even when the A-CEG Construction Rules set is still open for criticism and improvement, we believe that using the indicators together with the A-CEG Construction Rules will maximize the possibility of recognizing a good amount of A- 155 CEG elements. We also feel sure that A-CEG construction using our approach is somewhat automatic. 156 Chapter 8: Validation of the Usability of the A-CEG Construction Rules A methodology must be usable for people other than its developer and must be able to be incorporated into practice for its users. Usability is usually employed for measuring the capability of a methodology to be understood, learned, used, and attractive to the user, or the effort needed for use, when used under specified conditions. We wanted to explore the usability of the A-CEG Construction Rules. Moreover, we were very interested in investigating whether the rules set succeeds in its goals of providing the same or improved benefits for A-CEG construction, with what cost, and under what circumstances it makes the most sense. A controlled experiment seemed the ideal way to provide empirical evidence for our research interest. During the Spring 2007 Semester, we conducted a small-scale controlled experiment (called Experiment D) at University of Maryland with the intention 1) To compare and hence evaluate how well the A-CEG Construction Rules set performs in comparison to other A-CEG construction methods. Especially, we wanted to compare the A-CEG Construction Rules set to the widely used general A-CEG construction guidelines. 2) To formulate hypotheses about the relationships between other factors (such as SRS? writing style and application type) and the effectiveness/efficiency of an A-CEG constructor in identifying A-CEG elements (including the causes, effects, logical relationships, and constraints). 157 This chapter provides the pertinent information about the experiment. It describes the definitions, research questions, hypotheses, the variables measured, the experimental design (including the subjects, experiment materials, and procedures), the data collection process, threats to validity, and finally a detailed discussion of the experimental results. 
8.1 Definitions

The following definitions are used throughout the remainder of this chapter.

Definition 8-1: Usability
Usability is a term used to denote the ease with which people can employ a technique in order to achieve a particular goal. According to Shneiderman [99], usability mainly concerns learnability, effectiveness, efficiency, and user satisfaction.

Definition 8-2: Learnability
Learnability is a measure of how rapidly a new user can start using the technique, and also of how easily an infrequent user can re-learn the technique after periods of not using it. Learning time is the typical measure of learnability. In this study, we measured the learning time by adding the time spent in training sessions, the time spent in help sessions, and the time spent on study materials. The time spent on study materials was recorded from the log-sheets of the students.

Definition 8-3: Effectiveness
Effectiveness is the measure of how easily a user can achieve basic tasks according to specific goals. Effectiveness is a measure of strategic performance: the ability to create an intended outcome. In this study, the effectiveness of a subject in identifying A-CEG elements is evaluated by three indicators: accuracy, recall, and precision. All three indicators are calculated from the confusion matrix, a typical result-counting technique. A confusion matrix contains information about the actual and predicted classifications made by a subject, as shown in Figure 8-1.

                           Predicted positive ("yes")   Predicted negative ("no")
Actual positive ("yes")    True Positive (TP)           False Negative (FN)
Actual negative ("no")     False Positive (FP)          True Negative (TN)

Figure 8-1: Confusion Matrix

True Positives (TPs), True Negatives (TNs), False Positives (FPs), and False Negatives (FNs) are the four possible outcomes of a single prediction for a two-class case with classes "positive" ("yes") and "negative" ("no"). A false positive, FP, occurs when the outcome is incorrectly classified as "positive" (or "yes") when it is in fact "negative" (or "no"). A false negative, FN, occurs when the outcome is incorrectly classified as negative when it is in fact positive. True positives and true negatives are obviously correct classifications.

Definition 8-4: Accuracy
Accuracy is the proportion of the total number of predictions that were correct. It is determined using the equation below:

Accuracy = \frac{TPs + TNs}{TPs + FPs + TNs + FNs}    (Eq. 8-1)

In this study, accuracy is the primary measure of effectiveness. However, the accuracy determined using (Eq. 8-1) may not be an adequate effectiveness measure when the number of negative cases is much greater than the number of positive cases. Suppose there are 1000 cases, 995 of which are negative and 5 of which are positive. If the system classifies them all as negative, the accuracy is 99.5%, even though the classifier missed all positive cases. To account for this, recall and precision are included as subsidiary measures of effectiveness.

Definition 8-5: Recall
Recall, or the true positive rate, is the proportion of positive cases that were correctly identified, as calculated using the equation below:

Recall = \frac{TPs}{TPs + FNs}    (Eq. 8-2)

Definition 8-6: Precision
Precision is the proportion of the predicted positive cases that were correct, as calculated using the equation below:

Precision = \frac{TPs}{TPs + FPs}    (Eq. 8-3)

Definition 8-7: Efficiency
Efficiency is a measure of how fast a user can achieve goals. It is an operationally oriented measure of productivity. The efficiency is determined using the equation below:

Efficiency = \frac{TPs + TNs}{Effort}    (Eq. 8-4)

The time to find the A-CEG elements was considered as the measure of effort. This value was directly available from the log-sheets collected in the experiment.
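For concreteness, the following minimal Python sketch (not part of the original experiment tooling; the function name and the per-hour conversion are illustrative assumptions) shows how the measures in Definitions 8-4 through 8-7 can be computed from raw confusion-matrix counts.

```python
# Minimal sketch: effectiveness and efficiency measures of Definitions 8-4
# to 8-7, computed from raw confusion-matrix counts. Names are illustrative;
# effort is assumed to be logged in minutes and efficiency reported per hour.

def effectiveness_measures(tp, tn, fp, fn, effort_minutes):
    """Return accuracy, recall, precision (Eq. 8-1 to 8-3) and efficiency
    in correctly identified elements per hour (Eq. 8-4)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total                      # Eq. 8-1
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # Eq. 8-2
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # Eq. 8-3
    efficiency = (tp + tn) / (effort_minutes / 60.0)  # Eq. 8-4, per hour
    return accuracy, recall, precision, efficiency

# The accuracy caveat above: 995 negatives and 5 positives, everything
# predicted negative, gives accuracy 0.995 but recall 0.0.
print(effectiveness_measures(tp=0, tn=995, fp=0, fn=5, effort_minutes=60))

# A record in the style of the experiment data (TP=37, TN=0, FP=0, FN=7,
# 84.5 minutes) yields accuracy and recall of about 0.84, precision 1.0,
# and an efficiency of roughly 26 elements per hour.
print(effectiveness_measures(tp=37, tn=0, fp=0, fn=7, effort_minutes=84.5))
```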
Definition 8-8: User Satisfaction
User satisfaction, also called user appeal or subjective satisfaction, is the degree to which users like the method. This is a more "subjective" factor, which refers to the attitude, perceptions, and feelings that a user experiences when interacting with a technique. In this study, we divide user satisfaction into five categories, with Category 1 being the most satisfactory and Category 5 the least satisfactory.

Definition 8-9: SRS Writing Style
Stylometry quantifies aspects of writing style. This study adopts Labbe's Relative Inter-textual Distance [100] as the SRS stylometric characterization, since its accuracy in text classification has been confirmed in applications [101]. The relative inter-textual distance measures the degree of proximity between texts. The relative inter-textual distance between texts A and B is:

d_{(A,B)} = \frac{\sum_{i} \left| F_{iA} - E_{iA} \right|}{\sum_{i \in A} F_{iA} + \sum_{i \in B} E_{iA}}    (Eq. 8-5)

where
d_{(A,B)} = the relative inter-textual distance between texts A and B,
F_{iA} = the absolute frequency of type i in text A,
E_{iA} = the expected frequency of type i in text A given its frequency in text B, E_{iA} = F_{iB} \cdot \frac{N_A}{N_B},
F_{iB} = the absolute frequency of type i in text B,
N_A = the size of text A, in number of tokens,
N_B = the size of text B, in number of tokens (N_B > N_A).

The values of the relative distance vary between 0 and 1. To determine an appropriate threshold relative inter-textual distance for classifying an SRS' writing style into two levels, we conducted a preliminary study by taking the following steps:

1) Collect a set of publicly available SRSs that follow IEEE Std. 830-1998 [53]. Forty-eight SRSs were selected in this study, including the eight SRSs that were used in our previous research. These eight SRSs were written by a Ph.D. graduate student using a consistent writing style, i.e., consistently using "IF", "THEN", "AND", "OR", and "NOT". These eight SRSs were judged to be easy to understand.
2) Calculate the relative inter-textual distance among the eight SRSs. The LOCAT SRS (one of the eight SRSs) is used as the benchmark SRS (text A in (Eq. 8-5)) because of its size (in terms of the number of tokens): the LOCAT SRS is the smallest among these eight SRSs, as required by (Eq. 8-5). The average value of the relative inter-textual distance among the eight SRSs is 0.32.
3) Calculate the relative inter-textual distance, d(LOCAT,B), between each of the remaining 40 SRSs and the LOCAT SRS.
4) Judge whether each SRS is easy to understand or not.
5) We found that an SRS is easy to understand whenever d(LOCAT,B) is less than 0.40, and that an SRS differs from the eight SRSs in terms of writing style and is difficult to understand whenever d(LOCAT,B) is greater than 0.40.

Therefore, we decided to use 0.40 as the threshold relative inter-textual distance to classify an SRS' writing style into two levels: Style I (d(LOCAT,B) ≤ 0.40) and Style II (d(LOCAT,B) > 0.40).
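To make (Eq. 8-5) and the Style I/Style II classification more concrete, the following minimal Python sketch is offered as an illustration only: it assumes naive whitespace tokenization and lower-casing, whereas Labbe's method is normally applied to carefully delimited word types, and the summation ranges follow the reading of (Eq. 8-5) given above. The function names are hypothetical.

```python
# Minimal sketch of the relative inter-textual distance (Eq. 8-5) and the
# threshold-based writing-style classification. Tokenization here is naive
# whitespace splitting, which is an assumption of this sketch, not part of
# Labbe's method as used in the study.
from collections import Counter

def relative_intertextual_distance(text_a, text_b):
    """d(A,B), with A taken as the smaller text as required by (Eq. 8-5)."""
    tokens_a, tokens_b = text_a.lower().split(), text_b.lower().split()
    if len(tokens_a) > len(tokens_b):          # ensure N_B > N_A
        tokens_a, tokens_b = tokens_b, tokens_a
    f_a, f_b = Counter(tokens_a), Counter(tokens_b)
    n_a, n_b = len(tokens_a), len(tokens_b)
    scale = n_a / n_b                          # E_iA = F_iB * N_A / N_B
    types = set(f_a) | set(f_b)
    numerator = sum(abs(f_a[t] - f_b[t] * scale) for t in types)
    denominator = n_a + sum(f_b[t] * scale for t in f_b)
    return numerator / denominator

def classify_writing_style(srs_text, benchmark_text, threshold=0.40):
    """Style I if d(benchmark, SRS) <= threshold, otherwise Style II."""
    d = relative_intertextual_distance(benchmark_text, srs_text)
    return "Style I" if d <= threshold else "Style II"
```

Under this reading, the denominator reduces to twice the size of the smaller text, which keeps the distance within the interval from 0 to 1 noted above.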
The results of the preliminary study for determining the threshold relative inter-textual distance are summarized in Table 8-1.

Table 8-1: Results of the Preliminary Study for Determining the Threshold Relative Inter-textual Distance

SRS index   d(LOCAT,B)   Easy to Understand?   Writing Style
SRS-1       0.372        Yes                   Style I
SRS-2       0.351        Yes                   Style I
SRS-3       0.661        No                    Style II
SRS-4       0.780        No                    Style II
SRS-5       0.557        No                    Style II
SRS-6       0.721        No                    Style II
SRS-7       0.368        Yes                   Style I
SRS-8       0.337        Yes                   Style I
SRS-9       0.598        No                    Style II
SRS-10      0.278        Yes                   Style I
SRS-11      0.662        No                    Style II
SRS-12      0.713        No                    Style II
SRS-13      0.311        Yes                   Style I
SRS-14      0.377        Yes                   Style I
SRS-15      0.307        Yes                   Style I
SRS-16      0.251        Yes                   Style I
SRS-17      0.741        No                    Style II
SRS-18      0.791        No                    Style II
SRS-19      0.634        No                    Style II
SRS-20      0.655        No                    Style II
SRS-21      0.345        Yes                   Style I
SRS-22      0.487        No                    Style II
SRS-23      0.596        No                    Style II
SRS-24      0.698        No                    Style II
SRS-25      0.731        No                    Style II
SRS-26      0.371        Yes                   Style I
SRS-27      0.512        No                    Style II
SRS-28      0.371        Yes                   Style I
SRS-29      0.337        Yes                   Style I
SRS-30      0.301        Yes                   Style I
SRS-31      0.476        No                    Style II
SRS-32      0.381        Yes                   Style I
SRS-33      0.551        No                    Style II
SRS-34      0.671        No                    Style II
SRS-35      0.708        No                    Style II
SRS-36      0.805        No                    Style II
SRS-37      0.357        Yes                   Style I
SRS-38      0.351        Yes                   Style I
SRS-39      0.366        Yes                   Style I
SRS-40      0.578        No                    Style II

It should be noted that the process by which the threshold relative inter-textual distance was determined is rather subjective. Further research is required to validate the use of Labbe's Relative Inter-textual Distance as the SRS stylometric characterization.

Definition 8-10: Application Type
Software projects were classified by Jones [102] into six application types. The definitions are given below:

Commercial software: Applications that are produced for large-scale marketing to hundreds or even millions of clients. Examples of commercial software are Microsoft Word, Excel, etc.
End-user software: Applications written by individuals who are neither professional programmers nor software engineers.
Management information system (MIS): Applications that enterprises produce in support of their business and administrative operations, such as payroll systems, accounting systems, front- and back-office banking systems, insurance claims handling systems, airline reservation systems, and the like.
Military software: Software produced for a uniformed military service and constrained to follow the standards laid down for this purpose.
Outsourced and contract software: Software produced under a blanket contract by which a software development organization agrees to produce all, or specific categories, of software for the client organization. Contract software is a specific software project that is built under contract for a client organization.
System software (SYSTEM): Software that controls physical devices, including the operating systems that control computer hardware, network switching systems, automobile fuel injection systems, and other control systems.

8.2 Research Questions and Hypotheses

This study had a primary research question and a secondary research question:
1. Are persons who use the A-CEG Construction Rules more effective or more efficient in A-CEG construction than persons using the general A-CEG construction guidelines? (Primary research question)
2. Do other factors (the SRS' writing style and the application type) impact the effectiveness of an inspector? (Secondary research question)

To our knowledge, these questions have not been previously investigated, and therefore there are no past findings to be used as hypotheses to be confirmed or rejected. To investigate these two questions, a more detailed set of six hypotheses was defined.
For each hypothesis, the null hypothesis (HX0) is presented, followed by the alternative hypothesis (HXa). H10 There is no difference in effectiveness between the subjects applying the A-CEG Construction Rules and the subjects using the general A-CEG construction guideline. H1a The subjects applying the A-CEG Construction Rules significantly outperform the subjects using the general A-CEG construction guidelines in terms of effectiveness. H20 There is no difference in efficiency between the subjects applying the A- CEG Construction Rules and the subjects using the general A-CEG construction guideline. H2a The subjects applying the A-CEG Construction Rules significantly outperform the subjects using the general A-CEG construction guidelines in terms of efficiency. H30 SRS? writing style does not affect subjects? effectiveness in identifying A- CEG elements. H3a SRS? writing style significantly affects subjects? effectiveness in identifying A-CEG elements. More specifically, Style I is better than Style II in terms of effectiveness. H40 SRS? writing style does not affect subjects? efficiency in identifying A- CEG elements. 166 H4a SRS? writing style significantly affects subjects? efficiency in identifying A-CEG elements. More specifically, Style I is better than Style II in terms of efficiency. H50 The application type does not affect subjects? effectiveness in identifying A-CEG elements. H5a The application type significantly affects subjects? effectiveness in identifying A-CEG elements. More specifically, SRSs of type SYSTEM can be handled more effectively than SRSs of type MIS. H60 The application type does not affect subjects? efficiency in identifying A- CEG elements. H6a The application type significantly affects subjects? efficiency in identifying A-CEG elements. More specifically, SRSs of type SYSTEM can be handled more efficiently than SRSs of type MIS. 8.3 Variables Three types of variables were defined for the experiment, independent, controlled, and dependent variables. Because we did not have enough subjects, we focused our usability test on effectiveness and efficiency, and restricted the target SRSs to be of either SYSTEM or MIS. Moreover, we adopted the Randomized Block Designs techniques [103] to eliminate the experimental error due to nuisance factors. For randomized block designs, there is one factor that is of primary interest and several other nuisance factors that may affect the measured result, but are not of primary interest. The primary factor for this experiment was the A-CEG construction method used. The nuisance factors were the SRS? writing style and the SRS? application type. 8.3.1 Independent Variables Experiment D manipulated three independent variables: 167 o the A-CEG construction method used. The experiment groups used either the general A-CEG construction guidelines (excerpted from [45 pp. 65-88], for Group I), or the A-CEG Construction Rules (for Group II). o the SRS? writing style, with two values: Style I ( ( ), 0.40LOCAT Bd ? ) and Style II ( ( ), 0.40LOCAT Bd > ). o the SRS? application type, with two values: MIS and SYSTEM. 8.3.2 Controlled Variables The controlled variables were o Standard that an SRS complies with: one level (all preselected SRSs follow IEEE Std. 830-1998 [53]). o Size of SRSs (SRS? size is measured in terms of number of sentences): one level (small) o Educational Background (field in which a subject?s advanced degrees were awarded): one level (all subjects? educational background is non computer- related). 
o Industrial experience: one level (all subjects have no industrial experience in software development)

8.3.3 Dependent Variables

Experiment D measured the following dependent variables:
o Time to learn, measured in minutes (T1)
o Time spent on finding A-CEG elements, measured in minutes (T2)
o Number of true positives (TPs)
o Number of true negatives (TNs)
o Number of false positives (FPs)
o Number of false negatives (FNs)
o Accuracy, measured as Accuracy = \frac{TPs + TNs}{TPs + FPs + TNs + FNs}
o Recall, measured as Recall = \frac{TPs}{TPs + FNs}
o Precision, measured as Precision = \frac{TPs}{TPs + FPs}
o Efficiency, measured as Efficiency = \frac{TPs + TNs}{T_2}

Among these dependent variables, the first six were direct measures. The last four were indirect measures and were calculated from the direct measures.

8.4 Subjects

The subjects were four Ph.D. students of a graduate-level course on Software Quality Analysis at the University of Maryland. The experiment was performed as a 13-week term class project mandatory for the course, ensuring the necessary motivation. The experiment served the educational objective of teaching students a black-box test case design technique, as required by the course curriculum. The subjects were neither notified about the experiment nor told what the experimental variables were, to ensure that they would not be influenced by knowledge of the experiment. Preventive steps were taken to ensure that the students had no unwanted communications during the course.

8.5 Experiment Materials

The experiment materials were:
- the IEEE standard IEEE Std. 830-1998 [53]
- eighteen SRS segments adapted from the functional requirements sections of ten preselected SRSs
- general experiment instructions
- the general A-CEG construction guidelines used in industry (excerpted from [45 pp. 65-88], for Group I only)
- the A-CEG Construction Rules (for Group II only)
- CEG report forms and time log-sheets (shown in Appendix D)
- a questionnaire to assess subjects' background (shown in Appendix E)
- a questionnaire to assess the ease of use of the A-CEG Construction Rules (shown in Appendix F)

All preselected SRSs were written in plain English natural-language text and adhered to the IEEE specification standard IEEE Std. 830-1998 [53]. The SRSs were analyzed for defects prior to the experiment by an independent inspector (the author). This was necessary because the requirements were assumed to be correct for the purpose of the experiment. These SRSs were the requirements documents for the following systems:

1. CHAIRMAN Conference Management System (CCMS) [104]: a web application that supports every aspect of the conference organization process, including paper submission, reviewer assignment, revised and camera-ready paper submission, and registration handling for conference participants.
2. Invisible Meeting Scheduler (IMS) [105]: a software application to assist in the scheduling of meetings among individuals whose schedules are available in an online calendar.
3. LOCAT [63]: a simple real-time projectile tracking system for the Army's all-weather Doppler radar system called TRAC. The software is part of a host software subsystem called COMP running on a Sparc 4 system at 0.08 MIPS.
4. PACS [88]: a personal access control system.
5. Student Registry Query System (SRQS) [106]: an application designed for students to create and manage their accounts online.
Registry DB is a database that maintains the student SSN, student login ID, student password, course information, and registration information.
6. Search PUBS (SSP) [107]: an application designed for generating queries in order to interact with the PUBS database. PUBS is a database of author information, with fields for last name, first name, publications, and the city to which the authors belong.
7. SXXX [40]: a part of a digital protection system used in nuclear power plants.
8. Tellerfast [108]: a software package operating as part of the Automated Teller Machine (ATM) system described in the system requirements specifications of the Bank of HESUS. This software product provides the control necessary for the ATM system to perform its activities.
9. The Energy Management System (THEMAS) [109]: an energy management system that operates independently of any other system and of any components of the heating and cooling system to which it is attached.
10. Word Processor Unit (WPU) [110]: an application designed to perform word processing using functions such as adding text, deleting text, and word and character counting based on the user input.

Each SRS segment used in Experiment D contained a complete set of sections for a functional module, including an "Input Section", a "Processing Section", and an "Output Section". Table 8-2 presents some basic information on the SRS segments used in the experiment.

Table 8-2: Basic Information on SRS Segments Used in Experiment D

Index of SRS Segment   SRS          Writing Style   Application Type   Experiment Phase in which the SRS segment was used
S-Training1            LOCAT        Style I         SYSTEM             Phase I (Preparation and Training)
S-Training2            Tellerfast   Style II        MIS                Phase I (Preparation and Training)
S1, S2                 SRQS         Style I         MIS                Phase II (Implementation)
S3, S4                 SSP          Style I         MIS                Phase II (Implementation)
S5, S6                 PACS         Style I         SYSTEM             Phase II (Implementation)
S7, S8                 WPU          Style I         SYSTEM             Phase II (Implementation)
S9, S10                CCMS         Style II        MIS                Phase II (Implementation)
S11, S12               IMS          Style II        MIS                Phase II (Implementation)
S13, S14               SXXX         Style II        SYSTEM             Phase II (Implementation)
S15, S16               THEMAS       Style II        SYSTEM             Phase II (Implementation)

In addition, the SRS segments used in Phase II (S1 to S16) were thoroughly examined and a list of causes, effects, constraints, and logical relationships was produced. This list was prepared by an individual (the author) who was very familiar with the applications and the CEGA techniques. The aggregate numbers for this list are shown in Table 8-3.

Table 8-3: Data on SRS Segments Used in Experiment Phase II

Index of SRS segment   Number of pages   Number of sentences   Causes   Effects   Constraints   Logical relationships   Total number of A-CEG elements
S1                     3                 35                    11       8         6             19                      44
S2                     3                 34                    8        9         5             18                      40
S3                     3                 33                    8        9         4             19                      40
S4                     3                 32                    7        11        5             22                      45
S5                     3                 37                    9        7         6             17                      39
S6                     3                 31                    8        9         5             20                      42
S7                     3                 34                    8        10        5             21                      44
S8                     3                 36                    10       6         5             18                      39
S9                     3                 32                    11       8         6             19                      44
S10                    3                 34                    9        9         5             18                      41
S11                    3                 35                    10       9         5             20                      44
S12                    3                 32                    8        11        5             20                      44
S13                    3                 36                    10       7         6             16                      39
S14                    3                 35                    9        9         4             20                      42
S15                    3                 34                    10       8         6             17                      41
S16                    3                 36                    8        11        3             23                      45

Note: the number of logical relationships was counted in terms of the four basic logical relationships: "IDENTITY", "AND", "OR", and "NOT".

8.6 Procedure

The experiment consists of two phases: Training and Preparation (Phase I) and Implementation (Phase II). During Phase I, all subjects were prepared with a set of training lectures on the IEEE standard for SRSs and on A-CEG construction. Apart from the theory presentations, the sessions consisted of an in-class questionnaire, an in-class quiz, and practical demonstrations of the techniques. Care was taken to avoid any biases that were suspected to be present.
During Phase II, all subjects were given four sets of assignment packages to complete the experimental tasks. 8.6.1 Training and Preparation (Phase I) The first phase of the experiment lasted three weeks. The subjects were first given a questionnaire with ten questions to appraise their knowledge on CEGA, requirements analysis, and industry experience. The questionnaire showed that students had same type of backgrounds. Therefore, it was not necessary to take effort to mitigate the effect of the background factor from the experiment. We then gave a 1.5-hour lecture on the IEEE standard for SRS and taught A-CEG construction. A sample SRS (S-Training1 in Table 8-2) was presented and an assignment was given for finding A-CEG elements. The results were discussed in class and a list of known A-CEG elements was written out according to the schema of A-CEG elements report forms. We then introduced a new SRS (S-Training2 in Table 8-2). As an in-class quiz, students were asked to individually read the SRS and record 174 A-CEG elements on the CEG report forms to be used in this experiment. Subjects took the quiz in a classroom with enough space to avoid plagiarism. After the quiz, we discussed with students the list of known A-CEG elements and what A-CEG elements that they had actually found. The subjects were then ordered by expected performance and randomly assigned to the two groups (Group I and Group II) in such a manner that one out of any two subjects with similar expected performance would be assigned to each group. This step was taken to avoid bias. Since no better information was available, we used the scores from the quiz assignments for estimating the subjects? expected performance. We do not claim that this arrangement provides perfect matches, but other studies found that this usually results in groups with reasonably balanced average subject ability. Another two lectures were given to Group I and Group II, separately. Subjects in Group I were instructed how to use the general A-CEG construction guidelines while subjects in Group II were instructed how to use the A-CEG Construction Rules. Each lecture lasted 2-hour long. Finally, all subjects were given a lecture on the whole process of Phase II experiment, explaining the goals and the specific process to be used in the experiment. In addition, the students were instructed to work independently and record every event. The students were also given log-sheets and were demonstrated how to use them. The experiment details were recorded on log-sheets. The students recorded the time taken, nature and the possible cause of any events in the log-sheets. There were extra credits for using log-sheets which provided them the necessary motivation. 175 While designing the log-sheet, it was ensured that it is very easy to fill in and that it is not ambiguous. This ensured that the extra burden on the subjects because of the log- sheets was minimal. Questions were encouraged during the class but no interactions were allowed among students outside the class. Students were strictly instructed to avoid outside- class communications. We were always present to answer questions and preventing unwanted communication. All questions to the instructor, outside the class were through help sessions. Events in lecture and the help sessions were recorded for the learning time measure. 8.6.2 Running the Experiment (Phase II) After the last lecture, all students were assigned four assignment packages. This part of the experiment lasted 8 weeks/2 months. 
Each assignment package contained: 1. instructions for the assigned task 2. an SRS segment 3. either the general A-CEG construction guidelines (for Group I) or the A-CEG Construction Rules set (for Group II) 4. blank CEG report forms 5. a log-sheet for A-CEG elements finding time The instructions for the students were: 1. The assignment is due in two weeks. 2. No communication with other students in regard to the assignment is allowed. 3. The textual requirements are assumed to be correct. 176 4. Read through all documents briefly before starting to work. 5. The main task is to identify and record A-CEG elements (including causes, effects, logical relationships, and constraints) in the assigned SRS segment. 6. Log all clock times about the activities. 7. When finished, verify that the logged data seem to be correct and hand them in. The four assignment packages were assigned one after another rather than being assigned all at once. The next package was assigned when the previous package was submitted. After each student turned in the assignments and log-sheets, the data was briefly examined for errors and missing information in the record in order to get as accurate data as possible. Table 8-4 shows the assignment information on the SRS segments. 177 Table 8-4: Assignments of SRS Segments SRS Segment Index SRS Writing Style Type of Application Student Assigned Group Assigned S1 SRQS Style I MIS Student A Group I S2 Student B Group II S3 SSP Style I MIS Student C Group I S4 Student D Group II S5 PACS Style I SYSTEM Student A Group I S6 Student B Group II S7 LOCAT Style I SYSTEM Student C Group I S8 Student D Group II S9 CCMS Style II MIS Student A Group I S10 Student B Group II S11 IMS Style II MIS Student C Group I S12 Student D Group II S13 SXXX Style II SYSTEM Student A Group I S14 Student B Group II S15 THEMAS Style II SYSTEM Student C Group I S16 Student D Group II A postmortem questionnaire (see Appendix F) with nine questions was sent to the students at the end of the experiment to assess the subjective measures of usability, satisfaction and ease, and to help us understand the opinion of the participants toward the A-CEG technique. The entire design for Experiment D is provided in Table 8-5. 178 Table 8-5: Entire Design of Experiment D Group Activity I Q1 T1 Q2 M1 R T2-1 M2-1 S Q3 M3 II T2-2 M2-2 Where Q1 : Delivering and collecting a questionnaire to distinguish subjects? background of knowledge and industry experience. T1 : Training all subjects on IEEE standard IEEE Std. 830-1998 for SRS and A-CEG construction (1.5 hours) Q2 : Assigning a quiz to distinguish student?s expected performance of A-CEG construction (0.5 hours) M1 : Measuring the students? expected performance. R : Randomization according to subjects? expected performance T2-1 : Training Group I on using the general A-CEG construction guidelines (2 hours) T2-2 : Training Group II on using the A-CEG Construction Rules (2 hours) M2-1 : Measuring time that Group I students need to master the guidelines. M2-2 : Measuring time that Group II students need to master the A-CEG Construction Rules. S : SRS assignments (12 weeks) Q3 : Delivering and collecting a postmortem questionnaire asking for subjective judgments on the usability, satisfaction and ease of the A- CEG construction method. (0.5 hours) M3 : Measuring subjects? performance and analyzing results 179 8.7 Experiment Results and Discussion Table 8-6 presents the experiment data used in statistical tests. 
Table 8-6: Experiment Data Used for Hypotheses Testing

SRS Segment   A-CEG Construction   Writing    Type of       TP   TN   FP   FN   T2, minutes
Index         Method               Style      Application
S1            Group I              Style I    MIS           37   0    0    7    84.5
S2            Group II             Style I    MIS           34   0    2    6    75.5
S3            Group I              Style I    MIS           33   0    1    7    83
S4            Group II             Style I    MIS           40   0    1    5    76
S5            Group I              Style I    SYSTEM        32   0    1    7    87
S6            Group II             Style I    SYSTEM        38   0    2    4    77.5
S7            Group I              Style I    SYSTEM        36   0    2    8    85
S8            Group II             Style I    SYSTEM        33   0    1    6    81.5
S9            Group I              Style II   MIS           33   0    3    11   84.5
S10           Group II             Style II   MIS           34   0    2    7    85
S11           Group I              Style II   MIS           32   0    2    12   87.5
S12           Group II             Style II   MIS           36   0    3    8    91.5
S13           Group I              Style II   SYSTEM        28   0    3    11   95
S14           Group II             Style II   SYSTEM        35   0    2    7    89.5
S15           Group I              Style II   SYSTEM        28   0    2    13   85
S16           Group II             Style II   SYSTEM        32   0    3    13   90

8.7.1 Statistical Analysis

In this section, we focus on descriptive analysis and statistical tests of the proposed hypotheses. All hypotheses are analyzed by taking the following steps:

Step 1: Calculating the descriptive statistics. These data are then displayed using a box plot for the comparison of the commonalities and differences between the two populations. A box plot (also known as a box-and-whisker diagram) is a convenient way of graphically depicting groups of numerical data through their five-number summaries: minimum, 25% quartile, median, 75% quartile, and maximum [111].
Step 2: Performing a one-tailed F-test (two-sample test for variances). The F-test is used to test for differences between sample variances [111]. This test can be a two-tailed test or a one-tailed test. The two-tailed version tests against the alternative hypothesis that the standard deviations are not equal. The one-tailed version tests only whether the standard deviation of the first population is either greater than or less than (but not both) the standard deviation of the second population.
Step 3: Performing a two-sample Student's t-test. Student's t-test is one of the most commonly used techniques for testing a hypothesis on the basis of a difference between sample means for small samples, usually fewer than thirty [111]. It is applied when the population is assumed to be normally distributed but the sample sizes are small enough that the statistic on which inference is based is not normally distributed, because it relies on an uncertain estimate of the standard deviation rather than on a precisely known value [111].
Step 4: Performing the Mann-Whitney U test (also called the Wilcoxon rank-sum test, or the Wilcoxon-Mann-Whitney test). The Mann-Whitney U test is a non-parametric alternative to the two-sample Student's t-test when the population cannot be assumed to be normally distributed [111]. In Experiment D, we use the Mann-Whitney U test as a subsidiary to the Student's t-test.
Step 5: Drawing a conclusion about the hypothesis test by either accepting the null hypothesis or rejecting the null hypothesis in favor of the alternative hypothesis. The significance level for rejecting the null hypotheses is 0.05 for all tests.
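As a concrete illustration of Steps 2 through 4, the sketch below applies the one-tailed F-test, the two-sample Student's t-test, and the Mann-Whitney U test with SciPy. It is illustrative only: the analysis reported in this chapter was performed with SPSS Statistics [112], and the two samples shown here are hypothetical accuracy values rather than the experiment data.

```python
# Minimal sketch of Steps 2-4 above using SciPy; the dissertation's analysis
# was done in SPSS, so this only illustrates the same family of tests.
# The two samples below are hypothetical accuracy values for Group I and II.
import numpy as np
from scipy import stats

group_i = np.array([0.74, 0.68, 0.80, 0.65, 0.77, 0.84, 0.72, 0.75])
group_ii = np.array([0.80, 0.78, 0.83, 0.69, 0.85, 0.87, 0.79, 0.81])

# Step 2: one-tailed F-test for equality of variances.
f_stat = np.var(group_i, ddof=1) / np.var(group_ii, ddof=1)
p_f = stats.f.sf(f_stat, len(group_i) - 1, len(group_ii) - 1)

# Step 3: two-sample Student's t-test (one-tailed: Group I < Group II).
t_stat, p_t = stats.ttest_ind(group_i, group_ii, equal_var=True,
                              alternative='less')

# Step 4: Mann-Whitney U test as a non-parametric alternative.
u_stat, p_u = stats.mannwhitneyu(group_i, group_ii, alternative='less')

# Step 5: reject the null hypothesis when the p-value is below 0.05.
for name, p in [('F-test', p_f), ('t-test', p_t), ('Mann-Whitney U', p_u)]:
    print(f"{name}: p = {p:.4f} -> {'reject H0' if p < 0.05 else 'retain H0'}")
```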
8.7.1.1 Impact of A-CEG Construction Method on Effectiveness (Hypothesis H1)

Table 8-7 presents the descriptive statistics for the impact of the A-CEG construction method (independent variable) on the effectiveness (dependent variable). This independent variable was defined with two levels: using the general A-CEG construction guidelines (Group I) or using the A-CEG Construction Rules set (Group II).

Table 8-7: Descriptive Statistics for the Impact of A-CEG Construction Method on Effectiveness

Method     Dependent Variable   Mean    Std. Dev.   Min    Lower Quart.   Median   Upper Quart.   Max
Group I    Accuracy, %          74.3    7.07        65.1   68.9           74.2     80.1           84.1
Group I    Recall, %            77.29   6.03        68.3   72.5           78.4     82.2           84.1
Group I    Precision, %         94.8    3.20        90     92.8           94.5     97.0           100.0
Group II   Accuracy, %          80.1    5.80        68.8   78.5           80.3     83.5           87.0
Group II   Recall, %            83.5    5.84        71.1   82.6           84.0     86.0           90.5
Group II   Precision, %         94.9    1.89        92     94             94.5     95.5           98

The box plots in Figure 8-2 graphically show the impact of the A-CEG construction method on the effectiveness.

Figure 8-2: Impact of A-CEG Construction Method on Effectiveness (box plots of (a) Accuracy, (b) Recall, and (c) Precision for Group I and Group II)

Table 8-8 presents the results obtained from the F-test, Student's t-test, and Mann-Whitney U test using SPSS Statistics [112]. In this table, "N" represents the number of observations, "df" is short for "degrees of freedom", and "t-Stat" for "t-Statistic".

Table 8-8: Statistical Testing Results for Hypothesis H1 (α = 0.05); independent variable: A-CEG Construction Method

Dependent    N   F-test                         Student's t-test                    Mann-Whitney U test
Variable         df   F      P(F<=f) 1-tailed   df   t-Stat   P(T<=t) 1-tailed      P-value 1-tailed
Accuracy     8   7    1.55   0.289              14   -1.77    0.049                 0.0805
Recall       8   7    1.07   0.468              14   -2.10    0.027                 0.019
Precision    8   7    3.50   0.060              14   -0.16    0.438                 0.323

Note: The value of "P(T<=t) 1-tailed" indicates the false negative rate β, the probability of failing to reject a null hypothesis; 1 - β is the power of a test.

The results of the F-tests (P(F<=f) = 0.289, 0.468, and 0.060) indicate that there is no significant difference in variances between the samples for the accuracy, recall, and precision measures, respectively. The results of both Student's t-test (P(T<=t) = 0.438) and the Mann-Whitney U test (P = 0.323) show that there is no significant effect of the A-CEG construction method on precision at α = 0.05. However, a main effect of the A-CEG construction method on accuracy and recall was observed (P(T<=t) = 0.049 and P = 0.0805 for accuracy; P(T<=t) = 0.027 and P = 0.019 for recall), suggesting that Group II (using the A-CEG Construction Rules set) produced more accurate CEG elements and missed fewer CEG elements than Group I. These results allow H10 to be rejected in favor of H1a.

8.7.1.2 Impact of A-CEG Construction Method on Efficiency (Hypothesis H2)

Table 8-9 follows the presentation style used in Table 8-7, but deals with efficiency instead of effectiveness.

Table 8-9: Descriptive Statistics for the Impact of A-CEG Construction Method on Efficiency

Method     Dependent Variable              Mean   Std. Dev.   Min    Lower Quart.   Median   Upper Quart.   Max
Group I    Efficiency, A-CEG elements/hr   22.6   2.84        17.7   21.4           22.8     24.8           26.3
Group II   Efficiency, A-CEG elements/hr   25.6   3.45        21.3   23.6           24.2     27.6           31.6

The box plot in Figure 8-3 graphically shows the impact of the A-CEG construction method on the efficiency.

Figure 8-3: Impact of A-CEG Construction Method on Efficiency (box plots of efficiency for Group I and Group II)

Table 8-10 presents the results obtained from the F-test, Student's t-test, and Mann-Whitney U test using SPSS Statistics [112]. In this table, "N" represents the number of observations, "df" is short for "degrees of freedom", and "t-Stat" for "t-Statistic".
Table 8-10: Statistical Testing Results for Hypothesis H2 (α = 0.05); independent variable: A-CEG Construction Method

Dependent    N   F-test                         Student's t-test                    Mann-Whitney U test
Variable         df   F      P(F<=f) 1-tailed   df   t-Stat   P(T<=t) 1-tailed      P-value 1-tailed
Efficiency   8   7    0.68   0.309              14   -1.91    0.038                 0.0525

Note: The value of "P(T<=t) 1-tailed" indicates the false negative rate β, the probability of failing to reject a null hypothesis; 1 - β is the power of a test.

The results of the F-test (P(F<=f) = 0.309) indicate that there is no significant difference in variances between the samples for the efficiency measure. The results of both Student's t-test (P(T<=t) = 0.038) and the Mann-Whitney U test (P = 0.0525) show that there is a significant effect of the A-CEG construction method on the efficiency at α = 0.05, suggesting that Group II (using the A-CEG Construction Rules set) were more efficient in identifying CEG elements. These allow H20 to be rejected in favor of H2a.

8.7.1.3 Impact of SRS' Writing Style on Effectiveness (Hypothesis H3)

Table 8-11 presents the descriptive statistics for the impact of the SRS' writing style (independent variable) on the effectiveness (dependent variable). This independent variable was defined with two levels: Style I and Style II.

Table 8-11: Descriptive Statistics for the Impact of SRS' Writing Style on Effectiveness

Writing Style   Dependent Variable   Mean   Std. Dev.   Min    Lower Quart.   Median   Upper Quart.   Max
Style I         Accuracy, %          82.4   3.16        78.3   80.4           81.8     84.7           87.0
Style I         Recall, %            84.9   3.19        81.8   82.4           84.4     86.0           90.5
Style I         Precision, %         96.6   1.93        94.0   94.9           97.1     97.2           100.0
Style II        Accuracy, %          71.9   5.91        65.1   66.7           69.9     77.2           79.5
Style II        Recall, %            75.9   5.94        68.3   71.6           73.9     82.1           83.3
Style II        Precision, %         92.8   1.58        90.3   91.6           92.8     94.2           94.6

The box plots in Figure 8-4 graphically show the impact of the SRS' writing style on the effectiveness.

Figure 8-4: Impact of Writing Style on Effectiveness (box plots of (a) Accuracy, (b) Recall, and (c) Precision for Style I and Style II)

Table 8-12 presents the results obtained from the F-test, Student's t-test, and Mann-Whitney U test using SPSS Statistics [112]. In this table, "N" represents the number of observations, "df" is short for "degrees of freedom", and "t-Stat" for "t-Statistic".

Table 8-12: Statistical Testing Results for Hypothesis H3 (α = 0.05); independent variable: SRS' Writing Style

Dependent    N   F-test                         Student's t-test                    Mann-Whitney U test
Variable         df   F      P(F<=f) 1-tailed   df   t-Stat   P(T<=t) 1-tailed      P-value 1-tailed
Accuracy     8   7    0.28   0.059              14   4.60     0.00021               0.0005
Recall       8   7    0.29   0.060              14   3.81     0.00097               0.0025
Precision    8   7    1.5    0.302              14   4.31     0.00036               0.0005

Note: The value of "P(T<=t) 1-tailed" indicates the false negative rate β, the probability of failing to reject a null hypothesis; 1 - β is the power of a test.

The results of the F-tests (P(F<=f) = 0.059, 0.060, and 0.302) indicate that there is no significant difference in variances between the samples for the accuracy, recall, and precision measures, respectively. The results of both Student's t-test (P(T<=t) = 0.00021, 0.00097, and 0.00036) and the Mann-Whitney U test (P = 0.0005, 0.0025, and 0.005) show that there are strongly significant effects of the SRS' writing style on the accuracy, recall, and precision at α = 0.05, suggesting that A-CEG elements in SRSs of Style I were identified more effectively than those in SRSs of Style II. These allow H30 to be rejected in favor of H3a.
8.7.1.4 Impact of SRS' Writing Style on Efficiency (Hypothesis H4)

Table 8-13 follows the presentation style used in Table 8-11, but deals with efficiency instead of effectiveness.

Table 8-13: Descriptive Statistics for the Impact of SRS' Writing Style on Efficiency

Writing Style   Dependent Variable              Mean    Std. Dev.   Min    Lower Quart.   Median   Upper Quart.   Max
Style I         Efficiency, A-CEG elements/hr   26.25   3.08        22.1   24.2           25.9     27.6           31.6
Style II        Efficiency, A-CEG elements/hr   21.9    2.22        17.7   21.0           22.7     23.5           24.0

The box plot in Figure 8-5 graphically shows the impact of the SRS' writing style on the efficiency.

Figure 8-5: Impact of Writing Style on Efficiency (box plots of efficiency for Style I and Style II)

Table 8-14 presents the results obtained from the F-test, Student's t-test, and Mann-Whitney U test using SPSS Statistics [112]. In this table, "N" represents the number of observations, "df" is short for "degrees of freedom", and "t-Stat" for "t-Statistic".

Table 8-14: Statistical Testing Results for Hypothesis H4 (α = 0.05); independent variable: SRS' Writing Style

Dependent    N   F-test                         Student's t-test                    Mann-Whitney U test
Variable         df   F      P(F<=f) 1-tailed   df   t-Stat   P(T<=t) 1-tailed      P-value 1-tailed
Efficiency   8   7    1.92   0.204              14   3.24     0.00296               0.0015

Note: The value of "P(T<=t) 1-tailed" indicates the false negative rate β, the probability of failing to reject a null hypothesis; 1 - β is the power of a test.

The results of the F-test (P(F<=f) = 0.204) indicate that there is no significant difference in variances between the samples for the efficiency measure. The results of both Student's t-test (P(T<=t) = 0.00296) and the Mann-Whitney U test (P = 0.0015) show that there is a strongly significant effect of the SRS' writing style on the efficiency at α = 0.05, suggesting that A-CEG elements in SRSs of Style I were identified more efficiently than those in SRSs of Style II. These allow H40 to be rejected in favor of H4a.

8.7.1.5 Impact of SRS' Application Type on Effectiveness (Hypothesis H5)

Table 8-15 presents the descriptive statistics for the impact of the SRS' application type (independent variable) on the effectiveness (dependent variable). This independent variable was defined with two levels: SYSTEM and MIS.

Table 8-15: Descriptive Statistics for the Impact of SRS' Application Type on Effectiveness

Application Type   Dependent Variable   Mean    Std. Dev.   Min    Lower Quart.   Median   Upper Quart.   Max
SYSTEM             Accuracy, %          78.4    6.67        66.7   76.9           79.8     81.8           86.4
SYSTEM             Recall, %            81.8    5.87        71.8   80.3           83.1     84.3           90.5
SYSTEM             Precision, %         94.7    2.98        90.3   93.7           94.5     95.5           100.0
MIS                Accuracy, %          76.3    7.32        65.1   70.5           77.45    81             87
MIS                Recall, %            79.0    7.31        68.3   72.3           81.8     83.0           88.9
MIS                Precision, %         95.4    2.06        92.3   93.9           95.9     97.1           97.6

The box plots in Figure 8-6 graphically show the impact of the SRS' application type on the effectiveness.

Figure 8-6: Impact of Application Type on Effectiveness (box plots of (a) Accuracy, (b) Recall, and (c) Precision for SYSTEM and MIS)

Table 8-16 presents the results obtained from the F-test, Student's t-test, and Mann-Whitney U test using SPSS Statistics [112]. In this table, "N" represents the number of observations, "df" is short for "degrees of freedom", and "t-Stat" for "t-Statistic".
Table 8-16: Statistical Testing Results for Hypothesis H5 (α = 0.05); independent variable: SRS' Application Type

Dependent    N   F-test                         Student's t-test                    Mann-Whitney U test
Variable         df   F      P(F<=f) 1-tailed   df   t-Stat   P(T<=t) 1-tailed      P-value 1-tailed
Accuracy     8   7    0.83   0.406              14   0.59     0.281                 0.287
Recall       8   7    0.64   0.288              14   0.87     0.200                 0.140
Precision    8   7    2.09   0.176              14   -0.58    0.574                 0.253

Note: The value of "P(T<=t) 1-tailed" indicates the false negative rate β, the probability of failing to reject a null hypothesis; 1 - β is the power of a test.

The results of the F-tests (P(F<=f) = 0.406, 0.288, and 0.176) indicate that there is no significant difference in variances between the samples for the accuracy, recall, and precision measures, respectively. The results of both Student's t-test (P(T<=t) = 0.281, 0.200, and 0.574) and the Mann-Whitney U test (P = 0.287, 0.140, and 0.253) show that there are no significant effects of the SRS' application type on the accuracy, recall, and precision at α = 0.05. These allow H50 to be accepted.

8.7.1.6 Impact of SRS' Application Type on Efficiency (Hypothesis H6)

Table 8-17 follows the presentation style used in Table 8-15, but deals with efficiency instead of effectiveness.

Table 8-17: Descriptive Statistics for the Impact of SRS' Application Type on Efficiency

Application Type   Dependent Variable              Mean   Std. Dev.   Min    Lower Quart.   Median   Upper Quart.   Max
SYSTEM             Efficiency, A-CEG elements/hr   24.2   3.53        17.7   23.1           23.8     26.5           29.4
MIS                Efficiency, A-CEG elements/hr   24.0   3.57        19.8   21.8           23.8     24.6           31.6

The box plot in Figure 8-7 graphically shows the impact of the SRS' application type on the efficiency.

Figure 8-7: Impact of Application Type on Efficiency (box plots of efficiency for SYSTEM and MIS)

Table 8-18 presents the results obtained from the F-test, Student's t-test, and Mann-Whitney U test using SPSS Statistics [112]. In this table, "N" represents the number of observations, "df" is short for "degrees of freedom", and "t-Stat" for "t-Statistic".

Table 8-18: Statistical Testing Results for Hypothesis H6 (α = 0.05); independent variable: SRS' Application Type

Dependent    N   F-test                         Student's t-test                    Mann-Whitney U test
Variable         df   F      P(F<=f) 1-tailed   df   t-Stat   P(T<=t) 1-tailed      P-value 1-tailed
Efficiency   8   7    0.98   0.487              14   0.11     0.456                 0.399

Note: The value of "P(T<=t) 1-tailed" indicates the false negative rate β, the probability of failing to reject a null hypothesis; 1 - β is the power of a test.

The results of the F-test (P(F<=f) = 0.487) indicate that there is no significant difference in variances between the samples for the efficiency measure. The results of both Student's t-test (P(T<=t) = 0.456) and the Mann-Whitney U test (P = 0.399) show that there is no significant effect of the SRS' application type on the efficiency at α = 0.05. These allow H60 to be accepted.

8.7.2 Summary of Statistical Testing

Table 8-19 provides the summary of the statistical tests. Overall, the statistical testing results indicate two things:
1. The A-CEG Construction Rules are helpful in identifying A-CEG elements. Subjects using the A-CEG Construction Rules committed false positives and false negatives less frequently and identified the true positives more efficiently.
2. The SRS' writing style has a significant impact on the identification of A-CEG elements. SRSs of Style I were handled far more effectively and efficiently than SRSs of Style II.
The most surprising finding is that the application type, which was assumed to be important, was shown to have no statistically significant impact on A-CEG construction. Note that this conclusion was drawn on the basis of the comparison between SYSTEM and MIS; it may not hold for comparisons among other application types.

Table 8-19: Summary of Statistical Tests

Hypothesis   Testing Result   Explanation
H1           Accepted H1a     The subjects applying the A-CEG Construction Rules significantly outperform the subjects using the general A-CEG construction guidelines in terms of effectiveness.
H2           Accepted H2a     The subjects applying the A-CEG Construction Rules significantly outperform the subjects using the general A-CEG construction guidelines in terms of efficiency.
H3           Accepted H3a     SRS' writing style significantly affects subjects' effectiveness in identifying A-CEG elements.
H4           Accepted H4a     SRS' writing style significantly affects subjects' efficiency in identifying A-CEG elements.
H5           Accepted H50     The impact of SRS' application type on subjects' effectiveness in identifying A-CEG elements is relatively small and not statistically significant.
H6           Accepted H60     The impact of SRS' application type on subjects' efficiency in identifying A-CEG elements is relatively small and not statistically significant.

8.7.3 Qualitative Analysis

Table 8-20 presents the experiment data used for qualitative analysis. A major caveat is that the postmortem questionnaire (see Appendix F) measured the subjects' stated opinions rather than their actual ones, and the two could be markedly at odds.

Table 8-20: Experiment Data for Qualitative Analysis

Subject     Group   T1, minutes   Q1 (Usefulness)   Q2 (Ease of Use)   Q3 (Ease of Learning)   Q4 (Satisfaction)   Q5 (In General)
Student A   I       285           1                 4                  3                       4                   3
Student C   I       293           2                 4                  3                       2                   2
Student B   II      302           2                 4                  3                       4                   3
Student D   II      291           2                 2                  1                       2                   2

Note: For Q1, an answer of "1" is the most useful and "5" the least useful; for Q2 and Q3, "1" is the easiest and "5" the most difficult; for Q4, "1" is the most satisfactory and "5" the least satisfactory; for Q5, "1" is the best and "5" the worst.

- Learning Time (T1)
According to Table 8-20, all subjects spent almost the same amount of learning time. This is partly because most of the learning time was spent in in-class training, which was equal by design. The only differences among the subjects were the times spent in help sessions, and these differences between the subjects/groups are small.

- User Satisfaction
There is no significant difference in user satisfaction between the two groups. In general, the subjects of both groups were not very satisfied with either of the A-CEG construction methods. This indicates the need to improve the A-CEG Construction Rules in terms of ease of use and ease of learning.

- Excerpted Comments from Subjects
Student A: "... It is hard to distinguish between causes and effects just based on the SRS. This is maybe due to the fact that there is not enough information in SRS itself, or the SRS itself is vague. ..."
Student B: "... The CEG is useful in the sense that it gives a good picture of the SRS and how it is organized. Moreover, graphical representations are usually a good way to picture how things work. But the difference between cause and effect is still not clear.
I believe the method of filling in tables could be a good way to work with. ?? Student C: ?? I like the fact that there are a low number of different elements and operations for the CEG. This small variety helped me to quickly understand the notation and the basic rules to design a CEG. Moreover, it offers a clean overview of the specifications which is important to better spot defects. On the other hand, I found 196 that the definitions of cause and effect were not properly stated. It took me some time to figure out that cause indicates everything external, related to the user, and effect includes both consequences and actions. Maybe a better definition of effect could help novice users to quickly understand the potential of CEG. ?? Student D: ?? The rules help in understanding how to construct the CEG and how to handle the duplications. However, the use of action words in finding events is not practical because many events were not related to any action words. ?? 8.8 Threats to Validity As with any empirical study, there are various threats to validity that must be discussed. This section explains the major threats to validity in this study. The first threat is the threat of a selection bias in the subject population. The specific subjects who participated in this study could be the major source of the observed result and may not be repeatable by other researchers. This threat was alleviated to some degree by the fact that the participants were selected without any prior information about the composition of the class or participants. In addition, the participants did not receive any compensation for participation in the study. They all participated as a part of their class project and therefore the level of motivation of each subject should have been similar. The second threat, the representativeness of the artifact is a threat to external validity. It is possible that the SRSs used in this study may not be reflective of an actual requirements document. This threat is addressed to some degree by the fact that the SRSs were selected from public academia and industry projects. Hence these SRSs describe a realistic piece of software that is not a trivial system. 197 The third thread is the experience of subjects? - the most frequent concern with experiments using student subjects is that the results cannot be generalized to professionals. Experience is certainly an issue for this experiment, where the subjects had no industrial experience. However, we do not believe that the experiment was influenced by our subjects? limited experience with SRS analysis, because implementing A-CEG construction was rather straightforward. The last threat is one that is common to any empirical study. Researchers cannot draw a general conclusion based solely on the results of one study. Because of the presence of a large number of context variables, both known and unknown to the researchers, it cannot be assumed that results will always generalize beyond the setting in which the study was conducted. More confidence in a result comes from replication of a study. Therefore, this study needs to be replicated to build a body of empirical knowledge to allow concrete, general conclusions to be drawn. 8.9 Summary The objective of Experiment D is to compare and hence evaluate how well the A- CEG Construction Rules set performs in comparison to other A-CEG construction methods. 
This chapter presents a small-scale controlled experiment where the A-CEG Construction Rules set is compared to the general A-CEG construction guidelines used in industry. The results are promising since the study shows that the A-CEG Construction Rules set is significantly better than the general A-CEG construction guidelines in terms of both effectiveness and efficiency in finding the A-CEG elements in SRSs. 198 Be aware that there are several limitations to this experiment. First of all, this experiment is clearly based on a small sample size and therefore, one has to take into account the possibility of response bias. A larger-scale experiment is needed to validate our claims. Secondly, the experiment has to be replicated in different contexts. The replications should address changes in the SRSs, for example, using other different application types. The experiment should also be investigated in an industrial setting in order to evaluate whether it still provides positive effects. It would be especially interesting to investigate the method with professionals as subjects. Other further work also includes enhancement of the A-CEG Construction Rules set, either to include checklist items or to develop automation tools to facilitate the identification of A-CEG elements. 199 Chapter 9: Conclusion and Suggestions for Future Research 9.1 Principal Results of this Study and its Significance In the software development life cycle, requirements analysis is one of the important phases as any fault in this phase will be carried through the rest of the development. In particular, the SRS, a product of the requirement analysis phase, is so crucial to the success of a software project that it is hard to improve the quality and/or productivity of the project without first addressing the quality of the SRS. Studies revealed that faults made in the requirements phase are extremely expensive to repair and requirements faults are the largest class of faults typically found in a complex software project. Requirements must be correct if the rest of the development effort is to succeed. In order to improve quality and reliability of software continuously throughout the software development life cycle, it is imperative to develop measurement criteria along the life cycle, especially in the early stages. Activities like CEGA which can be carried out in the early phases of software development can ensure software quality and reliability. A review of the literature reveals the scarcity of any publicly reported software measurements related to the detection of problematic requirements and to software reliability prediction at the early stages of software development. This study focuses on developing an approach to enable the detection of requirements faults and prediction of software reliability at the requirements analysis stage when limited information about the software project is available. The proposed approach is based on the enhanced CEGA, and can be employed for SRS faults 200 detection and reliability prediction in an early stage and possible throughout the development life cycle. It is shown how the faults in the requirements specifications document can be systematically detected and how the output from the SRS faults detection process can be used as an input to enable the prediction of software reliability in the requirements analysis phase and other development phases. It is demonstrated that the use of the enhanced CEGA as a software reliability measurement tool can be more rigorous and intuitive. 
Related techniques, methods, and rules are developed to enhance the rigidity, repeatability, and scalability of the approach. The feasibility, usability, and scalability of the approach are experimentally validated. More specifically, this study accomplishes the following: ? Thoroughly analyzed the advantages, disadvantages, and other technical barriers for CEGA to serve as a software reliability measurement. ? Mathematically formalized CEGA and enhanced its rigidity, repeatability, and scalability toward a solid software reliability measurement. These formal definitions are necessary to ensure that the CEG is meaningful, true and of known accuracy, easy to be stored, represented, and implemented by computers, and can be updated easily in response to the frequent requests for requirements change in practice. ? Developed a CEGA-based taxonomy for SRS faults. One cannot expect to identify types of SRS faults that he or she never ever has thought about or come across. The contribution of the taxonomy lies in providing a systematic way to explore this implicitly existing knowledge by using the heuristics and in 201 increasing the requirements engineer?s awareness of the problematic areas in an SRS. ? Developed a two-phase CEGA-based method for SRS faults detection. This method allows software project stakeholders to identify problematic areas in the requirements at a very early development stage. Moreover, this method overcomes the shortcomings of other techniques that fail to ensure complete coverage of functional requirements. According to the cost ratio shown in Figure 2-9, applying our method at the requirements analysis phase could save as much as 99% (or even more) on detecting and fixing the SRS faults if the same SRS faults were not found and fixed until the testing phase. ? Developed a CEGA-based algorithm to quantify the impact of detected faults on software reliability. This is the first method of its kind in the literature. Starting from this method, software project stakeholders are allowed to determine at a very early development stage whether or not the project is at high risk of failure while limited information about the software project is available. They can use the predicted reliability to assess the risks of a project, determine whether a trade-off between new functionalities and the possible loss of reliability is cost-effective, mitigate the risks by removing the major contributor(s), or even cancel the project. However, the topic on how to make decisions based on the reliability predicted at the early stages of software development is beyond the scope of this study. Interest readers are referred to the literature on risk management and/or decision making for further information. 202 ? Examined the feasibility and scalability of the proposed techniques for detecting SRS faults and predicting the reliability at the requirements analysis phase via two case studies. ? Revealed many aspects of the nature of CEG construction, including o collected and distilled patterns in CEG construction. o identified and analyzed the influencing factors in CEG construction. o provided SRS writers with caveats to avoid common problems found in the practice of specifying SRSs. These problems might cause difficulties in identifying A-CEG elements and lead to increased risks of unreliable software products. o developed a set of rules to ease the task of CEG construction. 
According to the results of Experiment D, the mean accuracy of A-CEG constructors using the A-CEG Construction Rules was 8% higher than that of those using the general A-CEG construction guidelines, the mean recall 10% higher, and the mean efficiency 13% higher.

• Statistically evaluated the usability of the proposed rules.

• Statistically verified the impact of two influencing factors on the use of these rules.

The proposed approach provides methods for development teams to detect faults in a requirements specification and determine the uncertainty of their impact; it supports trade-off decisions and the evaluation of remedial actions. The approach is still open to improvement, but the results obtained so far are encouraging. It will enable software project stakeholders to effectively detect requirements faults and assess the quality of requirements early in development, and ultimately lead to improved software reliability if the identified faults are removed in time. Even with some limitations, the intrinsic advantages of our approach make it attractive from a usability perspective. Software project practitioners (including architects, requirements specialists, designers, coders, testers, and managers), regulators, and policy makers involved in the certification of software systems can benefit most from the techniques proposed in this study.

9.2 Advantages

Our approach has the following advantages and characteristics:

• Our approach is applicable at the requirements analysis phase, a very early stage in the software development life cycle. One obvious benefit of this characteristic is that fault detection and reliability prediction performed earlier in the development cycle have a dramatic effect on making software development practices better and more efficient. The CEGA technique discussed in this study can identify potential problem areas in SRSs that may lead to problems or faults in later development phases. Finding these problem areas in the requirements analysis phase decreases cost and prevents potential ripple effects from the SRS later in the development life cycle. The primary value gained from our CEGA-based approach is the capability of systematically analyzing and detecting SRS faults and predicting reliability early in the development process. What makes the approach especially attractive is that CEGA appears to be very effective in detecting other requirements fault types as well. We have empirically evaluated this broader aspect of the CEGA strategy on a simplified personal access control system and on a safety-related real-time control system used for nuclear power plants.

• Our approach distinguishes itself from others by its CEGA-based nature, which is rigid, methodical, and systematic, and therefore uniform, highly repeatable, and reliable. Only a graphical technique such as CEGA may be able to capture the implications in an SRS. CEGA can reveal complexity that may have been hidden by the words alone; it exposes incomplete, incorrect, and ambiguous functional requirements in an SRS.

• Our SRS faults detection methods ensure complete coverage of functional requirements. The SRS analyst can be confident that once CEGA is implemented, the functional requirements are, to the best of his or her knowledge, faultless, and no ambiguous, incomplete, inconsistent, or incorrect functionality will move into production.

• Many aspects of our approach can be automated.
• Our approach requires only the functional requirements and the associated operational profile, which are most likely to be available in the early development stages.

• Our approach is applicable to all types of software systems, although this study focuses on mission-critical systems, for which a reliable final product has top priority.

9.3 Limitations

This study has a few limitations, which practitioners must carefully weigh against other options on a case-by-case basis. The limitations of this study are:

1. Our approach assumes that the SRSs are written in plain English text, a primary form for stating requirements. It may not be applicable to SRSs specified in some formal languages.

2. Our approach is based on CEGA, which uses a CEG to provide a concise representation of the logical combinations and corresponding actions specified in an SRS. Not every aspect of a software system is specifiable by a CEG: the CEG can only capture functional requirements specified in the SRS and is primarily concerned with modeling the inputs and outputs of the system to be specified.

3. CEGA is not able to detect hidden requirements.

4. It is unclear how accurate the reliability prediction given by our approach would be. Further research will help answer this question.

5. Implementing our approach is costly. The most time-consuming task is to construct an A-CEG from a given informal specification. Automation is a good way to cut down the time and cost of A-CEG construction, and we believe that A-CEG construction can be partially automated.

6. In general, a significant amount of human intervention is still needed. The process of identifying SRS faults requires domain knowledge and an understanding of the system under study, as well as the inspector's creativity, experience, and even intuition. Without prior knowledge of the system, the faults found through CEGA may not be correct and the final reliability estimate may not be very meaningful. Unfortunately, automatic SRS fault detection is very difficult.

It might seem that these restrictions eliminate many potential applications. However, the overhead associated with our approach (even without tool support) is lower than the benefits the project can expect to gain. In particular, the cost of using our approach is small compared to the potential major downstream savings, because project teams avoid unnecessary rework and operational problems.

9.4 Suggestions for Future Research

We encourage further studies on the following topics:

• Further validation of the usability of the A-CEG Construction Rules. Through a small-scale controlled experiment we have assessed the usability of the rules for CEG generation. This has served as a proof of the feasibility and usability of the rules. Due to the intricacy of A-CEG construction and the scarcity of available empirical evidence, there is also a need to further validate our findings by considering SRSs from different domains and by explicitly controlling people-related factors, such as SRS analysis expertise in a particular domain.

• Improvement of the A-CEG Construction Rules. The A-CEG Construction Rules set is an attempt to ease A-CEG construction, and it is very helpful in adding consistency to the way we construct A-CEGs. However, while it remains useful for A-CEG construction and can produce significant cost and time savings in CEGA implementation, the rules set is still open to criticism and improvement.
• Automation of our CEGA-based approach. The automation of our approach is an interesting direction to pursue. Several aspects of the approach can be automated:
1) Conversion between the mathematical and graphical expressions of a CEG. Graphical techniques are especially valuable for communicating with people who speak different languages. Though informal, unscalable, and not strictly necessary in our approach, the graphical expression of a CEG helps project stakeholders find, illustrate, and analyze the software functional requirements, and eases communication among different project roles. It is therefore desirable to develop a tool that allows convenient conversion between the two CEG formats.
2) Tools that facilitate the detection of SRS faults. Our SRS fault detection methods are performed by humans through a time-consuming procedure of reading requirements documents and looking for errors. This is tedious at best and, at worst, prone to errors. Even if complete and general automation of the entire fault detection process is impossible, the most promising route to improved fault detection is a systematic manual or partially automated procedure. Our methods, in conjunction with a powerful analytical tool, would provide a rigorous, consistent, and cost-effective approach to detecting SRS faults.
3) Tools that facilitate A-CEG construction. Manually constructing an A-CEG for a bulky SRS is very time-consuming, even with the help of the A-CEG Construction Rules.
4) Identification of failure-relevant inputs. The unified failure-relevant input determination algorithm (shown in Figure 5-5) is ready for automation. Even when doable, manually determining failure-relevant inputs becomes challenging once the number of causes exceeds 15 (a brief illustrative sketch of such an automated determination follows this section).

• Validation of the accuracy of the reliability prediction given by our approach. Although the feasibility and scalability of our approach have been verified using real applications, it is unclear how accurate the reliability predictions given by our CEGA-based approach would be. It has been pointed out [40] that reliability prediction based on process or product measurements alone may not be sufficiently accurate; these predictions need corroboration. In practice, particularly when high levels of reliability need to be assured, it will be necessary to use several sources of evidence to support reliability claims, for instance evidence of process quality and evidence from software components and structure. Combining such disparate evidence to aid decision making is itself a difficult task, and research in this area is still at a rather early stage. The explosive complexity of today's software systems has made this task even more challenging. Nevertheless, we believe this kind of approach offers the best prospects for accurate reliability prediction and further refinement in the future.

• Expansion of our approach to other software development phases. A natural extension of this study is to apply similar techniques to later products in the life cycle, such as designs or even source code, where the potential savings are smaller. We believe that the key characteristics of our approach should apply to other software development phases.
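To make the automation opportunity in item 4 above concrete, the following is a minimal, hypothetical C++ sketch. It assumes, based on the A-CEG/B-CEG comparison illustrated in Appendix C, that a failure-relevant input is a combination of cause values on which the as-specified (A-CEG) and benchmark (B-CEG) Boolean functions disagree; the toy effect functions, the cause probabilities, and the independence assumption are all illustrative and are not taken from the dissertation's tool chain.

// Hypothetical brute-force sketch of failure-relevant input determination.
// All names, the toy effect functions, and the probabilities are illustrative assumptions.
#include <iostream>
#include <vector>

const int N = 3;                              // number of causes c1..c3 (toy example)
const double p[N] = {0.97, 0.99, 0.80};       // assumed operational-profile Pr(ci = true)

// Effect e1 as (hypothetically) read from a faulty SRS: cause c2 is missing.
bool effectA(const std::vector<bool>& c) { return c[0] && c[2]; }

// Effect e1 in the benchmark (corrected) specification.
bool effectB(const std::vector<bool>& c) { return c[0] && c[1] && c[2]; }

int main() {
    double failureProb = 0.0;
    for (int m = 0; m < (1 << N); ++m) {      // enumerate all 2^N cause combinations
        std::vector<bool> c(N);
        double prob = 1.0;
        for (int i = 0; i < N; ++i) {
            c[i] = ((m >> i) & 1) != 0;
            prob *= c[i] ? p[i] : 1.0 - p[i]; // causes assumed statistically independent
        }
        if (effectA(c) != effectB(c)) {       // disagreement => failure-relevant input
            failureProb += prob;
            std::cout << "failure-relevant input #" << m << "  Pr = " << prob << '\n';
        }
    }
    std::cout << "predicted reliability = " << 1.0 - failureProb << '\n';
    return 0;
}

Because this brute-force enumeration is exponential in the number of causes, Appendix B instead evaluates the same probability over a BDD, which is what allows the approach to scale beyond roughly 15 causes.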
Appendix A: List of Words that Point to Potential Ambiguities (adapted from [70])

Dangling Else: can, could, is one of, must, shall, should, will, would

Ambiguity of Reference: above, below, it, such, the previous, them, these, this, those

Ambiguous Adjectives: all, any, appropriate, custom, efficient, every, few, frequent, improved, infrequent, intuitive, invalid, many, most, normal, ordinary, rare, same, seamless, several, similar, some, standard, the complete, the entire, transparent, typical, usual, valid

Ambiguous Adverbs: accordingly, almost, approximately, by and large, commonly, customarily, efficiently, frequently, generally, hardly ever, in general, infrequently, intuitively, just about, more often than not, more or less, mostly, nearly, normally, not quite, often, on the odd occasion, ordinarily, rarely, roughly, seamlessly, seldom, similarly, sometime, somewhat, transparently, typically, virtually

Ambiguous Variables: the application, the component, the data, the database, the field, the file, the frame, the information, the message, the module, the page, the rule, the screen, the status, the system, the table, the value, the window

Ambiguous Verbs: adjust, alter, amend, calculate, change, compare, compute, convert, create, customize, derive, determine, edit, enable, improve, indicate, manipulate, match, maximize, may, minimize, might, modify, optimize, perform, process, produce, provide, support, update, validate, verify

E.G. versus I.E.: e.g., i.e.

Implicit Cases: also, although, as well, besides, but, even though, for all other, furthermore, however, in addition to, likewise, moreover, still, notwithstanding, otherwise, on the other hand, though, unless, whereas, yet, as required, as necessary

Temporal Ambiguity: after, annually, at a given time, at the appropriate time, bimonthly, biweekly, daily, every other month, every other week, fast, in a while, later, monthly, quarterly, quickly, soon, twice a month, twice a year, weekly, yearly

Boundary Ambiguity: up to, among, including

Totally Ambiguous: etc. (sentences that end with "...")

Appendix B: Sample Source Code for Calculating the Occurrence Probability of a BDD's Top Node

/************************************************************************
PROGRAM:     BDD's Top Node Occurrence Probability Calculation
FILE:        CEG_main.cpp
FUNCTION:    Constructing a BDD and calculating the top node's occurrence probability
AUTHOR:      Wende Kong
REVISIONS:   02/21/2005 Second release; 10/1/2004 First release
ENVIRONMENT: Visual C++ version 6.0, Pentium 4/1.0 GHz, 256 MB RAM, Windows XP
NOTES: This C++ program incorporates a software module compiled from the Binary Decision
Diagrams Library Package Version 2.3 (Copyright © 1996, Jorn Lind-Nielsen, all rights
reserved). This software tool is applicable to other software applications, too, with
minor modification. The following steps are required to compile this source code:
(1) Download the file "buddy19.zip" to your computer from
    http://www.ee.pdx.edu/~alanmi/research/softports.htm
(2) Unzip the file into a local directory, which will become the home directory of the
    BuDDy static library
(3) Open "Buddy.dsw" in Microsoft Visual C++ 6.0 (click "File -> Open Workspace...")
(4) Click "Build -> Rebuild All" and ignore the 15 warnings produced during compilation;
    this creates buddy.lib
(5) Create an empty project of type "Console Application"
(6) Add this source code to the project's source files
(7) Add "\buddy\include" to Project -> Settings -> C/C++ -> Additional include directories
(8) Add "\buddy\Debug\buddy.lib" to Project -> Settings -> Link -> Object/library modules
(9) Compile and link the project. An .exe file is created.
Execution of this .exe file will yield the occurrence probability that PACS's A-CEG fails.
In this sample source code, we assume that the identified failure-relevant inputs are
c1'c2'c4'c5' ∪ c1'c2c4'c5' ∪ c1c2'c4'c5' (primes denote negated causes).
*************************************************************************/

#include <iostream>   // needed for cout/endl
#include "bdd.h"      // BuDDy BDD library header

using namespace std;

#define N 14  // N is the number of causes plus 1. The extra slot is needed because the
              // element with index 0 of the cause array is purposely left unused.

float bddProbCal(bdd & currentBdd);

static float NodeProb[N] = {0.97, 0.97, 0.97, 0.99, 0.97, 0.8, 0.8, 0.8, 0.98, 0.5, 0.5, 0.001, 0.99};
// NodeProb stores the probabilities of all causes:
// Pr(c1)=0.97, Pr(c2)=0.97, Pr(c3)=0.97, Pr(c4)=0.99, Pr(c5)=0.97, Pr(c6)=0.8, Pr(c7)=0.8,
// Pr(c8)=0.8, Pr(c9)=0.98, Pr(c10)=0.5, Pr(c11)=0.5, Pr(c12)=0.001, Pr(c13)=0.99.

// The following function corresponds to the revised recursive algorithm shown in Figure 5-17.
float bddProbCal(bdd & currentBdd)
{
    float PL, PH, q;

    // consider the "High" branch
    if (bdd_high(currentBdd) == (bdd)1)
        PH = 1;
    else if (bdd_high(currentBdd) == (bdd)0)
        PH = 0;
    else
        PH = bddProbCal(bdd_high(currentBdd));

    // consider the "Low" branch
    if (bdd_low(currentBdd) == (bdd)0)
        PL = 0;
    else if (bdd_low(currentBdd) == (bdd)1)
        PL = 1;
    else
        PL = bddProbCal(bdd_low(currentBdd));

    q = NodeProb[bdd_var(currentBdd) - 1];
    // calculate and return the probability value of the node
    return (q * PH + (1 - q) * PL);
}

int main(void)
{
    int i;
    bdd c[N + 1];  // c[i] corresponds to cause ci; c[0] is not used.
    bdd I_1;       // I_1 is the expression of failure-relevant input c1'c2'c4'c5'.
    bdd I_2;       // I_2 is the expression of failure-relevant input c1'c2c4'c5'.
    bdd I_3;       // I_3 is the expression of failure-relevant input c1c2'c4'c5'.
    bdd I_ALL;     // I_ALL is the union of all failure-relevant inputs.

    // Initialize the BDD storage.
    bdd_init(100, 100);
    bdd_setvarnum(N);
    // The variable order in the final ROBDD is c1, c2, c3, ..., c13.
    for (i = 1; i < N; i++)
        c[i] = bdd_ithvar(i);

    I_1 = bdd_not(c[1]) & bdd_not(c[2]) & bdd_not(c[4]) & bdd_not(c[5]);
    I_2 = bdd_not(c[1]) & c[2] & bdd_not(c[4]) & bdd_not(c[5]);
    I_3 = c[1] & bdd_not(c[2]) & bdd_not(c[4]) & bdd_not(c[5]);
    I_ALL = I_1 | I_2 | I_3;

    cout << "Final Prob=" << bddProbCal(I_ALL) << endl;

    bdd_done();  // release the BDD storage
    return 0;
}

Appendix C: Results of Case Study A

B1. PACS's A-CEG

Figure Appendix C-1: Graphical Expression of PACS's A-CEG

Figure Appendix C-2: Mathematical Expression of PACS's A-CEG

B2. Identified Faults in PACS's A-CEG

No faults were identified while constructing PACS's A-CEG and conducting the ambiguities review. The following faults were detected using the CEG validation algorithm described in Section 4.5.3:
• Wrong Boolean function for effect e1 (missing cause c4);
• Wrong Boolean function for effect e2 (missing cause c4);
• Missing effect e3;
• Missing effect e10.

B3. PACS's B-CEG

Figure Appendix C-3: Graphical Expression of PACS's B-CEG

Figure Appendix C-4: Mathematical Expression of PACS's B-CEG

B4. Definitions of Effects in PACS's A-CEG and B-CEG

Table Appendix C-1: Definitions of Effects in PACS's A-CEG and B-CEG
Effect | Description
e1  | Displaying "Enter PIN" on the screen
e2  | Displaying "Retry" on the screen
e3  | Displaying "Access Denied" to officer
e4  | Displaying "Access Denied" on the screen
e5  | Displaying "System Failure" to officer
e6  | Displaying "Invalid PIN" on the screen
e7  | Recording a failed entry into a file
e8  | Displaying "see officer" to officer
e9  | Displaying "see officer" on the screen
e10 | Displaying "Please Proceed" on the screen
e11 | Recording and reporting a successful entry
e12 | Opening the gate
e13 | Resetting the system and displaying "Insert Card"
e14 | Locking the gate

B5. Identified Failure-relevant Inputs

The identified failure-relevant inputs for PACS form a union of product terms (conjunctions of cause values) over a subset of the causes c1 through c12.

B6. PACS's Operational Profile (obtained from [5])

Table Appendix C-2: PACS's OP
Cause | Description | Probability
c1  | Entering a valid card at the first attempt | 0.97
c2  | Entering a valid card at the second attempt | 0.97
c3  | Entering a valid card at the third attempt | 0.97
c4  | Database is available for access | 0.99
c5  | Hardware failure | 0.001
c6  | Entry of the digits of the PIN within the 5-second time limit | 0.97
c7  | Entering a valid PIN at the first attempt | 0.80
c8  | Entering a valid PIN at the second attempt | 0.80
c9  | Entering a valid PIN at the third attempt | 0.80
c10 | Entry of the 1st digit within the 10-second time limit | 0.98
c11 | Guard overriding: the guard allows the user to enter | 0.50
c12 | Officer resets the system | 0.50
c13 | User able to pass within the 30-second time limit after the gate is opened | 0.99

B7. The Predicted Reliability for PACS

Implementing source code similar to that in Appendix B yields 0.003856, which corresponds to the occurrence probability that PACS's A-CEG fails. Therefore, the predicted reliability of PACS is 1 - 0.003856 = 0.996144.
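As a hand check of the arithmetic performed by the sample program in Appendix B, consider the three simplified failure-relevant inputs assumed in that listing, together with the cause probabilities stated in its comments (Pr(c1) = Pr(c2) = 0.97, Pr(c4) = 0.99, Pr(c5) = 0.97). Assuming the causes are statistically independent, as the recursive BDD evaluation implies, and noting that the three product terms are mutually exclusive:

\[
\begin{aligned}
\Pr(I_1) &= (1-0.97)(1-0.97)(1-0.99)(1-0.97) \approx 2.7\times 10^{-7},\\
\Pr(I_2) &= (1-0.97)(0.97)(1-0.99)(1-0.97) \approx 8.7\times 10^{-6},\\
\Pr(I_3) &= (0.97)(1-0.97)(1-0.99)(1-0.97) \approx 8.7\times 10^{-6},\\
\Pr(I_{\mathrm{ALL}}) &= \Pr(I_1)+\Pr(I_2)+\Pr(I_3) \approx 1.8\times 10^{-5}.
\end{aligned}
\]

This matches the value the sample program computes for its simplified input set; the full PACS prediction of 1 - 0.003856 = 0.996144 quoted above uses the complete set of failure-relevant inputs from Section B5.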
Appendix D: Reporting Tables Used in Experiment D

Table Appendix D-1: Identified Causes and Effects (for Groups I and II)
Sentence No. | Cause/Effect Index | Description
1            | e.g.: c1           | e.g.: The user provides the speed value from the keyboard
...          | ...                | ...

Table Appendix D-2: Identified Constraints (for Groups I and II)
No. | Constraint
1   | e.g.: REQUIRE(c1, c2)
... | ...

Table Appendix D-3: Identified Logical Relationships (for Groups I and II)
No. | Logical Relationship
1   | e.g.: e1 := c1 ∧ c2
... | ...

Table Appendix D-4: Activity-Effort Log Sheet (for Groups I and II)
Section No. | Start time    | End time      | Activities                      | Percentage, %
1           | e.g.: 3:20 pm | e.g.: 4:30 pm | e.g.: identifying events        | e.g.: 10%
            |               |               | e.g.: removing duplicate events | e.g.: 20%
...         | ...           | ...           | ...                             | ...

Instructions for using Table Appendix D-4: This table is designed to keep track of your learning process. You will be graded partly on this document. Please follow the instructions carefully.
1. Record data for each session. A session is any time from when you start working on the application until you take a break.
2. Note the session start time and session end time. A session may be as short as five minutes.
3. An activity is anything you do during a session. It can be anything from learning how to use the rules, reading and understanding the specification, applying rules to identify causes, effects, logical relationships, and constraints, and applying rules to refine the results, to drawing the cause-effect graphs. Record every activity in a session.
4. Be honest and attentive. Although a detailed recording of activities will be appreciated, you need not be creative with your data.

Table Appendix D-5: Training Activity-Effort Log Sheet (for Group II only)
Section No. | Start time | End time | Training Activities                                         | Duration, in minutes
1           |            |          | Step 1: Explaining rules                                    |
2           |            |          | Step 1: Showing an example of applying rules to a sentence  |
            |            |          | Step 2: Practice 1 on applying rules to sentences           |
            |            |          | Step 3: Practice 2 on applying rules to sentences           |
3           |            |          | Step 1: Explaining the workflow                             |
            |            |          | Step 2: Showing how to use the workflow                     |
            |            |          | Step 3: Practice 1 on how to use the workflow               |
            |            |          | Step 4: Practice 2 on how to use the workflow               |

Table Appendix D-6: Rules to Identify Constraints (for Group II only)
Constraints           | Applicable Rules (mark "N/A" if no rules are applicable)
e.g.: REQUIRE(c1, c2) | e.g.: Rule 7.9
...                   | ...

Table Appendix D-7: Rules to Identify Causes and Effects (for Group II only)
Event index | Cause/Effect | Applicable Rules
e.g.: c1    | e.g.: Cause  | e.g.: Rule 7.10
...         | ...          | ...

Table Appendix D-8: Rules to Identify Logical Relationships (for Group II only)
Logical Relationships | Applicable Rules (mark "N/A" if no rules are applicable)
e.g.: e1 := c1 ∧ c2   | e.g.: Rule 7.5
...                   | ...

Appendix E: Questionnaire Used in Experiment D (to assess subjects' background)

Your Name: _____________________

1. Have you taken the following classes? (Please darken the appropriate option(s))
○ Data Structure and Algorithms   ○ Computer Systems Architecture   ○ Object-Oriented Programming   ○ Database Design   ○ Computer Networks   ○ Information Security   ○ Operating Systems   ○ Compiling Principle   ○ Computer Graphics   ○ Software Engineering   ○ Software Testing   ○ Software Safety   ○ Ensuring Software Reliability and its Integrity

2. Do you have any other professional experience relevant to software engineering?

3. Why did you take the course and what would you most like to get out of the course?

4. What research are you working on?

5. What is recursion? (Please darken the correct option)
○ A function issues a call to itself
○ A function is repetitively called in an application
○ An array with an infinite number of elements
○ Other (please specify): _______________________________________

6. (a) Which is the fastest sorting algorithm? (Please darken the correct option)
○ Quick sort   ○ Insertion sort   ○ Bubble sort   ○ Heap sort   ○ Merge sort   ○ Selection sort   ○ Shell sort   ○ Bin sort (bucket sort)
○ Other (please specify): _______________________________________
(b) Order these sorting algorithms.

7. What is a function point? (Please darken the correct option)
o It is the main objective of a function as specified in the software requirements specification
o It is the metric that represents a function's contribution in LOC to the net SLOC. It is expressed as a fraction of the net SLOC.
o It is a measure of the size of computer applications and the projects that build them. The size is measured from a functional, or user, point of view.
o None of the above.

8. What is the Cause-Effect Graphing technique? (Please darken the correct option)
o It is a black-box testing technique that was originally proposed to generate test cases by transforming a natural language SRS into an acyclic Boolean logic network
o It is a software reliability measurement that aids in identifying requirements that are incomplete and ambiguous
o It is a computer graphics technique used to render photographic-quality, realistic images
o None of the above

9. Which of the following computer languages have you had experience with? What is your level of expertise?
○ I have NOT used any of these ever.
○ I have used the following:
Language | Experience (months) | Expertise Scale (1-10, 10 is the strongest level)
C        |                     |
C++      |                     |
VB       |                     |
Java     |                     |
SQL      |                     |
JSP      |                     |
ASP      |                     |
HTML     |                     |
PHP      |                     |
Perl     |                     |
other:   |                     |

10. Please darken the testing techniques that you have ever learned/used.
o Data Flow Testing   o Control Flow Testing   o Loop Testing   o Domain Testing   o Boundary Testing   o Transaction Flow Testing   o Code Walk-through   o Code Inspection   o Compatibility Testing   o Configuration Testing   o Localization Testing   o Stress Testing   o Performance Testing   o Verification & Validation   o Peer review   o Decision table testing (Cause-effect graphing testing)

Appendix F: Postmortem Questionnaire Used in Experiment D (to assess the usability of the A-CEG Construction Rules set)

Your Name: _____________________
(Please tick the blank most closely corresponding to your feelings on the statements below)

Q 1: "The CEG technique is useful for me to understand the SRS" (Usefulness)
○ Strongly Agree   ○ Agree   ○ Neither Agree nor Disagree   ○ Disagree   ○ Strongly Disagree

Q 2: "The CEG technique is very easy and simple to use" (Ease of Use)
○ Strongly Agree   ○ Agree   ○ Neither Agree nor Disagree   ○ Disagree   ○ Strongly Disagree

Q 3: "I learned to use the CEG technique quickly" (Ease of Learning)
○ Strongly Agree   ○ Agree   ○ Neither Agree nor Disagree   ○ Disagree   ○ Strongly Disagree

Q 4: "I am satisfied with the CEG technique" (Satisfaction)
○ Very good   ○ Good   ○ Fair   ○ Bad   ○ Terrible

Q 5: What is your general impression of the CEG technique? (In general)
○ Very good   ○ Good   ○ Fair   ○ Bad   ○ Terrible

(Please describe your answers to the questions below)

Q 6: What are the strengths and weaknesses of the CEG technique that you were assigned?

Q 7: What in particular do you like or dislike about the CEG technique? Do you have other comments or suggestions that can help us improve the CEG technique?
Q 8: In your opinion, under what circumstances and to what extent the assigned technique has the advantages as an A-CEG construction technique, and under what circumstances and to what extent the technique has the disadvantages, why? Q 9: What are the problems that you found when using the technique? 227 Glossary A-CEG Actually-implemented Cause-Effect Graph B-CEG Benchmark Cause-Effect Graph BDD Binary Decision Diagram CEG Cause-Effect Graph CEGA Cause-Effect Graphing Analysis CMM Capability Maturity Model CMMI Capability Maturity Model Integration DD Defect Density measurement DoD Department of Defense FDN Fault-Days Number measurement FN False Negative FP False Positive MIS Management Information System O-CEG Oracle Cause-Effect Graph PACS Personal Access Control System ROBDD Reduced Ordered Binary Decision Diagram RSCR Requirements Specifications Change Request measurement RT Requirements Traceability measurement SRS Software Requirements Specifications TN True Negative TP True Positive UML Unified Modeling Language V&V Verification & Validation 228 Bibliography [1] IEEE Computer Society, IEEE standard glossary of software engineering terminology, 1990. IEEE Std. 610.12-1990. [2] IEEE Computer Society, IEEE Standard for a Software Quality Metrics Methodology, 1998. IEEE Std. 1061-1998. [3] Lyu, M. R., Handbook of Software Reliability Engineering, New York : McGraw- Hill publishing, 1995. ISBN: 0-07-039400-8. [4] Musa, J. and Okumoto, K., Software Reliability: Measurement, Prediction, Application, New York : McGraw-Hill Book Company, 1987. ISBN: 0-07-044093-X. [5] Li, M., On the Nature of Relationships Between Measures and Reliability, Ph.D. Dissertation in Materials and Nuclear Engineering, College Park: University of Maryland, 2002. [6] Neumann, P. G., Illustrative Risks to the Public in the Use of Computer Systems and Related Technology, Feb. 2008, [7] Li, M. and Smidts, C. S., "A Ranking of Software Engineering Measures Based on Expert Opinion," IEEE Transactions on Software Engineering, vol. 29, pp. 811- 24, 2003. [8] Vliet, H. V., Software Engineering : Principles and Practice, 3rd Edition, Hoboken, NJ : John Wiley & Sons, 2008. ISBN: 9780470031469. [9] Chrissis, M. B., Konrad, M. and Shrum, S., CMMI: Guidelines for Process Integration and Product Improvement, 2nd Edition, New York : Addison-Wesley Professional, 2006. ISBN-10: 0321279670. [10] Fenton, N. and Pfleeger, S., Software Metrics - A Rigorous and Practical Approach, s.l. : Brooks Cole Publishing Company, 1998. ISBN: 0534954251. [11] IEEE Computer Society, IEEE Standard Dictionary of Measures of the Software Aspects of Dependability, 2006. IEEE Std. 982.1-2005. [12] Smidts, C. S. and Li, M., Software Engineering Measures for Predicting Software Reliability in Safety Critical Digital Systems, Nuclear Regulatory Commission, Office of Nuclear Regulatory Research, Washington DC : USNRC, 2000. Technical Report NUREG/GR-0019. [13] Boehm, B. W., Software Cost Estimation with COCOMO II, Englewood Cliffs : Prentice-Hall, Inc., 2000. [14] Cook, D. A., "Requirements Risks Can Drown Software Projects," CrossTalk:The Journal of Defense Software Engineering, no. 2, 2002. http://www.stsc.hill.af.mil/crosstalk/2002/04/leishman.html. [15] The Standish Group International, Inc., the Standish Group CHAOS Report, 1995. Available online at www.standishgroup.com/chaos.html. [16] Easterbrook, S., et al., "An Experience Report on Requirements Reliability Engineering Using Formal Methods," IEEE Transactions on Software Engineering, vol. 24, no. 
1, pp. 4-14, Jan. 1998. [17] Sheldon, F. et al, "Reliability Measurement from Theory to Practice," IEEE Software, vol. 9, no. 4, July 1992. [18] Martin, J., An Information Systems Manifesto, 1st Edition, Upper Saddle River, NJ, USA : Prentice Hall PTR, 1986. ISBN:0134647696. 229 [19] Software Engineering Institute, Process Maturity Profile of the Software Community, 2001. [20] Leffingwell, D. and Widrig, D., Managing Software Requirements: A Unified Approach, Reading, MA : Addison Wesley Publishing Co., 2000. ISBN:0201615932. [21] Graham, D., Finzi, S. and Glib, T., Software Inspection, New York : Addison- Wesley, 1993. ISBN-10: 0201631814. [22] McConnell, S., Rapid Development: Taming Wild Software Schedules, Redmond : Microsoft Press, 1996. p. 72, ISBN: 1-55615-900-5. [23] "Process Improvement and the Corporate Balance Sheet," IEEE Software, vol. 10, no. 4, pp. 28-35, July 1993. [24] Davis, A. M. and Leffingwell, D. A., Using Requirements Management to Speed Delivery of Higher Quality Applications, 1995, available at: http://tinf2.vub.ac.be/~dvermeir/courses/software_engineering/696wp.pdf. [25] Pfleeger, S. L. and Atlee, J., Software Engineering: Theory & Practice, Third Edition, Upper Saddle River : Pearson Education, Inc., 2006. ISBN: 0-13-146913-4. [26] Kong, W., Shi, Y. and Smidts, C. S., "Early Software Reliability Prediction Using Cause-effect Graphing Analysis," The 53rd Annual Reliability and Maintainability Symposium (RAMS 2007), pp. 173 - 178, January 22-25, 2007. [27] Lubashevsky, A., "Early Estimation of Software Reliability in Large Telecom Systems," CrossTalk, June 2002. [28] Fagan, M., "Advances in Software Inspections," IEEE Transactions on Software Engineering, vol. 12, no. 7, pp. 744-751, July 1986. [29] Gaffney, J. E. and Davis, C. F., "An Automated Model for Software Early Error Prediction (SWEEP)," Proceedings of the 13th Minnowbrook Workshop on Software Reliability, July 1990. [30] Agreti, W. W. and Evanco, W. M., "Projecting Software Defects from Analyzing Ada Design," IEEE Transactions on Software Engineering, vol. 18, no. 11, pp. 988- 997, Nov. 1992. [31] Rome Laboratory, Methodology for Software Reliability Prediction and Assessment, 1992. TechRep RL-TR-92-52, Vol. 1-2. [32] Smidts, C. S., Sova, D. and Mandela, G. K., "An Architectural Model for Software Reliability Quantification," The Eighth International Symposium On Software Reliability Engineering, vols. 2-5 Nov., pp. 324 - 335, 1997. [33] Smidts, C. S., Stutzke, M. and Stoddard, R. W., "Software Reliability Modeling: An Approach to Early Reliability Prediction," IEEE Transactions on Reliability, vol. 47, no. 3, pp. 268-278, September 1998. [34] Yin, M.L., Hyde, C. L. and James, L. E., "A Petri-Net Approach for Early-Stage System-Level Software Reliability Estimation," Proceedings of Annual Reliability and Maintainablity Symposium (RAMS'00), pp. 100-105, 2000. [35] Zhao, J., Liu, H. and Yang, X., "Early Stage Software Reliability Estimation with Stochastic Reward Nets," Journal of Donghua University(English Edition), no. 3, 2003. [36] Tripathi, R. and Mall, R., "Early Stage Software Reliability and Design Assessment," 12th Asia-Pacific Software Engineering Conference (APSEC'05), pp. 619-628, 2005. 230 [37] Hu, Q. P., et al., "Early Software Reliability Prediction with Extended ANN Model," Proceedings of the 30th Annual International Computer Software and Applications Conference (COMPSAC'06), vol. 02, pp. 234 - 239, 2006. 
[38] Mei, D., "Early Software Reliability Prediction with Wavelet Networks Models," The 2007 International Conference on Intelligent Systems and Knowledge Engineering, Oct. 15-16, 2007. [39] Cheung, L., et al., "Early Prediction of Software Component Reliability," Proceedings of the 30th International Conference on Software Engineering (ICSE? 08), pp. 111-120, May 10-18, 2008. [40] Smidts, C. S., et al., A Large Scale Validation of a Methodology for Assessing Software Quality, Office of Nuclear Regulatory Research, Washington, DC : USNRC, 2009 (submitted but not yet published). [41] Jones, C., Measuring Global Software Quality, Burlington, MA : s.n., 1995. [42] IEEE Computer Society, IEEE Standard Dictionary of Measures to Produce Reliable Software, 1988. IEEE Std. 982.1-1988. [43] Lawrence, J. D., et al., Assessment of Software Reliability Measurement Methods for Use in Probabilistic Risk Assessment, Fission Energy and Systems Safety Program, Lawrence Livermore Nationall Laboratory, 1998. Technical Report UCRLID-136035. [44] Li, M., et al., "Validation of a Methodology for Assessing Software Reliability," Proceedings of the 15th International Symposium on Software Reliability Engineering (ISSRE?04), pp. 66- 76, Nov. 2-5, 2004. [45] Myers, G. J., et al., The Art of Software Testing, 2nd Edition, Hoboken : John Wiley & Sons, Inc., 2004. pp. 65-88, ISBN: 0-471-46912-2. [46] Elmendorf, W. R., Cause-effect Graphs in Functional Testing, Poughkeepsie, NY : IBM Systems Development Division, 1973. TR-00.2487. [47] Software Reliability: Principle and Practices, New York : Wiley-Interscience Inc., 1976. pp. 218-227, ISBN: 0-471-62765-8.. [48] Nursimulu, K. and Probert, R. L., "Cause-Effect Graphing Analysis and Validation of Requirements," Proceeding of the 1995 Conference of the Centre for Advanced Studies on Collaborative Research, pp. 46-61, Nov. 1995. [49] Paradkar, A., "On the Experience of Using Cause-Effect Graphs for Software Specification and Test Generation," IBM Press, p. 51, 1994. [50] Paradkar, A., Tai, K. C. and Vouk, M. A., "Specification-based testing using cause-effect graphs," Annals of Software Engineering, vol. 4, pp. 133 - 157, 1997. [51] Bender RBT Inc., BenderRBT Cause-Effect Graphing Users Guide, 2006, retrieved August 2nd, 2008 http://www.benderrbt.com/BenderRBT-Cause- Effect%20Graphing%20User%20Guide.pdf. [52] Le, J. C., Perspectives on Software Requirements, Boston : Kluwer Academic Publishers Group, 2004. ISBN: 1-4020-7625-8. [53] IEEE Computer Society, IEEE Recommended Practice for Software Requirements Specifications, 1998. ANSI/IEEE Standard 830-1998. [54] Davis, A., Software Requirements: Objects, Functions, and States, Englewood Cliff, NJ : Prentice Hall, 1993. 231 [55] Schneider, G. M., Martin, J. and Tsai, W. T., "An experimental Study of Fault Detection in User Requirements Documents," ACM Transactions on Software Engineering and Methodddology, vol. 1, no. 22, pp. 188-204, 1992. [56] Gause, D. C. and Weinberg, G. M., Exploring Requirements: Quality Before Design, New York, NY : Dorset House, 1989. [57] Kamsties, E., Surfacing Ambiguity in Natural language Requirements, Ph.D. Dissertation, Fachbereich Informatik, University Kaiserslautern, Germany, 2001. [58] Wei?leder, S. and Sokenou, D., "Cause-Effect Graphs for Test Models Based on UML and OCL," [59] Paradkar, A., Tai, K.C. and Vouk, M., "Automated test generation for cause- effect graphs," IEEE Transactions on Reliability, vol. 45, pp. 515-530, 1996. 
[60] Boris, B., Software Testing Techniques, 2nd Edition, New York : Van Nostrand Reinhold, Inc., 1990. [61] Ostrand, T. J. and Balcer, M. J., "The category-partition method for specifying and generating fuctional tests," Communications of the ACM, vol. 31, no. 6, 1988. [62] Copeland, L., A Practitioner's Guide to Software Test Design, Norwood : Artech House, 2004. ISBN:158053791x. [63] Ghose, S., Software Requirements Specifications for LOCAT, College Park, MD, USA : University of Maryland, 2004. [64] U.S. National Aeronautics and Space Administration (NASA), NASA Software Documentation Standard - Software Engineering Program , July 1991. NASA-STD- 2100-91. [65] U.S. Department of Defense, Software Development and Documentation, Philadelphia : Naval Publications and Forms Center, Dec. 1994. MIL-STD-498 Military Standard. [66] ISO/IEC, Systems and software engineering - Software life cycle processes, s.l. : International Organization for Standardization/International Electrotechnical Commission, 2008. ISO/IEC 12207:2008. [67] Hammer, T. F., Huffman, L. L. and Rosenberg, L. H., "Doing Requirements Right the First Time," Crosstalk, Journal of Defense Software Engineering, December 1998. [68] Hayes, J. H., "Building a Requirement Fault Taxonomy: Experiences from a NASA Verification and Validation Research Project," Proceedings of the 14th International Symposium on Software Reliability Engineering (ISSRE?03), pp. 49- 59, Nov. 2003. [69] IEEE Computer Society, IEEE Standard for Software Verification and Validation, 2004. IEEE Std 1012-2004. [70] Ryser, J., Berner, S. and Glinz, M., On the State of the Art in Requirements- based Validation and Test of Software, Zurich, Switzerland : the University of Zurich, 1998. Available at: ftp://ftp.ifi.unizh.ch/pub/techreports/TR-98/ifi-98.12.pdf. [71] Rumbaugh, J. and Jacobson, I., The Unified Modeling Language Reference Manual, Boston : Addison-Wesley, 1999. [72] Spivey, J. M., The Z Notation: A Reference Manual, 2nd Edition, London : Prentice-Hall, 1992. [73] Abrial, J. R., The B Book - Assigning Programs to Meanings, England : Cambridge University Press, 1996. 232 [74] Porter, A. A., Votta, L. G. and Basili, V. R., "Comparing Detection Methods For Software Requirements Inspections: A Replicate Experiment," IEEE Transactions on Software Engineering, vol. 21, no. 6, pp. 563 - 575, 1995. [75] Mendenhall, W., Beaver, R. J. and Beaver, B., Introduction to Probability and Statistics, 13th Edition, Belmont : Duxbury Press, 2008. p. 159, ISBN-10: 0495389536. [76] Hoffman, D., "A Taxonomy for Test Oracles," Quality Week '98, 1998. [77] Whitesitt, J. E., Boolean Algebra and Its Applications, Mineola, NY, USA : Dover Publications, 1995. 0486684830.. [78] Gregg, J. R., Ones and Zeros: Understanding Boolean Algebra, Digital Circuits, and the Logic of Sets, Hoboken : Wiley-IEEE Press, 1998. 978-0780334267.. [79] Bryant, R. E., "Graph Based Algorithms for Boolean," IEEE Transactions on Computer Engineering, vols. C-35, no. no. 8, pp. 677-691, Aug. 1986. [80] Reay, K. A. and Andrews, J. D., "A Fault Tree Analysis Strategy Using Binary Decision Diagrams," Reliability Engineering and System Safety, vol. 78, no. 1, pp. 45-56, 2002. [81] Bryant, R. E., "Graph-Based Algorithms for Boolean Function Manipulation," IEEE Transactions on Computers, vol. 35, no. 8, pp. 677-691, August 1986. [82] IT-University of Copenhagen (ITU), BuDDy: Binary Decision Diagram papckage Release 2.2, 2002. [83] Walton, G. H., Poore, J. H. and Trammell, C. 
J., "Statistical Testing of Software Based on a Usage Model," Software Practice & Experience, vol. 25, no. 11, pp. 97- 108, January 1995. [84] Musa, J., "The operational profile in software reliability engineering: an overview," Proceedings of 3rd International Symposium on Software Reliability Engineering, pp. 140-154, Oct. 7-10, 1992. [85] Sandfoss, R.V. and Meyer, S. A., "Input Requirements needed to Produce an Operational Profile for a New Telecommunications System," Proceedings of the Eighth International Symposium on Software Reliability Engineering (ISSRE '97), pp. 29 - 39, Nov. 2-5, 1997. [86] Elbaum, S. and Narla, S., "A Methodology for Operational Profile Refinement," Proceedings of the 2001 Annual Reliability and Maintainability Symposium (RAMS 2001), pp. 142 - 149, Jan. 22-25, 2001. [87] Gittens, M., Lutfiyya, H. and Bauer, M., "An Extended Operational Profile Model," Proceedings of the 15th International Symposium on Software Reliability (ISSRE 2004), pp. 314- 325, Nov. 2-5, 2004. [88] Lockheed Martin Corporation Inc., PACS (Personal Access Control System) Requirements Specification, Gaithersburg, MD : Lockheed Martin Corporation Inc., 1998. [89] Center for Computer-Integrated Surgical Systems and Technology (CISST), Software Requirements Specification for MRC-II System, Baltimore, MD : Johns Hopkins University, 2003. [90] DPUSRS-01, Software Requirements Specifications for the Instrument A Data Processing Unit for the Company X Gamma Ray Burst Explorer, 2003. [91] Raytheon TI Systems, Software Requirements Sepcification for the Long Range Advanced Scout Surveillance System (LRAS3), 1997. 233 [92] Brossat, JeanPhilippe, Software Requirements Specification for Qheadache, Version 1.0, 2003. [93] Department of Computer Science, University of Toronto, Software Requirements Specification for the Graph Editor, 2001. [94] Amoson, J., et al., Software Requirements Specification for the ?Software Engineering Tool?, 2001. [95] Acorn Software, Bus Tracking System Software Requirements Specification, 2006. [96] Tayyab, A. A., Software Requirements Specifications forFloristExchange Project, 2001. [97] Fox, J., et al., Software Requirements Specifications:PICASSO, s.l. : Computer Science Dept., Univ. Alabama in Huntsville, 1997. Technical Report TR-UAH-CS- 1997-04. [98] Kao, A. and Poteet, S. R., Natural Language Processing and Text Mining, New York : Springer Publishing Company, 2006. ISBN-13: 978-1846281754.. [99] Shneiderman, B., Designing the user interface: Strategies for effective human- computer interaction, MA : Addison-Wesley, 1987. [100] Labbe, C. and Labbe, D., "Inter-Textual Distance and Authorship Attribution," Journal of Quantitative Linguistics, vol. 8, no. 3, pp. 213-231, 2001. [101] Labbe, C. and Labbe, D., "A Tool for literary studies- Intertexual Distance and Tree Classification," Literary and Linguistic Computing, vol. 21, no. 3, pp. 311-326, 2006. [102] Jones, C., Applied Software Measurement: Assuring Productivity and Quality, 2nd Edition, New York : McGraw-Hill, 1996. ISBN: 0070328269. [103] Box, G. E., Hunter, J. S. and Hunter, W. G., Statistics for Experimenters: Design, Innovation, and Discovery, 2nd Edition, New York : Wiley-Interscience, 2005. ISBN: 0471718130. [104] Image Processing Group, University of Zagreb, CHAIRMAN - Conference Management System Software Requirements Specification (SRS), 2004. [105] GRUPPE 13, IMS Software Requirement Specification, June, 2003. 
[106] Ghose, S., Software Requirements Specification for Student Registry Query System (SRQS), College Park, MD, USA : University Maryland, 2004. [107] Ghose, S., Software Requirements Specifications for SSP, College Park, MD, USA : University of Maryland, 2004. [108] Ghose, S., Software Requirements Specifications for TELLERFAST, College Park, MD, USA : University of Maryland, 2004. [109] The Themas Team, The Energy Management System Software Requirements Specifications, 1998. [110] Ghose, S., Software Requirements Specifications for an Word Processor Unit (WPU), College Park, MD, USA : University of Maryland, 2004. [111] Peck, R. and Devore, J. L., Statistics: The Exploration & Analysis of Data, Pacific Grove, CA : Duxbury Press, 2007. ISBN-13: 978-0495390879. [112] Raynald Levesque and SPSS Inc., Programming and Data Management for SPSS? Statistics 17.0 A Guide for SPSS Statistics and SAS? Users, Chicago, IL : SPSS Inc., 2008. 234 [113] Hastie, S., Software Quality: The Missing X-Factor, June 2002, http://www.softed.com/Resources/WhitePapers/SoftQual_XFactor.aspx. [114] Leonard, J. G. and Nordgren, R. K., An Analysis of Early Software Reliability Improvement Techniques, 1997. Master's thesis. [115] Rome Laboratory, "Methodology for Software Reliability Estimation and Assessment," vol. 1 and 2, 1992. Technical Report RLTR- 92-52. [116] Guvenc, M., "Writing Testable and Code-able Requirements," BorCon 2004 Proceedings, September 11-25, 2004. [117] Smidts, C. S. and Li, M., Validation of A Methodology for Assessing Software Quality, Nuclear Regulatory Commission, Office of Nuclear Regulatory Research, Washington DC : USNRC, 2004. NUREG/CR-6848. [118] Smidts, C. S. and Li, M., Preliminary Validation of a Methodology for Assessing Software Quality, University of Maryland, Washington, DC : U.S. Nuclear Regulatory Commission, 2004. NUREG/CR-6848. [119] Wohlin, C. and Runeson, P., "A Method Proposal for Early Software Reliability Estimations," Proceedings of 3rd International Symposium on Software Reliability Engineering, pp. 156-163, 1992. [120] "An Enhanced Model for Early Software Reliability Prediction Using Software Engineering Metrics," 2008 Second International Conference on Secure System Integration and Reliability Improvement (SSIRI'08), pp. 177-178, 2008. [121] Nagappan, N., "Towards a Metric Suite for Early Software Reliability Assessment," Proceedings of the 26th International Conference on Software Engineering, pp. 60 - 62, 2004. [122] Smith, C. and Uber, C., "Experience Report on Early Software Reliability Prediction and Estimation," Proceedings of the 10th International Symposium on Software Reliability Engineering, p. 282, 1999. [123] Shi, Y., Kong, W. and Smidts, C. S., "Data Collection and Analysis for the Reliability Prediction and Estimation of a Safety Critical System," Reliability Analysis of System Failure Data (RAF'07), March 1-2, 2007. Available at http://www.deeds.informatik.tu-darmstadt.de/RAF07/papers/wende_kong.pdf. [124] IEEE Computer Society, IEEE Recommended Practice for Software Requirements Specifications, 1993. ANSI/IEEE Standard 830-1993. [125] Ghose, S., Software Requirements Specifications for LOCAT-II, College Park, MD, USA : University of Maryland, 2004. [126] Requirements Specification for Personal Access Control System, Gaithersburg, MD, USA : Lockheed Martin Corporation Inc., 1998. [127] Gilb, T. and Graham, D., Software Inspection, New York : Addison-Wesley, 1994. [128] Fagan, M. 
E., "Design and code inspections to reduce errors in program development," IBM Systems Journal, vol. 15, no. 3, pp. 182-211, 1976. [129] Carver, J. C., Nagappan, N. and Page, A., "The Impact of Educational Background on the Effectiveness of Requirements Inspections: An Empirical Study," IEEE Transactions on Software Engineering, vol. 34, no. 6, pp. 800-812, Nov/Dec 2008.