ABSTRACT
Title of dissertation: GLOBAL NONLINEAR MODELING USING
AUTOMATED LOCAL MODEL NETWORKS
IN REAL TIME
Rose Weinstein
Doctor of Philosophy, 2021
Dissertation directed by: Professor James E. Hubbard Jr.
Department of Aerospace Engineering
Global nonlinear modeling is a challenging task that spans multiple disciplines.
When it is necessary to develop a model across the global input space, and a single
linear model is insufficient, nonlinear modeling methods are required. If the model
is constrained to be developed autonomously in real time, the modeling problem
is more difficult, and there are fewer available resources, tools, and techniques for
efficient and effective model development. This scenario specifically arises in the
context of the NASA Learn-to-Fly concept, which aims to develop tools for real-
time aerodynamic modeling and control for new or modified flight vehicles, and
which serves as the motivation for this research. This work aims to develop a
modeling method that enables the model to be developed automatically in real
time, with limited prior knowledge required, and that provides a model that is easily
interpretable, allows physical insight into the system, and offers good global and local
prediction capabilities. A novel method is developed and presented in this work for
automated real-time global nonlinear modeling using local model networks, known
as Smoothed Partitioning with LocalIzed Trees in Real time (SPLITR). The global
nonlinear system behavior is partitioned into several local regions known as cells,
with the dimension, location, and timing of each partition automatically selected
based on a new residual characterization procedure, under the constraints of real-
time operation. Regression trees represent the successive partitioning of the global
input space and describe the evolution of the cell structure. Recursive equation-
error least-squares parameter estimation in the time domain is used to estimate a
model that represents the local system behavior in each region so that the model
can be updated independently with data in the explanatory variable ranges of each
cell, even if the data are not contiguous in time. A weighted superposition of these
piecewise local models across the input space forms a global nonlinear model that
also accurately captures the local behavior. The SPLITR approach was tested and
validated using both simplified simulated test data, as well as experimental flight
test data, and the results were analyzed in terms of model predictive capabilities
and interpretability. The results show that SPLITR can be used to automatically
partition complex nonlinear behavior in real time, produce an accurate model, and
provide valuable physical insight into the local and global system behavior.
GLOBAL NONLINEAR MODELING USING
AUTOMATED LOCAL MODEL NETWORKS IN REAL TIME
by
Rose Weinstein
Dissertation submitted to the Faculty of the Graduate School of the
University of Maryland, College Park in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
2021
Advisory Committee:
Professor James E. Hubbard Jr., Chair/Advisor
Professor Amr Baz
Professor Robert M. Sanner
Professor Allen Winkelmann
Professor Huan Xu
? Copyright by
Rose Weinstein
2021
Preface
This material is based upon work supported by the National Science Founda-
tion Graduate Research Fellowship Program under Grant No. DGE 1840340, and
by the NASA Aeronautics Research Mission Directorate (ARMD) Transformational
Tools and Technologies (TTT) project.
ii
Dedication
To my parents, family, friends, and all the mentors and role models whose
dedication to me is what brought me here.
In memory of Eugene H. D. Heim,
an exemplary engineer, mentor, and colleague who inspired many with his
optimism, energy, determination, and success.
iii
Acknowledgments
?First, think. Second, dream. Third, believe. And finally, dare.? ? Walt Disney
I joined the University of Maryland?s Morpheus Lab in my sophomore year, and
emerge 8 years later a product of the lab of ?dreams,? academically, professionally,
and personally. Dr. Hubbard?s lab is a unique cultivating environment in which
we are given the utmost support, respect, and flexibility, and where we are trained
to pursue technical excellence and rigor. We are taught that success is not an
entitlement, and that hard work, determination, and of course, dreams, are key.
I find the Morpheus Lab experience to align well with Walt Disney?s quote.
First, we are equipped with the skills and technical expertise to think independently,
ambitiously, and rigorously. Second, we are encouraged and required to funnel our
thoughts into dreams that may seem ?hopelessly naive? to others, but that feel, and
are, within reach to us. Third, our foundation, collective support, and filtering of
negativity enable us to truly believe in the power of our dreams. And finally, we are
given the confidence, resoluteness, and determination to dare to pursue and fulfill
those dreams. Together, and only as a team, the Wolfpack endures all, and enables
and sets those dreams in motion.
Dr. Hubbard ? I cannot thank you enough for the critical role you played in
not only turning my dreams into reality, but enabling and equipping me to dream
in the first place. From the moment I woke you up during that first Skype meet-
ing in sophomore year, you have believed in me and supported me unconditionally.
iv
Your students never doubt that they are your constant priority, and that you care
deeply for our wellbeing, academically, professionally, and personally. You are al-
ways grounded in reality and in what is important. You encouraged and trained us
to focus on our goals and our pursuit of research, while ignoring the noise, the neg-
ativity, and the naysayers. You were an incredible role model, mentor, and advisor
throughout this journey. Dr. Hubbard ? ?you were right.?
I would not be where I am today without the support of Morpheus Lab Gen III
members Dr. Michael Cunningham, Dr. Jose Mondragon, Dr. Joaquim Dias, and
Nick Rymer. You all played a critical role in the development of my research, and of
myself as a researcher and an individual. Michael ? through our countless technical
discussions and meetings, both in person and virtual, you were always available to
talk through new ideas, you provided insightful feedback, and you helped me see the
clarity through the fuzziness. Jose ? I met you early in my undergraduate years,
you have been a constant and reliable friend throughout this entire journey, and you
were always available to offer needed support and guidance.
My graduate school experience was embedded at NASA Langley Research
Center in the Pathways Program, and I would like to thank the Flight Dynamics
Branch (FDB) for allowing me to pursue my graduate degrees while part of such a
great group of role models and colleagues. In particular, I worked closely with team
members in the FDB and the Dynamic Systems and Control Branch (DSCB) across
the Learn-to-Fly (L2F) and Modeling and Control for Agile Aircraft Development
(MCAAD) projects. I would like to acknowledge and thank my team members for
all of your support and technical discussions, and for the efforts conducting flight
v
test operations and collecting flight data used in this research.
I would also like to thank my NASA Pathways mentors whom I worked closely
with throughout my internship. Mark Croom ? you encouraged me to think outside
of the traditional box, and to always look toward the stars. Gene Heim ? your ?can
do? attitude, optimism, and energy were inspiring and infectious, and you applied
exemplary rigor to every problem we came across. Dr. Bruce Owens ? you brought
me over the finish line with your encouragement and support, despite my infringing
upon your exclusive FDB PhD rights.
Dr. Gene Morelli ? although I never had the privilege of taking your Air-
craft System Identification course, I learned so much from you directly during our
hours of technical discussions. Your door was always open to chat, and your exper-
tise and insight into the field of aircraft system identification are unrivaled. Your
commitment and dedication to research are truly admirable, and your wide-ranging
ideas and suggestions always gave me food for thought. Gene ? I very much ap-
preciate your time, patience, recommendations, consultation, guidance, and insight
throughout my dissertation research.
Dr. Jared Grauer ? you have been an integral part of my graduate school
experience, both at Morpheus Lab as well as at NASA. You have offered unparalleled
patience, time, expertise, support, guidance, mentorship, and friendship throughout
my research and beyond. You helped encourage me through the setbacks and joined
in celebrating the successes. You offered expertise and provided direction, while
encouraging individuality. Your accessibility and dedication went above and beyond,
and I truly appreciate your role and involvement throughout this work.
vi
Dr. Alison Flatau ? you started off as my undergraduate academic advisor,
and went on to offer me my first research experience in Aerosmart lab after my
freshman year. You truly believed in me from the very beginning, you sent me
to my first conference to experience where research comes together, and you were
always available to offer advice.
Dr. Zohaib Hasnain ? I began working with you after my freshman year
in Aerosmart Lab through your own PhD work, and that is where I got my first
research experience. You believed in me and in my potential before I had even taken
my introductory aerospace courses, and you provided me with the inspiration and
confidence to forge my own path forward. From the perspective that one domino
can trigger the whole journey, it all started working with you.
Last but not least, I would like to thank my family for their unconditional
support throughout this entire neverending pursuit of education. My father, Elliott
Weinstein, showed me a determination to take the difficult path, a deep respect
for math and science, and the importance of independent and resolute thinking.
My mother, Gitta Weinstein, provided the much-needed support, nurturing, and
reassurance to pursue the goals I set forth to accomplish, and made sure that I
balanced work with other important goals. My siblings, Yossi, Ari, and Dassy, my
sister-in-law, Deena, and my nieces and nephews offered levity and encouragement,
and helped me to balance academia with other priorities in life.
vii
Table of Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation: a new era of air vehicles . . . . . . . . . . . . . . . . . . 1
1.1.1 Traditional aircraft modeling and control development process 3
1.1.2 Learn-to-Fly aircraft development concept . . . . . . . . . . . 6
1.2 Background on aircraft system identification . . . . . . . . . . . . . . 10
1.2.1 Modeling considerations . . . . . . . . . . . . . . . . . . . . . 12
1.2.2 A review of system identification tools, techniques, and processes 17
1.3 A literature review of global aerodynamic modeling from flight data . 25
1.3.1 Developing an aerodynamic database of local linear models . . 26
1.3.2 Global polynomial aerodynamic modeling . . . . . . . . . . . 27
1.3.3 Other modeling approaches . . . . . . . . . . . . . . . . . . . 31
1.4 Summary and limitations of previous approaches . . . . . . . . . . . . 34
1.5 Research approach, goals, and contributions . . . . . . . . . . . . . . 37
1.6 Dissertation outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2. Local Model Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.1 Introduction to local model networks . . . . . . . . . . . . . . . . . . 42
2.2 LMN cell structure overview . . . . . . . . . . . . . . . . . . . . . . . 46
2.2.1 Split dimensionality: axes orthogonal vs. axes oblique . . . . . 49
2.2.2 Split location: bisecting vs. customized . . . . . . . . . . . . . 50
2.3 LMN leaf models overview . . . . . . . . . . . . . . . . . . . . . . . . 51
2.3.1 Model inputs: explanatory variables and partitioning variables 51
2.3.2 Model order: constant, linear, or higher order
polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.3.3 Model structure: pre-specified vs. automated
selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.3.4 Parameter estimation process: global learning vs.
local learning . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.4 Global LMN architecture overview . . . . . . . . . . . . . . . . . . . 59
2.4.1 Validity functions: Gaussian weights and a partition of unity . 59
2.4.2 Global LMN form: combination of local model
outputs vs. combination of local model parameters . . . . . . 61
2.5 A literature review of LMN construction algorithms . . . . . . . . . . 64
2.5.1 Batch LMN construction algorithms . . . . . . . . . . . . . . 66
2.5.2 Online LMN construction algorithms . . . . . . . . . . . . . . 69
2.5.3 Aerodynamic modeling LMN construction algorithms . . . . . 72
2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3. Smoothed Partitioning with LocalIzed Trees in Real time (SPLITR) . . . . 75
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2 Properties and aspects of leaf models . . . . . . . . . . . . . . . . . . 77
3.2.1 Selection of model explanatory and partitioning
variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.2.2 Specification of local model order . . . . . . . . . . . . . . . . 79
3.2.3 Determination of model structure . . . . . . . . . . . . . . . . 80
3.2.4 Parameter estimation process . . . . . . . . . . . . . . . . . . 80
3.3 Cell structure determination process . . . . . . . . . . . . . . . . . . 84
3.3.1 Cell structure problem statement . . . . . . . . . . . . . . . . 85
3.3.2 Preliminary guiding aspects and overview of cell
structure determination . . . . . . . . . . . . . . . . . . . . . 86
3.3.3 Data processing procedure . . . . . . . . . . . . . . . . . . . . 91
3.3.4 Cell splitting procedure . . . . . . . . . . . . . . . . . . . . . . 95
3.4 Global weighting and combination of local models . . . . . . . . . . . 103
3.5 SPLITR user specifications and inputs . . . . . . . . . . . . . . . . . 104
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4. SPLITR Case Studies and Sensitivity
Investigation with Simulated Test Data . . . . . . . . . . . . . . . . . . . . 117
4.1 1D piecewise linear data . . . . . . . . . . . . . . . . . . . . . . . . . 118
4.1.1 Baseline case . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.1.2 Minimum cell width . . . . . . . . . . . . . . . . . . . . . . . 127
4.1.3 Modeling data noise sensitivity . . . . . . . . . . . . . . . . . 129
4.2 1D quadratic data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.2.1 Baseline case . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.2.2 Filter window . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.2.3 Smoothness factor . . . . . . . . . . . . . . . . . . . . . . . . 139
4.2.4 Residual threshold factor . . . . . . . . . . . . . . . . . . . . . 142
ix
4.2.5 Varying noise magnitude . . . . . . . . . . . . . . . . . . . . . 146
4.2.6 Multiple simulations . . . . . . . . . . . . . . . . . . . . . . . 149
4.2.7 Model adaptation . . . . . . . . . . . . . . . . . . . . . . . . . 151
4.3 1D cubic data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
4.3.1 Baseline case . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
4.3.2 Varying noise magnitude . . . . . . . . . . . . . . . . . . . . . 157
4.4 2D test data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4.4.1 2D quadratic data, case 1 . . . . . . . . . . . . . . . . . . . . 159
4.4.2 2D quadratic data, case 2 . . . . . . . . . . . . . . . . . . . . 162
4.4.3 2D quadratic data, case 3 . . . . . . . . . . . . . . . . . . . . 166
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
5. SPLITR Applied to Aerodynamic
Modeling with Simulated and
Experimental Flight Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
5.1 Aerodynamic modeling problem setup . . . . . . . . . . . . . . . . . . 174
5.2 F-16 simulation data . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
5.3 Practical aspects of modeling flight data . . . . . . . . . . . . . . . . 179
5.4 E1 flight test data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
5.4.1 E1 test aircraft . . . . . . . . . . . . . . . . . . . . . . . . . . 182
5.4.2 Case study of Cm model . . . . . . . . . . . . . . . . . . . . . 184
5.4.3 Case study of CX model . . . . . . . . . . . . . . . . . . . . . 200
5.5 T-2 Generic Transport Model flight test data . . . . . . . . . . . . . . 208
5.5.1 T-2 test aircraft . . . . . . . . . . . . . . . . . . . . . . . . . . 209
5.5.2 Case study of CL model . . . . . . . . . . . . . . . . . . . . . 210
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
6. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
6.1 Dissertation summary . . . . . . . . . . . . . . . . . . . . . . . . . . 221
6.2 SPLITR summary, advantages, and limitations . . . . . . . . . . . . . 224
6.3 Contributions of this work . . . . . . . . . . . . . . . . . . . . . . . . 228
6.4 Future directions and applications for SPLITR . . . . . . . . . . . . . 231
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
x
List of Tables
3.1 SPLITR algorithm parameters. . . . . . . . . . . . . . . . . . . . . . 105
5.1 F-16 geometry and mass properties. . . . . . . . . . . . . . . . . . . . 177
5.2 E1 geometry and mass properties. . . . . . . . . . . . . . . . . . . . . 184
5.3 T-2 geometry and mass properties. . . . . . . . . . . . . . . . . . . . 210
List of Figures
1.1 Conventional aircraft modeling and control development process. . . . 4
1.2 Learn-to-Fly aircraft development process. . . . . . . . . . . . . . . . 6
1.3 Aircraft body-axis forces and moments on the E1 test aircraft. . . . . 11
1.4 System identification process. . . . . . . . . . . . . . . . . . . . . . . 17
1.5 Global polynomial modeling by selecting from a specified pool of can-
didate modeling terms. . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.6 Cell structure partitioned by Mach number and angle of attack. . . . 31
2.1 Local model network architecture with M cells. . . . . . . . . . . . . 44
2.2 Regression tree construction and visualization. . . . . . . . . . . . . . 48
2.3 LMN split dimensionality options. . . . . . . . . . . . . . . . . . . . . 49
2.4 LMN split location options. . . . . . . . . . . . . . . . . . . . . . . . 51
2.5 Global model output as a weighted combination of local model outputs. 62
2.6 Global model output as a weighted combination of parameters across
local models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.1 Overview of SPLITR modeling approach. . . . . . . . . . . . . . . . 76
3.2 F-16 simulation time history data with ? partitioning. . . . . . . . . 83
3.3 Residuals for F-16 models of CL = f(?). . . . . . . . . . . . . . . . . 89
3.4 SPLITR process overview. . . . . . . . . . . . . . . . . . . . . . . . . 90
3.5 Data processing and cell splitting procedures for SPLITR algorithm. . 90
3.6 Sample cell with 5 bins. . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.7 A bin fails if the mean of both types of residuals is pulled beyond a
specified number of standard deviations from the mean of acceptable
residuals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.8 The split location is chosen as the right or left boundary of the col-
lection of failed bins. . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.1 Time history of response variable for piecewise linear baseline case. . 120
4.2 Cell structure evolution for piecewise linear baseline case. . . . . . . 121
4.3 Local model fit and residuals for piecewise linear baseline case. . . . 121
4.4 Binned local residual characterization for piecewise linear baseline
case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.5 Residual threshold characteristics for piecewise linear baseline case. . 126
4.6 Time history of parameter estimates across all cells for piecewise lin-
ear baseline case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.7 Local model parameter estimates across all cells for piecewise linear
baseline case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.8 Local model fit for piecewise linear model with 25 simulation cases
and a minimum cell width of 0.05. . . . . . . . . . . . . . . . . . . . 128
4.9 Local model fit for piecewise linear model with 25 simulation cases
and a minimum cell width of 0.025. . . . . . . . . . . . . . . . . . . 129
4.10 Local and global model fits for piecewise linear data noise sensitivity
test case with SNR = 4. . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.11 Residual threshold characteristics for piecewise linear data noise sen-
sitivity test case with SNR = 4. . . . . . . . . . . . . . . . . . . . . . 131
4.12 Time history of response variable for quadratic baseline case. . . . . 133
4.13 Cell structure evolution for quadratic baseline case. . . . . . . . . . . 133
4.14 Local and global model fits for quadratic baseline case. . . . . . . . . 134
4.15 Validity functions for quadratic baseline case. . . . . . . . . . . . . . 134
4.16 Residual threshold characteristics for quadratic baseline case. . . . . . 135
4.17 Local model parameter estimates across all cells for quadratic baseline
case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.18 Time history of parameter estimates across all cells for quadratic
baseline case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.19 Local and global parameter estimates for quadratic baseline case. . . 137
4.20 Cell structure evolution for quadratic model with filter window vari-
ation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.21 Local and global model fits for quadratic model with filter window
variation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.22 Residual threshold characteristics for quadratic model with filter win-
dow variation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
4.23 Global smoothed model fit for quadratic test case with smoothness
factor ?s = 0.25. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
4.24 Validity functions for quadratic test case with smoothness factor ?s =
0.25. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
4.25 Global smoothed model fit for quadratic test case with smoothness
factor ?s = 1.5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
4.26 Validity functions for quadratic test case with smoothness factor ?s =
1.5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
4.27 Local and global model fits for quadratic model with residual thresh-
old factor variation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.28 Residual threshold characteristics for quadratic model with residual
threshold factor variation. . . . . . . . . . . . . . . . . . . . . . . . . 144
4.29 Binned local residual characterization for quadratic model with resid-
ual threshold factor variation. . . . . . . . . . . . . . . . . . . . . . . 145
4.30 Time history of response variable for quadratic model with varying
noise magnitude. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
4.31 Cell structure evolution for quadratic model with varying noise mag-
nitude. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
xiii
4.32 Local and global model fits for quadratic model with varying noise
magnitude. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
4.33 Residual threshold characteristics for quadratic model with varying
noise magnitude. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
4.34 Validity functions for quadratic model with varying noise magnitude. 149
4.35 Global model fits for multiple simulations with quadratic data. . . . 151
4.36 Local parameter estimates for multiple simulations with quadratic
data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
4.37 Local and global model fits for quadratic model with model adapta-
tion using the first set of data. . . . . . . . . . . . . . . . . . . . . . . 154
4.38 Local and global model fits for quadratic model with model adapta-
tion using the full set of data. . . . . . . . . . . . . . . . . . . . . . . 154
4.39 Cell structure evolution for quadratic model with model adaptation. 155
4.40 Time history of response variable for quadratic model with model
adaptation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
4.41 Local and global model fits for cubic baseline case. . . . . . . . . . . 157
4.42 High-pass filtered modeling data for cubic baseline case. . . . . . . . 157
4.43 Local and global model fits for cubic test case with varying noise
magnitude. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
4.44 High-pass filtered modeling data for cubic test case with varying noise
magnitude. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4.45 Time history of input data for 2D quadratic case 1. . . . . . . . . . . 161
4.46 Cell structure evolution for 2D quadratic case 1. . . . . . . . . . . . 161
4.47 Global model fit for 2D quadratic case 1. . . . . . . . . . . . . . . . 162
4.48 Validation test data and model fit for 2D quadratic case 1. . . . . . 162
4.49 Time history of input data for 2D quadratic case 2. . . . . . . . . . . 164
4.50 Test data for 2D quadratic case 2. . . . . . . . . . . . . . . . . . . . 164
4.51 Cell structure evolution for 2D quadratic case 2. . . . . . . . . . . . 165
4.52 Modeling data and SPLITR model fit for 2D quadratic case 2. . . . 165
4.53 Test data and model fit for 2D quadratic case 2. . . . . . . . . . . . 166
4.54 Time history of input data for 2D quadratic case 3. . . . . . . . . . . 167
4.55 Cell structure evolution for 2D quadratic case 3. . . . . . . . . . . . 168
4.56 Global model fit with validation data for 2D quadratic case 3. . . . . 169
4.57 Validation test data and model fit for 2D quadratic case 3. . . . . . 169
5.1 SPLITR model fits of CL = f(?) for F-16 data. . . . . . . . . . . . . 179
5.2 E1 test aircraft. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
5.3 Time histories of response variable and PV modeling data for E1 Cm
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
5.4 Cm vs. ? modeling data for E1 Cm model. . . . . . . . . . . . . . . . 186
5.5 Cell structure evolution for E1 Cm model. . . . . . . . . . . . . . . . 187
5.6 Binned local residual characterization for E1 Cm model. . . . . . . . 189
5.7 Residual threshold characteristics for E1 Cm model. . . . . . . . . . . 191
5.8 Local model parameter estimates across all cells for E1 Cm model. . . 192
5.9 Time history of Cm? estimates across all cells for E1 Cm model. . . . 193
xiv
5.10 Validity functions for E1 Cm model. . . . . . . . . . . . . . . . . . . . 194
5.11 Local and weighted global parameter estimates of Cm? for E1 Cm
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
5.12 RMS of binned global residuals for modeling data for E1 Cm model. . 196
5.13 Global residuals vs. PV for modeling data for E1 Cm model. . . . . . 197
5.14 Time histories of response variable and PV validation data for E1 Cm
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
5.15 RMS of binned global residuals for validation data for E1 Cm model. 198
5.16 Global residuals vs. PV for validation data for E1 Cm model. . . . . . 199
5.17 Histogram of PV modeling data with RMS of error for E1 Cm model. 200
5.18 Time histories of response variable and PV modeling data for E1 CX
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
5.19 CX vs. J modeling data for E1 CX model. . . . . . . . . . . . . . . . 202
5.20 Cell structure evolution for E1 CX model. . . . . . . . . . . . . . . . 203
5.21 Local model parameter estimates across all cells for E1 CX model. . . 204
5.22 Local and weighted global parameter estimates of CX for E1 CJ X
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
5.23 Validity functions for E1 CX model. . . . . . . . . . . . . . . . . . . . 205
5.24 RMS of binned global residuals for modeling data for E1 CX model. . 206
5.25 Global residuals vs. PV for modeling data for E1 CX model. . . . . . 206
5.26 Time histories of response variable and PV validation data for E1 CX
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
5.27 RMS of binned global residuals for validation data for E1 CX model. 208
5.28 Global residuals vs. PV for validation data for E1 CX model. . . . . . 208
5.29 T-2 GTM test aircraft. . . . . . . . . . . . . . . . . . . . . . . . . . . 209
5.30 Time histories of response variable and PV modeling data for T-2 CL
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
5.31 CL vs. ? modeling data for T-2 CL model. . . . . . . . . . . . . . . . 212
5.32 Cell structure evolution for T-2 CL model. . . . . . . . . . . . . . . . 212
5.33 Local model parameter estimates across all cells for T-2 CL model. . . 214
5.34 RMS of binned global residuals for modeling data for T-2 CL model. . 215
5.35 Global residuals vs. PV for modeling data for T-2 CL model. . . . . . 215
5.36 Time histories of response variable and PV validation data for T-2
CL model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
5.37 RMS of binned global residuals for validation data for T-2 CL model. 217
5.38 Global residuals vs. PV for validation data for T-2 CL model. . . . . 217
xv
Nomenclature
Roman Symbols
ax, ay, az body-axis translational acceleration
b wing span, ft
CX , CY , CZ body-axis nondimensional force coefficients
Cl, Cm, Cn body-axis nondimensional moment coefficients
CL, CD wind-axis nondimensional force coefficients
c? wing mean aerodynamic chord, ft
1D one-dimensional
2D two-dimensional
D dispersion matrix
d propeller diameter, ft
Ix, Iy, Iz, Ixz inertia tensor elements, slug-ft
2
J advance ratio
m mass, slug
N number of data points
p, q, r body-axis roll, pitch, and yaw rates, rad/s
q? dynamic pressure, lbf/ft2
R real number
R2 coefficient of determination
S wing reference area, ft2
T thrust, lbf
u model input variable
V true airspeed, ft/s
w validity function
x model explanatory variable
y model output
z response variable
Greek Symbols
? angle of attack, rad or deg
?e elevator deflection, rad
? parameter estimate
?? residual threshold factor
?? standard deviation factor
??,norm number of standard deviations to normalize
?ff forgetting factor
xvi
?s smoothness factor
? mean
? residual
? covariance matrix
? standard deviation
? partitioning variable
? propeller speed, rad/s
Superscripts
?1 matrix inverse
? mean
? estimate
? time derivative
T transpose
Acronyms
AAM Advanced Air Mobility
CART Classification And Regression Trees
CAS Convergent Aeronautics Solutions
CFD computational fluid dynamics
EV explanatory variable
EM expectation-maximization
L2F Learn-to-Fly
LMN local model network
LOLIMOT LOcal LInear MOdel Trees
MCAAD Modeling and Control for Agile Aircraft Development
MOF multivariate orthogonal functions
MSE mean squared error
PTI programmed test input
PV partitioning variable
RMS root mean square
RSS residual sum of squares
SISO single input single output
SNR signal to noise ratio
SPLITR Smoothed Partitioning with LocalIzed Trees in Real time
UAM Urban Air Mobility
xvii
Chapter 1
Introduction
1.1 Motivation: a new era of air vehicles
In the past, the design objectives for aircraft have largely been driven by mil-
itary, commercial transport, and business transport utilization, with relatively few
large companies rising up to dominate the market, and development programs that
included up to billion dollar budgets, extensive workforces, and prolonged durations.
More recently, advancements in autonomous technologies, electric propulsion, and
computing capabilities have broadened the applications and advantages of taking to
the skies to undertake tasks previously done on the ground [1].
A proliferation of Unmanned Aerial Systems (UAS) of varied shapes and sizes
has enabled multiple sectors to harness the versatility for specialized applications.
The military can utilize unmanned vehicles for intelligence, surveillance, and com-
bat, and commercial applications include crops monitoring, power lines inspection,
search and rescue, and package delivery. Perhaps the most poignant and far-reaching
potential lies in transportation. Calls for Mobility On Demand (MOD) leverage
these technologies to transform existing transportation systems and provide travel-
ers with flexible, customized, and efficient transit options [2]. Advanced Air Mobility
(AAM) envisions on-demand, accessible aviation services that can transport people
and cargo across rural, suburban, and urban environments, with Regional Air Mo-
bility (RAM) expediting inter-city transit and Urban Air Mobility (UAM) easing
congestion with solutions that specifically address the challenges of air vehicle op-
eration within densely populated urban environments [1].
The objectives and associated demands of AAM have blurred the traditional
dichotomy of air vehicle categorization as conventional fixed-wing or helicopter, as
unconventional concepts are emerging to address the challenges associated with wide
ranges of travel distance, capacity, environments, and capabilities. From small-scale
package delivery drones, to Personal Air Vehicles (PAVs), air taxis, and regional
transports, new configurations span multicopters, tiltrotors, vertical take off and
landing (VTOL), and lift + cruise, among others [3, 4].
This revolutionary new frontier of air vehicle utilization, and the urgency asso-
ciated with ambitious demands for imminent implementation, have created a com-
petitive market that is occupied by both established large aircraft companies as well
as emerging startups [1,4]. Electric VTOL (eVTOL) concepts have proliferated into
hundreds of designs competing to fill the void. The success of an individual design
is based to a large degree on minimizing budgets, reducing the necessary workforce,
and expediting the program development. This all must be done while maintain-
ing strict standards of vehicle safety, reliability, and certification, and leveraging
2
new technologies to provide autonomy, intelligent systems, and advanced onboard
capabilities.
With reduced budgets and manpower, expedited timelines, and novel vehicle
configurations that have complex dynamics and aerodynamics, there is urgent de-
mand for a new collection of tools to efficiently, safely, and expeditiously develop
these air vehicles. The next section will describe the traditional aircraft development
process and the reasons why it may not meet these requirements, as well as an al-
ternative approach offered by the NASA Learn-to-Fly concept. This new approach
also offers valuable improvements to developing even more conventional vehicles.
The rest of the chapter will cover background material and a literature review of
aircraft system identification, which is at the core of aircraft development.
1.1.1 Traditional aircraft modeling and control development
process
For a given aircraft design, the conventional paradigm for the development
of aircraft flight systems is a lengthy, iterative process that involves estimating
an aerodynamic model, building a simulation model, and designing a control law
and guidance algorithm, where each step involves extensive testing and analysis, as
outlined in Fig. 1.1 [5, 6]. A core element of the aircraft flight systems that the
other components rely on is the aerodynamic model, which can provide physical
insight into the aircraft stability and control characteristics, inform a simulation
model to predict the aircraft response, and impact model-based control law designs
3
and guidance systems. Given the central role of the aerodynamic model, the model
fidelity is an important factor to ensuring the safety and reliability of the aircraft
operations.
Figure 1.1: Conventional aircraft modeling and control development process (adapted
from Ref. [7]).
The traditional approach to estimating an aerodynamic model may involve
both numerical and experimental test techniques. Vortex lattice methods (VLM)
based on first principles are easily accessible in software packages such as MIT?s
Athena Vortex Lattice (AVL) and the United States Air Force?s Digital DATCOM,
and can provide preliminary stability and control estimates rapidly [8]. Advance-
ments in numerical computing have allowed computational fluid dynamics (CFD)
to become a valuable and flexible tool to simulate flow conditions. However, fidelity
limitations exist from grid geometry and flow condition approximations, test cases
take a long time to run, and these methods have difficulty capturing complex flow
fields, particularly at low Reynolds numbers where UAM vehicles predominantly
operate [9]. Both static and dynamic captive wind tunnel tests are often used to de-
velop an aerodynamic database using a scaled aircraft model. However, test entries
4
can be expensive and time consuming, and results can be subject to errors from
dynamic scaling, wall and sting interferences, and incorrect Reynolds number and
Mach number matching [10]. Each of these test techniques is an effort to emulate
the aerodynamics that the full-scale aircraft would experience in flight, so flight
testing provides the most authentic data for aircraft modeling. Nevertheless, flight
test campaigns can be costly, lengthy, and risky to execute, and data quality can
be degraded by practical aspects such as measurement noise, insufficient dynamic
excitation, and unmodeled disturbances.
All of these modeling methods traditionally require extensive time, cost, plan-
ning, analysis, and iteration, and while improvements continue to be made in soft-
ware and hardware capabilities, there remain to be discrepancies between numerical,
wind tunnel, and flight test results. Furthermore, this entire process may need to
be repeated for a vehicle configuration change. Often when a new vehicle is being
developed, a similar past vehicle is found and used as a starting point to provide a
baseline aerodynamic model that can offer initial physical insight. However, in the
fast-moving innovative air vehicle environment inspired by UAM, the hybrid design
space that these vehicles fill is unprecedented, so the vehicles lack much similarity
to one another, and particularly to past well-established vehicles. For these reasons,
and the others discussed in this section, the traditional aircraft modeling and con-
trol development process may not be sufficient to meet the demands of UAM vehicle
development, and this realization additionally illuminates the inherent inefficiencies
involved with using this process even for more conventional vehicles.
5
1.1.2 Learn-to-Fly aircraft development concept
The NASA Learn-to-Fly (L2F) concept is an advanced technology develop-
ment initiative that aims to improve the traditional aircraft modeling and control
development process with a new paradigm that offers efficient, automated, and intel-
ligent methods that minimize the time, effort, and analysis involved. The approach
is to replace most of the ground-based testing and analysis with automated, onboard
tools that provide in-flight real-time aerodynamic modeling, learning control, and
guidance, with little reliance on prior knowledge of the particular aircraft aerody-
namics, as depicted in Fig. 1.2. The motivating analogy is that just as a young
bird can jump out of its nest and ?learn to fly? on its own by ?figuring it out,?
an aircraft can be equipped with a robust set of tools that will provide it with the
self-awareness and learning capabilities to learn how to fly and to guide itself safely.
Figure 1.2: Learn-to-Fly aircraft development process (adapted from Ref. [7]).
L2F aims to develop a toolbox of advanced technologies to facilitate rapid de-
velopment of new or modified vehicles by improving the efficiency of flight testing
with automated onboard capabilities to enable safe autonomous envelope expan-
6
sion and learning control. Since traditional understanding of conventional aircraft
aerodynamic properties cannot be fully relied upon for these unprecedented configu-
rations, the tools must be adaptive, customizable, and informative as these systems
are explored. The objective is to allow each vehicle to ?be all it can be,? the extent
of which is dependent on the stability and control characteristics, and control power
and bandwidth.
This enabling technology can actually be used across the full lifecycle of the
aircraft. So far, the focus has been on expediting the aircraft modeling and control
development process where the goal is to learn about the aircraft, to build the nom-
inal aerodynamic model, and to enable safe envelope expansion. These tools can
also be useful in the future for certification purposes by expediting and streamlining
the required flight testing component with efficient test techniques, and also supply-
ing rich, informative flight data for analysis. Finally, during flight operations L2F
algorithms can be running in the background to provide real-time self-awareness for
autonomous systems, with applications in health monitoring, fault detection, and
self-healing capabilities that can provide a valuable layer of safety and reliability.
L2F efforts have spanned numerous projects based at NASA Langley Research
Center (LaRC), with a strong emphasis on experimental flight testing. Early work
began with developing tools for global nonlinear aerodynamic modeling, with later
work building on these capabilities with model-based control and guidance meth-
ods [5, 6]. Several recent projects will be summarized below, but are not intended
as a comprehensive overview.
Under a NASA Aeronautics Research Mission Directorate (ARMD) Seedling
7
project from 2013?2014, several novel piloted excitation inputs were tested on the
MB-326M Impala at Mojave, CA, and a global nonlinear aerodynamic model was
updated onboard using batch data from several maneuvers at varying flight condi-
tions [11, 12].
In 2015, the Bat-4 was flown at NASA Wallops Flight Facility using the NASA
Langley AirSTAR flight testing capability with a Mobile Operations System (MOS)
ground station. With the aircraft remotely piloted from the MOS, this research
demonstrated automated multisine excitation inputs injected onto the control sur-
faces, along with real-time nonlinear aerodynamic model development that was com-
puted offboard using ground-based systems and real-time telemetry [7].
The Convergent Aeronautics Solutions (CAS) project from 2016?2017 took
an aggressive approach to evaluate the feasibility of the broader L2F vision: to
merge real-time aerodynamic modeling, learning control, and automated guidance
using standard onboard computing hardware [5]. Flight tests using a MiG-27 Foam
Target Drone at the NASA City Environment Range Testing for Autonomous Inte-
grated Navigation (CERTAIN) range, and using a novel joined-wing aircraft known
as Woodstock at Fort AP Hill, were used to simulate a bird jumping out of its nest by
releasing these unpowered gliders from a tethered balloon [5, 13]. In particular, the
Woodstock vehicle was purposely designed to be unconventional with twelve control
surfaces and an all-moving empennage for pitch and directional control, to test the
L2F algorithms to their limits. A more conventional test aircraft known as E1, and
pictured in Fig. 1.3, provided the flexibility of allowing a ground-based pilot to take
off and land, with additional flight modes that enabled automated control inputs
8
injected during piloted envelope expansion, and fully autonomous flight to test the
control law and guidance algorithms [5, 13]. Results of these flight tests showed a
successful onboard synthesis of automated excitation inputs, real-time global nonlin-
ear aerodynamic modeling, model-based control using adaptive nonlinear dynamic
inversion as well as a classic autopilot control law, and an automated guidance sys-
tem that specified desired flight conditions and enabled envelope expansion under
the constraints of the testing range [14-16].
L2F tools are continuing to be developed under Modeling and Control of Ag-
ile Aircraft Development (MCAAD), part of the NASA Transformative Aeronau-
tics Concepts Program?s (TACP) Transformational Tools and Technologies (TTT)
project. MCAAD research continues to advance the L2F onboard modeling and con-
trol capabilities with improvements such as in-flight moment of inertia estimates,
autotuning control laws, and additional modeling approaches that are frequently
flight tested on the E1 platform [6, 17-20]. Ongoing and future L2F work looks at
applying these tools to other novel UAM-inspired vehicles.
As discussed throughout this section, at the core of the aircraft dynamics and
control development process lies the reliance on an aircraft model, and this research
focuses on building an aerodynamic model from flight data. The aerodynamic model
provides essential physical insight into the vehicle?s response characteristics, and
can inform safe operation within the flight envelope. Additionally, the aerodynamic
model is central to the other aircraft development steps, including simulation models,
control laws, and guidance systems.
9
1.2 Background on aircraft system identification
The goal of aircraft system identification is to develop a mathematical descrip-
tion of the aircraft that predicts the aerodynamic responses due to control inputs
and flight variables. If the vehicle can be approximated as a rigid body with con-
stant mass properties, aircraft flight dynamics are derived from Newton?s second
law expressed in the body frame as [21]
F = FA + FT + FG = mV? + ? ?mV (1.1)
M = MA +MT = I?? + ? ? I? (1.2)
where F is the total applied force, FA,FT , and FG are the applied aerodynamic,
thrust, and gravity forces, respectively, m is the vehicle mass, V is the translational
velocity vector, ? is the rotational rate vector, I is the inertia matrix, M is the
total applied moment, and MA and MT are the applied aerodynamic and thrust
moments, respectively. With a propulsion model typically obtained from ground
tests, the identification challenge lies in estimating a model for the six aerodynamic
forces and moments in FA and MA, shown in Fig. 1.3, as a function of the measured
states and controls. The forces and moments are often nondimensionalized to remove
the dependency on mass, geometry, and dynamic pressure, and expressed as
10
?? ? ? ? ? ????
X??? ???C? ? X???? ????
L???? ?
?
bC
??? l ?
?
FA = ???Y ??? = q?S ???CY ??? , MA = ???M??? = q?S ???c?Cm?
?
???? (1.3)
Z CZ N bCn
where q? is dynamic pressure, S is the wing reference area, b is the wingspan, and c?
is the mean aerodynamic chord. By assuming small perturbations, the aerodynamic
force and moment coefficients can also be linearized about a reference flight condition
and expressed as a Taylor series expansion, and the parameters to be estimated are
the stability and control derivatives. For example, an approximation of Cm can be
expressed as
qc?
Cm = Cm0 + Cm??? + Cmq + Cm ??e (1.4)2V ?e0
Figure 1.3: Aircraft body-axis forces and moments on the E1 test aircraft (credit: NASA
Langley Research Center).
11
1.2.1 Modeling considerations
The remainder of Section 1.2 will discuss a variety of considerations and ap-
proaches for estimating an aerodynamic model from flight data. It is worth men-
tioning the well-known expression by statistician George E. P. Box that, ?all models
are wrong, but some are useful.? A perfect model does not exist, and the goal is not
to create one, but rather to find a sufficient representation that serves a specified
purpose based on a set of reasonable assumptions. Prediction error can exist due to
a variety of reasons, including noisy data, model structure errors, and parameter un-
certainties. Addressing these points in advance will clarify the appropriate modeling
approach from a wide variety of options. Any available a priori knowledge about
the system to be identified should be used to inform the identification process, to
ensure valid assumptions are made, and to make sure only the unknown components
are estimated.
The purpose of the model may be to understand the open- or closed-loop dy-
namic and stability characteristics, evaluate aircraft performance or handing quali-
ties, inform an aircraft simulation, design a control system, or any combination of
these options. Common assumptions include considering the aircraft as a rigid body,
and decoupling the longitudinal and lateral-directional dynamics. However, these
simplifications need to be evaluated if, for example, the aircraft displays aeroelastic
properties or exhibits strongly coupled motion.
Some of the common modeling considerations that can lead to simplifying the
problem and guiding the modeling process are summarized as follows [22]:
12
White box vs. black box vs. gray box: White, black, and gray box models are char-
acterized by the amount of prior knowledge that is used in the modeling process. A
white box model can be derived entirely from first principles. A black box model
assumes no prior knowledge and is based entirely on experimental data. Gray box
models, which cover most practical identification problems, use any available phys-
ical insight to inform a relevant model structure, while the remaining unknown
parameters are estimated through experimental data. ?Black box? is also used to
refer to a model that matches the input-output properties of the system, but which
lacks interpretability and physical meaning, both in the structure as well as the
estimated parameters.
Linear vs. nonlinear: If a linear approximation of the aircraft dynamics is sufficient
for a given application, then the modeling problem is drastically simplified and the
extensive body of linear systems theory can be used for identification and analysis.
However, for more complex nonlinear dynamics involving full-envelope modeling or
large amplitude maneuvers, a nonlinear model must be used.
Local vs. global: A local model can represent the system at a specified operating
point, while a global model may be valid over the entire operational envelope. Often,
local models tend to be linear while global models must take into account multiple
operating points, as well as nonlinear regimes.
13
Parametric vs. nonparametric: Parametric models such as transfer functions, state
space models, and polynomials are often used to characterize the aircraft dynam-
ics, but a priori assumptions are usually required to pose a physically meaningful
model structure and/or initial parameter estimates. Nonparametric models such as
impulse responses can also provide insight into linear system dynamics.
Online vs. offline: If the model can be developed offline after all the data have
been obtained, then the data can be processed in batch, and iterative, nonlinear
optimization methods can be used. If online modeling is required, then practical es-
timation methods are those that are compatible with recursive updates and require
reasonable computational power that can be provided by onboard flight computers.
Closed loop vs. open loop: Closed-loop identification is used to evaluate the dynamics
of a system with feedback for applications such as handling qualities, certification,
or control law evaluation, while open-loop identification can be used to estimate the
natural dynamics of the system for physical insight and control law design.
Stable vs. unstable: Time-domain parameter estimation methods that include inte-
gration may have numerical divergence issues, so system stability can impact the
system identification process. Additionally, unstable systems will require a control
law to fly the aircraft and to gather flight data.
Stochastic vs. deterministic: If the system is deterministic, then the state estimation
14
process is not needed, and a wider range of methods can be used. For stochastic
systems, statistical methods such as Bayesian approaches or Kalman filters must be
used to account for random variables.
Rigid-body vs. aeroelastic: If the aircraft can be sufficiently approximated as a rigid
body, the equations of motion can be expressed as in Eqs. (1.1 ?1.2) and the system
identification problem is simplified. If the aeroelastic properties are significant and
coupled with the rigid-body dynamics, then the model form must be adapted to
include the structural dynamics, and the aircraft must be outfitted with additional
sensors to measure the relevant structural responses.
Steady vs. unsteady: If the aerodynamics are approximated as steady or quasi-
steady, then a model structure that relies on instantaneous states or state estimates
can be used. If unsteady aerodynamics are evident, then time-dependent terms such
as lag states can be included.
Throughout the decision making used to inform a particular modeling ap-
proach, considerations should be made to ensure certain model properties that ful-
fill the purpose of the model. These characteristics are not binary standards, but
rather guiding criteria to evaluate model quality. Ideally, the model should display
all of these properties, but practically it usually exhibits a compromise among them.
Common desirable model properties include [21-23]:
15
Consistency: If an increase in the amount of modeling data will result in the model
approximation approaching a representation of the true physical system, the model
is consistent.
Parsimony: The law of parsimony is a guiding principle to selecting a model that
provides the prescribed level of prediction with the simplest model structure and
the fewest possible parameters.
Interpretability: An interpretable model is one that is not viewed as a black box;
rather, it is transparent such that the model structure and parameters provide phys-
ical insight into the system.
Predictive capability: A model that has good prediction capabilities captures the
deterministic modeling information but does not fit the noise content, and can be
used to predict data that were not used in the modeling process.
Robustness: A robust model is generalizable and provides results with good predic-
tion despite noisy data.
Modularity: If the model is effectively compartmentalized such that poor data qual-
ity or high uncertainty in one part of the modeling domain do not negatively impact
the model integrity in other regions, the model can be considered modular.
16
Model accuracy: Quantification of the uncertainty characteristics of the model out-
put and estimated parameters is an indication of the reliability and accuracy of the
model, and is a critical part of the modeling process.
1.2.2 A review of system identification tools, techniques,
and processes
The system identification process, shown in Fig. 1.4 and discussed at length
in Refs. [21, 24, 25], involves a complex series of steps including posing a desired
model form, designing an experiment to obtain informative data, developing a model
structure and estimating the model parameters, and validating the final model. Each
of these steps will be discussed throughout this section within the context of aircraft
system identification.
Figure 1.4: System identification process (adapted from Ref. [21]).
17
Model Postulation: Model postulation entails choosing a class of appropriate model
forms through which to characterize the system. A priori knowledge about the
system, underlying simplifications and assumptions about the dynamics, and spec-
ification of the purpose of the model influence the chosen form, which will impact
the entire experiment and the remaining identification steps.
Often, the goal of aircraft system identification is to express the six aerody-
namic force and moment coefficients as multivariate Taylor series expansions that
characterize the functional dependencies of the forces and moments on states and
controls. The truncated Taylor series expansion has inherent limitations, but is of-
ten a reasonable approximation. The estimated parameters are then stability and
control derivatives which offer physical insight into the aerodynamic characteristics.
The aerodynamic model can also then be incorporated into the equations of motion
derived from first principles to further evaluate and simulate the dynamic response.
The model can also take the form of a transfer function to characterize the
linear dynamic response of the aircraft to specified inputs expressed through poles,
zeros, and gains. This form is most convenient for single input single output (SISO)
cases, but can also be expanded to characterize pairwise inputs and outputs in a
multiple input multiple output (MIMO) case. Alternatively, models can be postu-
lated in state space form to directly express the equations of motion. In each of these
approaches, the estimated parameters representing stability and control derivatives
can be extracted from the model structure.
Typically, a certain amount of a priori knowledge of the system is a pre-
requisite for posing a representative model structure for identification. When this
18
information is not available or a meaningful model structure is unnecessary, non-
parametric models such as impulse responses or frequency responses can be derived
from test data and used to provide direct physical insight into the nature of the
aircraft dynamic response to inputs, as well as provide valuable information to in-
form a parametric model structure. Alternatively, black box models such as neural
networks focus on characterizing the input-output properties and allow the model
structure to be developed automatically using training methods. However, these
model forms do not offer physical insight into the system.
Experiment Design: The quality of the estimated model is predicated on the data
used in the system identification, so experiment design is an essential step toward
ensuring informative data for the modeling process. This step includes outfitting
the aircraft with sufficient instrumentation, designing the excitation inputs, and
specifying the flight conditions and maneuvers through which to gather data. The
experiment is also strongly influenced by the postulated model and underlying as-
sumptions, such as maintaining a linear response, assuring SISO assumptions, or
decoupling the longitudinal and lateral-directional dynamics. If the aircraft is un-
stable or difficult to control in an open-loop setup, then an inner-loop flight control
system may be needed to obtain desirable flight characteristics during flight testing.
If the aircraft is piloted (onboard or remotely), the pilot must be trained to fly the
aircraft through the flight envelope and excite the dynamics. Sometimes the per-
turbation inputs and/or the guidance are implemented autonomously through an
onboard flight computer.
19
The goal of input design is to excite the aircraft dynamics at a particular
flight condition in order to obtain informative flight data that capture the dynamic
response of the aircraft. Single-input excitation forms can include impulses, sine
waves, square waves, two-sided pulses such as doublets and multisteps, frequency
sweeps, and multisines. Each input design is informed by two main considerations:
sufficient amplitude to ensure data with good signal to noise ratios (SNRs), and
sufficient frequency content to excite the aircraft dynamics at the modal frequencies
to be modeled. The amplitude must be chosen to ensure sufficient SNR, subject to
the constraints of the approximation, such as using a linear model and ensuring small
perturbations about a flight condition. When the bandwidth of modal frequencies
to be modeled is known, then the inputs can be tailored to excite a particular set
of frequencies. Otherwise, wideband inputs must be used to excite a wide range of
frequencies that are expected to include the desired modal frequencies.
When there are multiple simultaneous input channels, the excitation must be
applied in a way that minimizes signal correlation so that it is clear which inputs
influence which parts of the response variables. This can be done by applying the
inputs sequentially in time, or by using orthogonal inputs on each control surface,
such as orthogonal square waves or multisines [21].
Past L2F work has shown that Programmed Test Inputs (PTIs) that apply
automated orthogonal optimized multisine perturbation inputs on multiple control
surfaces simultaneously produce informative multi-axis data in an efficient man-
ner [7,13,26-29]. The goal is to excite the aircraft dynamics using wideband inputs
that consist of a sum of sinusoids at specified frequencies, phases, and amplitudes.
20
The PTIs are designed within a bandwidth that includes the expected range of air-
craft modal frequencies, and they are phase-optimized for minimum relative peak
factor to provide perturbations that do not cause the aircraft to deviate far from
the nominal flight condition. The PTIs have been successfully demonstrated on L2F
flight tests throughout the CAS and MCAAD projects using a flight computer to
inject the excitation inputs on the control surfaces by summing these automated
inputs with the pilot or control system commands [7, 14, 18].
Data Compatibility Analysis: Prior to identification, measured sensor data obtained
from the flight test experiment must be processed to ensure the data are an accurate
representation of the signals. Data compatibility analysis verifies that the data are
kinematically consistent by comparing the measured responses with state estimates
reconstructed using the known rigid-body kinematics. The signals are corrected by
removing systematic errors such as time skews and sensor biases, correcting center
of gravity offsets, removing dropouts or spikes, and applying filtering or smoothing
techniques to improve the SNR [21,30].
Model Structure Determination: Model structure determination entails finding a
specific structure from a class of postulated models to describe a particular aircraft
using the test data. Model parsimony is an important consideration in this step
as motivation to choose the simplest model structure with the fewest estimated
parameters that offers a sufficient level of prediction accuracy. Regardless of the
postulated model form, model structure determination is an important step to cus-
21
tomize a model form for a particular data set to improve the accuracy of individual
parameter estimates and minimize correlation between chosen explanatory variables.
Parameter Estimation: For a given model structure, parameter estimation entails
using the measured input-output data to solve for the unknown model parameters
(and associated uncertainties) by minimizing a particular cost function. In aircraft
system identification, the three most common parameter estimation methods are
equation error, output error, and frequency response error [21, 24,31,32].
Equation-error parameter estimation uses linear regression techniques based
on ordinary least squares to minimize the sum of squared differences between the re-
sponse variable data and the model output for a model structure that is linear in the
parameters [21]. Some advantages of the equation-error method are the simplicity
of formulation and straightforward non-iterative solutions, the absence of temporal
requirements in the regressors that allows multiple maneuvers to be concatenated,
the expansion to stepwise regression methods to assist in model structure deter-
mination, and the flexibility of allowing nonlinear modeling terms. Additionally,
this method can be formulated recursively, which can enable real-time updates to a
model as new information is received. A disadvantage of equation-error parameter
estimation is the inherent assumption of deterministic regressors; if there is mea-
surement error in the regressors, the least squares estimates are biased, inconsistent,
and inefficient. Nevertheless, high quality sensors and data preprocessing can reduce
the measurement errors to provide reliable, useful results.
The output-error parameter estimation method often formulates the model
22
as the aircraft equations of motion in state-space form, and solves for unknown
parameters by minimizing the error between the measured or calculated data and
the output equation [21, 33]. The output-error method requires the states to be
calculated through integration and uses nonlinear optimization such as the Gauss-
Newton method to estimate the parameters that appear in the equations of motion.
Advantages of the output-error method are flexibility in the model structure, and
that the maximum likelihood parameter estimates are unbiased, consistent, and
efficient when there is no deterministic modeling error. Some disadvantages of the
output-error method are that initial parameter estimates are required, and iterative
nonlinear optimization solution methods are used, which can encounter convergence
problems and can be difficult to implement for real-time modeling.
Both the equation-error and output-error parameter estimation methods can
be applied for test data expressed in the time domain or the frequency domain.
Time-domain estimation is straightforward, as the goal is often to match time his-
tory data, and the bias term associated with the flight condition can be extracted.
For frequency-domain estimation, the data first need to be transformed into the
frequency domain using the Fourier transform, but this approach offers several ad-
vantages, such as fewer data points required in the estimation and robustness to
noise.
Another SISO approach to parameter estimation involving the frequency do-
main is to pose a cost function that minimizes the frequency response error between
the test data and the model. This usually constitutes a two-step approach: first a
nonparametric frequency response is computed, which can provide insight in and of
23
itself into the system dynamics, as well as inform the postulated model structure.
Then a nonlinear optimization is used to estimate the unknown model parameters
to match the frequency responses. This approach removes the effects of uncorrelated
process and measurement noise and is especially useful for unstable systems, but it
is restricted to linear models [30,34].
Model Validation: Model validation is the final step through which the estimated
model is evaluated to determine if it offers sufficient prediction, if the parameter
estimates are physically reasonable, and if it possesses the other desirable model
properties discussed at the end of Section 1.2.1. Validation criteria can include
comparing estimated parameters with other sources of information such as first
principles, CFD, or wind tunnel data. To ensure the model is generalizable and not
overfit to the test data, model predictive capability is evaluated using representative
validation flight data that was not used in the modeling process [21,30].
Given the variations of approaches for each step, the aircraft system identi-
fication process described in this section is generally applicable and widely used.
Popular software packages that employ these methods include SIDPAC developed
by Morelli at NASA Langley [21], CIFER by Tischler from the US Army Research
Laboratory [30], FITLAB by Seher-Weiss at DLR [35], and FVSysID by Jategaonkar
at DLR [36].
24
1.3 A literature review of global aerodynamic modeling from
flight data
For many applications, aircraft flight dynamics can be approximated as linear
at a specified local flight condition. If this assumption is valid, the model postula-
tion and model structure determination steps are simplified, and there is extensive
literature in linear systems theory that can be used to understand the model prop-
erties and to guide linear control law design. If aerodynamics were globally linear,
the estimated coefficients, e.g. Cm? in Eq. (1.4), would be constant across the entire
flight envelope, and only a small sample of data would be needed to estimate the
globally valid parameters. However, if a full-envelope model is required, even tra-
ditional aircraft aerodynamics, and certainly novel UAM vehicle aerodynamics, are
known to be globally nonlinear. Therefore, the modeling process must account for
these complexities to produce a model that is globally valid across the entire flight
envelope.
This section will describe several practical approaches to nonlinear full-envelope
aerodynamic modeling. Most of the methods discussed in Section 1.2.2 on the air-
craft system identification process are valid across many of these approaches, but
two particular aspects will be addressed here that set the approaches apart from
one another: flight envelope coverage and global model architecture. While the dis-
cussion on experiment design in Section 1.2.2 focused on designing excitation inputs
to generate flight data with sufficient information content, another part of this step
25
entails designing maneuvers to collect data across the full flight envelope to be mod-
eled. The model structure determination step will also be further addressed here
to focus on global model construction as either a combination of numerous local
(linear) models, or as a single global nonlinear model.
1.3.1 Developing an aerodynamic database of local linear
models
The traditional approach to full-envelope aerodynamic modeling based on
flight data alone involves developing a local linear model at numerous flight condi-
tions and test points, and stitching these models together in an extensive lookup
table, or ?aero database.?
For this approach, the envelope expansion component of experiment design
consists of designing an extensive test matrix that is parametrized by selected vari-
ables upon which the aerodynamics are nonlinearly dependent. These scheduling
variables can include terms based on the flight condition such as angle of attack,
sideslip, power setting, airspeed or Mach number, or those associated with the air-
craft configuration such as effects related to flap setting, weight, and center of gravity
position. At each discrete test point, the aircraft is trimmed and dynamic excita-
tion inputs such as doublets, 3-2-1-1 pulses, or frequency sweeps are applied as small
amplitude perturbations to maintain a linear approximation. Often, the excitations
are applied as single inputs sequentially at each point to excite separate modes in-
dividually, requiring multiple maneuvers at each test point. Additional maneuvers
26
also need to be performed to acquire separate validation data. Such flight test cam-
paigns typically extend over a period of several months consisting of a large number
of sorties, hundreds of conditions, and thousands of maneuvers performed [37-39].
Furthermore, if the aircraft configuration is modified, the flight tests may need to
be repeated.
Following the test flights, the techniques discussed in Section 1.2.2 may be
used to estimate a linear aerodynamic model at each test point, referred to as point
identification. These models are used in tabular lookup tables to fill a comprehen-
sive aerodynamic database that is parametrized by the various scheduling variables.
Sometimes multi-point identification is performed to combine several point models
over a range of flight conditions, such as angle of attack, to develop a single non-
linear model that is representative of a broader range of data [38]. This database,
similar to one built from wind tunnel data, can then serve as a tool for continuous
aircraft simulation studies, where the aerodynamic model and trim condition at any
particular point or configuration are obtained by multidimensional interpolation and
methods such as model stitching [40-43].
1.3.2 Global polynomial aerodynamic modeling
A linear model approximation about a trim condition may be unsuitable for
large amplitude maneuvers or highly nonlinear flight regions such as stall, and a non-
linear model structure is required. This section discusses approaches to developing
an analytical global nonlinear model that is valid across the full flight envelope. For
27
global aerodynamic modeling, maneuvers are designed to provide wide data cover-
age of the explanatory variables across multiple flight conditions, and are no longer
restricted to small amplitudes; as a result, the entire flight envelope can be covered
more quickly over fewer flights. Past work has used long-duration high-amplitude
maneuvers with excitation applied by the pilot [44-46].
L2F research has developed specialized flight test maneuvers to efficiently
gather flight test data for global modeling. These methods are often applied in
conjunction with the automated orthogonal optimized multisine excitation inputs
discussed in Section 1.2.2. Those inputs are overlaid on the maneuvers to provide
multi-axis excitation and flight data with high information content and low cor-
relation across the explanatory variables. During the Bat-4 flight tests, the L2F
maneuvers were inspired by a bird being dropped out of its nest, by beginning at
a nose-high flight condition and slowly varying the nominal pitch flight condition
down to steady-level flight, and then back up [7]. For the E1 tests, the ground-
based pilot performed slow approaches to stall and figure eights to obtain global
data across a wide range of flight variables [18]. For the manned Impala flights, the
pilot performed multi-axis uncorrelated ?fuzzy inputs? overlaid on slowly varying
flight conditions, as well as Spiral Powered All-axes fuZzy (SPAZ) maneuvers to ex-
pand the variations in flight data to include Mach number, load factor, and dynamic
pressure [12].
Past work has shown that global nonlinear aerodynamic models can be ex-
pressed as polynomial expansions with higher order and multivariate terms by ex-
tending the Taylor series expansion. The challenge then lies in establishing a pool of
28
candidate linear and nonlinear model terms known as regressors, such as in Fig. 1.5,
and then selecting those that retain the strongest modeling capability. Past offline
work has used engineering judgment to specify candidate regressors up to a speci-
fied order, and stepwise regression or multivariate orthogonal functions (MOF) to
choose the relevant modeling terms [29,47,48]. Notably, the orthogonalization pro-
cedure has also been used to develop the generic global aerodynamic (GGA) model
structure that is applicable across a wide range of aircraft [49]. Additional work has
investigated the inclusion of spline functions in the regressor pool to improve the
identification of localized aerodynamics [48, 50], while alternative approaches have
considered multivariate simplex splines [44, 51]. L2F work has extended the multi-
variate orthogonal function modeling approach to a recursive formulation that can
be used to efficiently model the nonlinear aerodynamics, and that has been success-
fully demonstrated in flight in real time across several flight vehicles [7, 12, 14, 18].
Another successful L2F global modeling approach uses frequency domain techniques
to estimate local models for sequential windows of data in time as a function of each
window?s nominal flight condition, and then uses these local models to fill a global
table [18].
Figure 1.5: Global polynomial modeling by selecting from a specified pool of candidate
modeling terms.
29
An alternative approach to global nonlinear aerodynamic modeling is known
as data partitioning, through which the global aerodynamics are partitioned in the
explanatory variable space into several regions where local (polynomial) models are
estimated [21]. Whereas approaches like stepwise regression and MOF characterize
nonlinearities by incorporating higher-order and multivariate modeling terms, the
data partitioning method instead divides the flight data into regions known as cells
where simple models can be used. The global nonlinearities are often characterized
by particular explanatory variables, such as nominal angle of attack, which can be
divided into local regions in which the effects of the nonlinearities are mitigated or
removed. If the data are partitioned by Mach number and angle of attack, for exam-
ple, the resulting cell structure might appear as in Fig. 1.6. If the local model can
be approximated as linear, then extensive linear systems theory can be employed to
investigate and understand the aerodynamics in each region. With certain param-
eter estimation methods, such as equation-error least squares in the time domain,
the data associated with the models in each cell do not need to be contiguous in
time, so flight test data across the global flight envelope can be sorted into the local
region to which they belong. Data partitioning for aerodynamic modeling was first
employed in Refs. [45,46] for large amplitude maneuvers. In these cases, a fine pre-
scribed resolution of cells was specified through extensive analysis of data density,
and so the cell locations were not customized based on particular nonlinear varia-
tions in the data, which resulted in a complex cell structure. Local model networks
(LMNs) extend the idea of data partitioning and provide an architecture for the
global model by expressing the model output as a smoothly weighted combination
30
of the local models, and this approach was used to model the aerodynamic coef-
ficients for nonlinear X-31 flight data [52-56]. Past L2F work also looked at data
partitioning approaches within the context of fuzzy logic to update aerodynamic
model parameters for the Impala aircraft in real time [12].
Figure 1.6: Cell structure partitioned by Mach number and angle of attack.
1.3.3 Other modeling approaches
Although polynomial aerodynamic model structures provide certain benefits
such as simplicity and interpretable parameters, other model forms have also been
used for aerodynamic modeling by considering a black box system and estimating an
input-output model without prior knowledge needed to inform the model structure.
This more flexible strategy allows for powerful computational modeling tools that
span many different fields including machine learning and statistics, and that extend
outside of the more traditional aerodynamic modeling approaches [22].
Neural networks offer a powerful tool for universal function approximation that
has been applied to aerodynamic modeling. Many neural network models estimate
the state derivatives or the total aerodynamic forces and moments. In Ref. [57], the
back propagation method was used to train a feed-forward neural network (FFNN)
31
to estimate a flight dynamics model using simulated flight data, and in Ref. [58],
the aerodynamic coefficients were estimated for a sparse wind tunnel and numerical
simulation database. In Ref. [59], an extended Kalman filter was used to train a
computational neural network to estimate the aerodynamic coefficients for simulated
flight data with function-derivative training that minimizes not only the function
error, but also the function derivative error. Reference [60] specifically applied neural
networks to global simulated flight data up to an angle of attack of 60 deg using a
variety of neural network structures of varying complexities.
Other neural network approaches aim to more specifically estimate the stability
and control derivatives in addition to the force and moment coefficients. In Ref. [61],
the ?Delta method? was used with a FFNN to estimate the lateral-directional aero-
dynamic parameters by perturbing the input values in accordance with the defini-
tions of stability and control derivatives, and training separate delta networks for
each perturbed state to compute the finite differences. In Ref. [62], the Neural-
Gauss-Newton method was used to estimate the aerodynamic derivatives with flight
data from the Hansa-3 aircraft. Note that although neural networks are capable of
complex nonlinear modeling, these approaches that focus on stability and control
derivatives still use linear perturbation flight data. Neural network-based models
have also been used for adaptive nonlinear control applications, such as for a non-
linear dynamic inversion control law [63, 64]. Other machine learning tools such as
support vector machines have also been applied to aerodynamic modeling [65].
Models can also be developed as networks of other functions such as radial
basis functions. In Ref. [66], a radial basis function neural network was used to
32
model the aerodynamic forces and moments, and a first order partial differentiation
scheme was used to obtain the stability and control derivatives.
Wavelets can also be used for modeling, have been used across a wide array
of different fields, and have valuable localization properties. Instead of partitioning
the explanatory variable input space as in the data partitioning approach, wavelet
functions can effectively partition the temporal and frequency components of the
data with the localization attributes of the ?daughter wavelets.? Stemming from
Fourier methods, wavelets possess properties of ?compact support? compared to
global sine and cosine functions, which allow them to effectively be used to represent
varying windows of data. The chosen window sizes allow each wavelet to represent
different features, both local and global, across the temporal and frequency domains.
Wavelets are useful tools to analyze a wide variety of signals, as they have the
capability to capture highly localized or non-smooth behavior such as spikes, as well
as low-frequency larger-scale components, effectively extracting particular features
of the global signal. In particular, wavelet thresholding methods allow a signal to be
reproduced by thresholding, or removing, certain small components that describe
details that are not important. This can be relevant for denoising applications by
explicitly removing certain attributes without the need to smooth the entire signal,
which has value for flight data that may contain significant noise [67,68].
In aerodynamic modeling applications, Ref. [69] uses several types of wavelet
functions to estimate black box models for the force and moment coefficients using
the output error method for wavelet coefficient estimation, while Ref. [70] combines
the localization benefits of wavelets with the learning tools of neural networks to
33
develop a wavelet neural network aerodynamic model.
1.4 Summary and limitations of previous approaches
Section 1.3 discussed several approaches to modeling the global full-envelope
aerodynamics. The traditional approach is to collect small perturbation local lin-
ear models about trim conditions across the flight envelope into an aerodynamic
database that is parametrized across several flight conditions and vehicle configu-
rations. This method takes advantage of linear systems theory and offers the flex-
ibility of using different modeling methods that are suitable for various data sets;
however, flight test campaigns tend to include inefficient, extensive test matrices,
and a global analytical model is not obtained. Instead, database interpolation is
typically required.
Global nonlinear modeling methods are not restricted to small amplitude ma-
neuvers, can be used to more efficiently obtain data throughout the flight envelope,
and can provide an analytical model that is globally differentiable and that can be
used with nonlinear control law approaches. A global nonlinear polynomial model
can include higher order and multivariate modeling terms to capture the nonlinear
effects across the full flight envelope; however, it may sacrifice the ability to charac-
terize significant localized variations in order to capture the global system behavior.
In the global form, it is also not necessarily apparent where within the input space
the model is performing better or worse, or where uncertainty is higher due to factors
such as limited data obtained. The inclusion of candidate spline functions in the
34
regressor pool can provide improved localization, but without knowledge of where
within the explanatory variables these terms should be placed, a general expan-
sive pool covering all practical possibilities may need to be specified, which would
increase the computational load, particularly for real-time onboard applications.
When flight data cover a wide range of explanatory variables and cannot be
adequately captured by a single linear model, several local models can also be used to
describe the large amplitude or nonlinear behavior through data partitioning [21,36].
This approach converts a complex nonlinear modeling problem into several local,
and possibly linear, models by dividing the flight data into multiple regions or cells,
and estimating a separate model using the data in each cell. These approaches
retain the easily interpretable polynomial model form, while providing a nonlinear
model that is globally valid, and that still captures the local aerodynamics in a
modular way. Drawbacks of data partitioning lie in the increased model complexity
compared to global polynomial models, the decision making that is necessary to
determine the cell partitions, particularly if conducted in real time, the requirement
to combine the local models across the cell boundaries, and the requirement for
sufficient localized data to accurately estimate the local models. Past work that
used fuzzy logic modeling in this context employed global regression techniques
across all cells, which resulted in correlated regressors, high condition numbers,
and interdependent estimated parameters with high uncertainties that could not be
interpreted individually. As a result, despite the localization aspects, the model was
considered as a black box. Furthermore, for real-time operation the cell structure
needed to be frozen while only the parameters could be updated recursively [11,12].
35
Finally, other modeling methods that span the field of machine learning, such
as neural networks and radial basis function networks, take advantage of powerful
learning tools. However, they often require a large amount of training data, are
computationally demanding to an extent that can hinder real-time operation, and
prioritize representation of the input-output relationships without consideration for
physical interpretability. Neural network approaches require specifications of the
hyperparameters that may be difficult to determine, such as the number of hidden
layers, number of nodes in each layer, training method, activation functions, and
training cycles. In general, for black box models that have hundreds of parameters, it
is difficult to discriminate between signal and noise, the data can often be overfit, and
the model can therefore offer poor prediction capabilities. Additionally, if the model
fit is poor in a particular region of the explanatory variables, it can be difficult to
understand the reason and to target a particular part of the model for improvement,
because it lacks transparency.
Wavelet transforms offer a unique extension of Fourier methods, have vast
flexibility, and can be used to capture both local and global time and frequency
content across large sets of modeling data. However, the flexibility lends into the
need to choose the right method parameters, including the types of wavelets, and
the dilation and translation specifications.
36
1.5 Research approach, goals, and contributions
As discussed in Section 1.1, this work is motivated by the NASA L2F concept,
which aims to develop real-time, automated, onboard system identification tools.
Section 1.3 provides a literature review of existing modeling approaches that have
been used for aerodynamic modeling, while Section 1.4 discusses some advantages
and limitations of various categories of approaches. The intention of this research
is to look at the modeling process and evaluate existing methods with a fresh per-
spective, so Section 1.2 provides a background of the broader context of the aircraft
system identification process as a whole, as well as the wide array of modeling con-
siderations and goals. Finally, this section will describe the particular goals and
modeling considerations within this work, and how they led to a new modeling
approach.
Several modeling goals were first specified within the context of L2F, and then
existing methods were explored to achieve them. First, particularly in the context of
novel aircraft designs and configurations, the intention is to learn as much as possible
about the vehicle. Physical insight into the aerodynamic properties is therefore a
priority for the model to provide. As a result, polynomial aerodynamic model forms
were chosen to easily offer that insight through the estimated stability and control
derivatives.
The modeling approach should also prioritize good local prediction capabilities
and modular characteristics, which can be addressed with data partitioning meth-
ods. Many current methods of data partitioning require extensive post processing
37
and analysis that are not compatible with the L2F priorities [45, 46], or resort to
bisecting split locations, which can be restricting and offer poor meaningful reso-
lution [53-56]. In the interest of having a model that emphasizes parsimony, with
physically meaningful localized cells, a new approach is needed that allows variable
split locations within this framework.
The model should also provide good global prediction capabilities. The tech-
nique of local model networks offers the architecture for data partitioning by ex-
pressing the global model output as a smoothly weighted superposition of the local
models that are weighted across the cell boundaries to avoid discontinuities. Current
methods that use local model networks tend to be offline and iterative to develop
the cell structure [12, 53-56], but in accordance with the L2F goals, this method
needs to be compatible with real-time operation onboard the aircraft. Furthermore,
the method should reliably distinguish between deterministic content and noise in
the data to avoid overfitting.
Combining these goals and associated methods, a new integrated modeling
approach was developed known as Smoothed Partitioning with LocalIzed Trees in
Real time (SPLITR). Particularly in the context of L2F, the objective of this work is
to develop a new approach to model the nonlinear aerodynamic force and moment
coefficients automatically for fixed-wing rigid-body aircraft. The method should
be compatible with real-time operation onboard the aircraft with limited a priori
knowledge of the aerodynamics required. The SPLITR approach is applicable to
more general vehicles, including those more aligned with UAM, as well as other
nonlinear systems outside of aerospace, but the scope of this work is confined to
38
applying them first to more conventional vehicles for which the results could be
easily validated and understood. This approach leverages a set of valuable and well-
established methods that have reliably been used in the past, combines them in a
new and unique way, and adapts them to real-time applications with a new cell
splitting procedure.
The specified modeling goals pursued in this work are that the model should be
developed automatically in real time with limited prior knowledge required, it should
be easily interpretable, provide physical insight into the aerodynamics, and offer
good global and local prediction capabilities. The SPLITR method expands on LMN
methods to allow models with variable split locations, and with real-time updates to
both the cell structure and parameter estimates. General technical challenges that
this work faces are abiding by the real-time compatibility constraint, and developing
methods that can be easily applied to different systems with minimal tuning.
In reference to the modeling considerations discussed in Section 1.2.1, this
work focuses on online gray box parametric modeling for open-loop global nonlin-
ear aerodynamics. For simplicity, the methods developed are applied to rigid-body
conventional aircraft, while the underlying approaches are developed to offer future
applications toward more complex UAM vehicles. Therefore, the prioritized model
properties include interpretability to provide physical insight into these novel sys-
tems, parsimony to ensure a simple model structure, and modularity to ensure that
poor data quality will not impact the entire global model.
The original contributions of this research include the development, testing,
and validation of a novel approach to global nonlinear modeling using automated
39
LMNs in real time, that is known as Smoothed Partitioning with LocalIzed Trees in
Real time (SPLITR). The SPLITR approach addresses the modeling goals specified,
and offers a robust methodology that is generally applicable. This novel method is
demonstrated across both generic simulated test data as well as experimental flight
test data, and offers user insight into both the modeling process and the model itself.
The test cases shown in this work are demonstrated using a full set of data that is
processed as if each point is received in real time.
Although SPLITR was motivated by, and developed for, the purpose of aero-
dynamic modeling within the NASA L2F concept, the potential applications go far
beyond the scope of aerodynamics to model more general systems with similar goals
and priorities, and several examples are used to demonstrate the broader SPLITR
capabilities.
1.6 Dissertation outline
This dissertation is organized into six chapters summarized as follows:
Chapter 1 describes the motivation behind examining the aircraft development
process within the context of a new and rapidly changing environment for air vehi-
cles, and compares the traditional aircraft development process with the efficiencies
offered by the L2F concept. It provides background on the system identification
process and modeling considerations applied to aircraft. It also includes a literature
review describing existing approaches to aerodynamic modeling. The chapter con-
cludes with an overview of the research approach discussed in this dissertation, as
40
well as the goals and contributions.
In Chapter 2, the method of LMNs is explored in depth, and the terminology,
nomenclature, and modeling approaches are discussed. A literature review of related
LMN work and an overview of existing algorithms are presented as well, which
extend outside the scope of aerodynamic modeling.
In Chapter 3, the Smoothed Partitioning with LocalIzed Trees in Real time
(SPLITR) modeling method is introduced, which builds on past LMN research and
expands the capabilities and applications to real-time modeling. Real-time param-
eter estimation, cell structure determination, and global weighting are discussed in
detail.
Chapter 4 explores the utility of the SPLITR method by testing it using sim-
ulated data from piecewise linear and higher order functions. The test data com-
plexity is gradually increased to provide clear conveyance and visualization of the
modeling process, and the sensitivities of the modeling results on SPLITR user
inputs are also explored in depth.
Chapter 5 discusses the results of developing aerodynamic models using SPLITR
with simulated and experimental flight data. First, simulated flight data are used
from a nonlinear F-16 simulation, and then SPLITR is applied to experimental flight
test data from the NASA E1 and T-2 test aircraft.
Finally, Chapter 6 summarizes the SPLITR modeling approach discussed in
this dissertation and provides future directions and applications.
41
Chapter 2
Local Model Networks
This chapter provides an overview of both the background of local model net-
works (LMNs), as well as practical existing approaches and research that employ
this theory. First, LMNs will be introduced, along with descriptions of various as-
pects including the cell structure, the leaf models, and the global model architecture.
These sections are intended to convey the scope and breadth of LMNs, and how they
can be customized to particular applications. Then a wide selection of LMN con-
struction algorithms is presented to tie together the selected LMN components into
unique approaches.
2.1 Introduction to local model networks
Modeling complex nonlinear systems is a challenging problem that remains an
active area of research. If the system can be sufficiently approximated as linear at
several relevant operating points, a common approach is to collect a set of linear
models into a database and to interpolate between those points. This technique
takes advantage of the insight and analytical tools offered from linear systems the-
ory, but can require extensive planning and lengthy execution for data acquisition.
One approach that can utilize the advantages of linear systems but also leverage au-
tomated global model development tools is LMNs. Through this method, the global
nonlinear function is expressed as a weighted combination of several simpler local
piecewise models that are partitioned across the range of input variables. LMNs
tend to offer advantages of interpretability, flexibility, simplicity, and modularity.
The concept of expressing a complicated global nonlinear function as a com-
bination of smaller, simpler parts is well known and widely used, in congruence
with a ?divide and conquer? strategy. Consequently, there is significant overlap
in the literature between what are referred to as radial basis function networks,
neural networks, Takagi-Sugeno fuzzy models, neuro-fuzzy models, local model net-
works, regression trees, model trees, hinging hyperplane trees, and data partitioning,
among others [22]. In fact, several of these methods can be considered equivalent
under certain constraints. Although similar work can be found across each of these
fields, exclusive semantics are often used within each category, and that can inhibit
a comprehensive literature review due to limited cross-referencing. Extensive dis-
cussion of the similarities and differences between some of these methods is given
in Ref. [22], but this dissertation proceeds under the label of local model networks.
Nevertheless, to ensure a broad overview is presented, several relevant methods and
algorithms that adopt other names across the fields of research mentioned above
will be referenced and discussed.
A local model network is an architecture that is used to approximate a nonlin-
43
ear function as a weighted combination of several piecewise local models, as shown
in Fi[g. 2.1 for a basic n]etwork consisting of k = 1, . . . ,M cells, with input variablesT
u = u1 u2 ? ? ? u and output y?.nu
Figure 2.1: Local model network architecture with M cells (adapted from Ref. [71]).
Each cell, also known as a leaf, includes a local polynomial model, and a
weighting function that describes the region of validity across the global domain of
the input variables. The output of the kth local model is
y?k = ?k0 + ?k1u1 + ?k2u2 + ...+ ?knuunu (2.1)
where ?kj is the parameter associated with the j
th input term for the kth local model.
The normalized validity function for the kth local model, wk(u), which is a nonlinear
function of u, weights each local model and represents the strength of influence of
the local model across the input variables. The validity functions commonly take
the form of normalized Gaussian functions in LMNs. If the weight associated with
44
the kth cell is w?k(u), then the normalized validity function is given as
w?k(u)
wk(u) = ? (2.2)M
j=1 w?j(u)
The validity functions are normalized to ensure a partition of unity such that the
weights across all models sum to 1 at every point in the input space.
?M
wk = 1 (2.3)
k=1
The global nonlinear model output y? is the superposition of the estimated local
model outputs y?k that are weighted with the normalized validity functions, and is
expressed as
?M ?M
y? = (?k0 + ?k1u1 + ?k2u2 + ...+ ?knuunu)wk = y?kwk (2.4)
k=1 k=1
A local model network is generally a modeling approach that is used to ap-
proximate a nonlinear function as a weighted combination of piecewise local models,
but there is a vast amount of flexibility in how the approach can be formulated. The
following sections will describe several LMN properties and associated options that
are relevant to this work to portray the extensive nature of LMNs. They will also
provide background that will show the limitations in current methods and justify
the new real-time LMN approach presented in Chapter 3.
For clarity, the LMN properties can be divided into three categories: 1) cell
structure, 2) local models, and 3) global model architecture. Throughout the liter-
45
ature, each LMN approach is characterized by a selection of these properties along
with a unique logic to guide the model development process through growing the
cell structure. Several related algorithms will be discussed in Section 2.5.
2.2 LMN cell structure overview
The LMN cell structure describes how the input space is partitioned into a set
of local cells, each of which includes a local model that describes the system behav-
ior in that region, and an associated validity function that describes the relevance
across the global domain. The cell structure is intended to partition the nonlinear
regions of the input space so that simpler local models are valid in each region. The
cell boundaries can be designated through prior intuition, analysis of the system?s
nonlinear behavior, or through properties such as data density. Alternatively, they
can be specified as a grid of constant or varying resolution across each dimension.
However, partitioning for nonlinear behavior can be difficult to reliably predict, and
if the cell locations are not customized based on particular nonlinear variations in
the data, an overly complex cell structure may result with redundant local models
and limited physical insight from the boundary locations. An alternative approach
that this section focuses on involves developing the cell structure automatically by
input space decomposition through recursive partitioning in a tree construction algo-
rithm. In this approach, an initial coarse cell structure can also be specified initially
to enforce certain cell boundaries, and then it can be further developed through
data-driven methods.
46
The input space decomposition can be visualized through a regression tree to
describe the sequential partitioning and to locate the cell to which a particular data
point belongs. An example of a 2D regression tree and the corresponding final cell
structure is shown in Figs. 2.2(a)?2.2(b) for two specified partitioning variables, ?1
and ?2. The root leaf at the top of the tree contains all of the training data across
the input space. Each leaf split represents a binary partition of the parent cell
into two child leaves, where the branches specify the locations across the variables
that divide the parent cell. A data point belongs to a cell if it falls within the
cell?s partitions, and to locate the final cell, a path is traced from the root leaf
through the interior leaves to a terminal leaf at the bottom of the tree by following
the sequence of inequalities through the branches. Classic regression trees provide
piecewise constant models in each terminal leaf. In other cases, a local polynomial
regression model is given in each cell, which is evaluated to produce the local decision
tree output.
Although the decision tree and cell structure appear to show ?hard splits?
which indicate a global piecewise model and imply that each data point belongs
exclusively to a single terminal cell, in the LMN approach each data point can
activate multiple cells due to overlapping interpolation regions defined by validity
functions. The global LMN output will then be a weighted combination of several
local cell outputs. Nevertheless, these depictions are still useful for visualizing the
sequential partitioning and final structure, and for locating the primary cell where
the model will be weighted most strongly for a particular data point. Note also
that the decision tree is not a unique sequence to a particular final cell structure,
47
nor does it clearly show the order of all splits, but it illustrates the sequential input
space decomposition and has implications for tree pruning. A unique depiction of
the cell structure development sequence is instead shown in Fig. 2.2(c).
(a) Decision tree (b) Final cell structure
(c) Cell structure development
Figure 2.2: Regression tree construction and visualization.
In many cell structure development methods, a binary recursive tree construc-
tion search algorithm automatically decomposes the input space through a series of
splits and data-driven decision making where at each step, an existing cell is split
into two child cells until a stopping criterion is met. Section 2.5 will present sev-
eral recursive partitioning algorithms and describe their individual approaches and
decision-making logic.
The cell structure shown in the example in Fig. 2.2 is constrained to both
axes-orthogonal and bisecting splits, but both the dimensionality and location of
the split in each step can be different.
48
2.2.1 Split dimensionality: axes orthogonal vs. axes oblique
Axes-orthogonal partitions are drawn through a single dimension so that the
cell boundaries are orthogonal to the input variable axes, as shown in Fig. 2.3(a) for
two dimensions. This approach maintains the benefit of simplicity and interpretabil-
ity of the resulting cells, and results in a hypercube cell composition. Axes-oblique
partitions, as shown in Fig. 2.3(b), allow arbitrary splits that extend through two
or more dimensions at once, and while they permit a more flexible cell structure,
multi-dimensional splits can add complexity to the split logic and can sacrifice model
transparency. Nevertheless, for high-dimensional problems with strongly coupled
nonlinear dependencies, axes-oblique partitioning offers a more efficient strategy.
(a) Axes orthogonal (b) Axes oblique
Figure 2.3: LMN split dimensionality options.
For axes-orthogonal partitioning schemes, the decision making surrounding
each split consists of choosing the parent cell, the split dimension, and the split loca-
tion within that cell. For axes-oblique partitioning, the boundary is often described
by a sigmoidal shape where the parameters are optimized, thus expanding the com-
plexity of the decision making. Consequently, axes-oblique partitioning strategies
49
tend to use more computationally expensive nonlinear optimization methods to cal-
culate the cell boundaries.
2.2.2 Split location: bisecting vs. customized
In many axes-orthogonal partitioning schemes, the parent cell is bisected
through a single dimension during each split as shown in Fig. 2.4(a), thus eliminating
the split location determination and simplifying the required logic and computation.
However, this approach can be highly restricting, particularly if the nonlinearities
are biased toward one side of the cell, and it can result in redundant cells, overly
complex models, and less meaningful cell locations.
For additional flexibility, a selection of candidate discrete split locations in
each dimension can be proposed, and the best location that minimizes a specified
cost function can be chosen. Alternatively, an arbitrary axes-orthogonal split lo-
cation can be determined through an optimization or other process. Both of these
options will produce a more customized and physically meaningful cell structure,
as shown for example in Fig. 2.4(b), at the expense of added computation. For
axes-oblique partitioning strategies, the split location is often optimized along with
the dimensionality, as discussed in the previous section.
50
(a) Bisecting (b) Customized
Figure 2.4: LMN split location options.
2.3 LMN leaf models overview
Each cell contains a local model that describes the data in the region of validity
of that cell. Most generally each local model can be considered independent of the
others, and can be specified through first principles, a priori information, or other-
wise separately. Despite this flexibility in specifying each local model independently,
it is more common that all of the local models are developed together through least-
squares regression during the cell structure development process. Even under the
constraints of linear regression methods, there are still several attributes that allow
flexibility in the resulting models, which will be discussed in this section.
2.3.1 Model inputs: explanatory variables and partitioning
variables
[ ]T
The LMN input variables u = u[ to each cell include both1 u2 ... unu ]T
the explanatory variables (EVs), x = x x ... x , which are used as re-1 2 nx
gressors for local function approximation, and the partitioning variables (PVs),
51
[ ]T
? = ? ? ... ? , which are used to partition the global model and to eval-1 2 n?
uate the validity functions. The variables x and ? can be selected independently
and may be specified to include all, some, or none of the same terms.
In the simplest case, and how it has been presented up to this point, x =
? = u, the EVs and PVs are equivalent, and the kth cell?s local model and validity
function are given as
fk(u) = y?k = ?k0 + ?k1u1 + ?k2u2 + ...+ ?knuunu , wk = wk(u) (2.5)
If x 6= ?, the EVs are different from the PVs, and the local model and validity
function are
fk(x) = y?k = ?k0 + ?k1x1 + ?k2x2 + ...+ ?knxxnx , wk = wk(?) (2.6)
where x and ? can either be disjoint to contain separate sets of variables, or they
can overlap to include common terms. The PVs that are not included in the EVs
can be interpreted as variables that define the operating point of the local model
and are considered scheduling variables for which the local model does not have a
functional dependency, e.g. system configuration variables.
Any a priori knowledge about the behavior of the response variable can be
used to specify the EVs and PVs. Many LMN construction algorithms will automat-
52
ically choose a subset of the specified PV pool with which to partition the model;
nevertheless, it is important to designate a meaningful, relevant pool to reduce the
curse of dimensionality, improve computational speed, and preserve model inter-
pretability and robustness by preventing illogical partitions that may result from
non-optimal split decisions.
There are generally two ways that the LMN is evaluated for each local model
by processing the EVs. The first approach is to use global coordinates, i.e. by
using the true values of the EVs across all of the cells in the entire input space.
This would imply evaluating the local model output equation as it is presented in
Eqs. (2.5?2.6). However, this is contrary to the typical representation of the Taylor
series expansion about a specified operating point, and in this case, the bias term
?k0 is not easily interpretable since it is not the output of the k
th local model at the
center of the associated validity function. The second approach is to evaluate the
local model output equation using relative coordinates. In this case, the reference
cell center points of each local model are subtracted from the EVs before evaluating
the output equation. Both representations produce the same model output, with
the difference lying in the magnitude and interpretation of the bias term [22].
2.3.2 Model order: constant, linear, or higher order
polynomials
Each local model is expressed as a polynomial and can be a constant (zero
order), linear (first order), or higher order polynomial. Linear polynomials are
53
commonly used because of the extensive linear systems theory for interpretation,
analysis, and control law design, and because they tend to provide sufficient small
perturbation models. However, for a highly nonlinear system or region of the PV
input space, an increase in local model complexity by employing a higher order
polynomial can reduce the LMN complexity by resulting in fewer required cells to
adequately capture the behavior in that region. Thus, there is a tradeoff between
local model order and the number of LMN cells, and as M decreases to 1, a single
multivariate polynomial model emerges, similar to the global polynomial modeling
approach discussed in Section 1.3.2. Again, any a priori information can be used
to specify the local model order, either in general across all cells, or particularly for
regions in which that insight is available.
2.3.3 Model structure: pre-specified vs. automated
selection
Many LMN algorithms have the capability of automatically choosing a subset
of the specified PV pool through which to partition the model. The selection of
relevant EVs for each local function approximation can be specified a priori. Alter-
natively, the EV selection may be automated using model term selection methods
such as stepwise regression to choose only those linear and/or higher order terms
that retain significant modeling capability. Even for the strictly linear model struc-
tures, data correlation among regressors or poor SNR can degrade model quality,
and model structure determination remains to be an important and involved step in
54
the modeling process. Furthermore, with no regressor customization, the premise is
that every modeling term is relevant across the entire PV input space, which may
not be the case, and for which a subset selection may produce a more parsimonious
local model. Nevertheless, local model structure determination adds complexity and
additional computation to the local model development.
2.3.4 Parameter estimation process: global learning vs.
local learning
Each local model is a polynomial that is linear in the parameters, so equation-
error least squares can be used for parameter estimation. For a given cell structure
and validity functions assigned to each cell, there are two main approaches to LMN
parameter estimation: global and local optimization [22,23,72].
In the global parameter estimation approach, the parameters for all of the
local models are optimized simultaneously in a single global regression where the
regressors in each cell are weighted according to that cell?s validity function. For
simplicity, the equations will be expressed for a case where all local models contain
the same terms. For an LMN with nx input terms, N data points, and M local
models, the least squares cost function is given as
?N
J(?) = (y(j)? y?(j))2 (2.7)
j=1
The global model output is then expressed as
55
? ?
[ ]??
?1 ?
??? ??
y? = X?? = X X ? ? ? X ????
? ??2 ???? (2.8)1 2 M ?? ..
. ??
?M
where the regressor matrix for the kth cell is given as
?? ???wk0(?(1)) wk1(?(1))x1(1) ? ? ? wknx(?(1))x? nx
(1)
? ?
??
wk0(?(2)) wk1(?(2))x1(2) ? ? ? wknx(?(2))xnx(2)
??
Xk = ???
?
??
?? (2.9).. . .. .. . ... . ????
wk0(?(N)) wk1(?(N))x1(N) ? ? ? wknx(?(N))xnx(N)
Then the parameters across all cells are simultaneously estimated as
?? = (XTX)?1XTy (2.10)
Global optimization directly recognizes that the global output of the model
is defined as a weighted combination of the local model outputs that are weighted
according to their validity functions, and therefore estimates all parameters at once
through a single cost function minimization. In this case, the regression matrix
consists of sets of columns that include the weighted regressors for each local model,
and the estimated parameter vector includes the sets of parameters across all local
models. As a result, this approach is computationally demanding for problems with
56
a large number of local models and/or regressors. Furthermore, global optimization
problems tend to have high correlation from the repetition of the regressors in each
local model that only differ through their respective weights, and because much of
the regression matrix is close to zero since only a small number of local models
are highly active at any point in the PV input space. Consequently, the estimated
parameters tend to have high uncertainties, and they are interdependent and cannot
be interpreted individually, but rather only collectively with the other local models.
As a result, the model loses much of its transparency, which is an important aspect
of the motivation for LMNs.
Alternatively, local learning can be performed by treating the local models
independently and posing a separate weighted regression problem for each local
model to estimate the associated parameters by neglecting interaction with the other
local models. Instead of a global regression to estimate all M(nx + 1) parameters
at once, M separate regression problems are performed to estimate the nx + 1
parameters for each local model individually. A weighted least squares optimization
is performed for each local model where the weighting function is equal to the validity
function.
The local learning approach is advantageous over the global optimization since
it is more computationally efficient, the regression matrix is better conditioned, each
local model can be estimated individually using different optimization methods, and
the local model parameters can be interpreted individually. Furthermore, the local
approach is more conducive to online learning with regards to both the numerical
stability of updating the local parameters using the recursive least squares regression
57
algorithm, as well as updating the cell structure since the global regression is directly
dependent on the cell structure.
While many LMN approaches use weighted local regression, the estimated
parameters are influenced by the chosen weighting function, which may be arbitrarily
specified based on the cell structure and not directly by the data. The weighting
functions are not only often dependent on the current cell structure, but since each
local model weight is normalized by the sum of the weights from all other local
models as in Eq. (2.2), each local model is in fact co-dependent to some degree,
even though the regression is performed independently. This is particularly relevant
in real time if a split is made in one area of the PV input space, and the weighting
functions ? and parameters ? of the other cells are still affected.
Another approach is to consider each local model entirely independent from
the neighboring cells and to estimate the local model parameters with ordinary
(unweighted) least squares using only the data points within the cell boundaries.
Then, the global weighting problem can be considered separately by blending the
local models afterwards. This method provides an isolated model of the data within
each region, and the flexibility of adjusting the weighting functions. For real-time
applications, this also offers the flexibility that the local models are independent
in each region and are therefore not affected by the changing cell structure and
associated weighting functions. In this approach, the least-squares cost function for
the kth cell with Nk data points belonging to it is
58
?Nk
Jk(?k) = (yk(j)? y?k(j))2 (2.11)
j=1
Further discussion comparing the parameter estimation approaches for LMNs
can be found in Refs. [22,23,72].
2.4 Global LMN architecture overview
As discussed at the beginning of this chapter, numerous modeling approaches
simplify a problem by breaking it down into several smaller simpler parts. Unlike
with classical decision trees, the ultimate goal is not a collection of discontinuous
piecewise models, but rather a global continuous model that is expressed as a com-
bination of the local models. This section will discuss several ways that the local
models can be combined, both through the weighting functions as well as the global
model construction.
2.4.1 Validity functions: Gaussian weights and a partition
of unity
Validity functions are used across a variety of disciplines under different names
such as basis functions in basis function networks, activation functions in neural net-
works, and membership functions in fuzzy logic. The common operational utility
is to provide a smooth transition between piecewise components in a global archi-
tecture, and to characterize the regions of validity of each local model by weighting
their contribution to the global output across the ranges of relevant input variables.
59
The validity functions can be chosen from a variety of different shapes, including
Gaussian, ramp, and boxcar functions, for example.
In local model networks, similar to radial basis function networks, Gaussian
functions are often used, which can be easily generalized to multiple dimensions.
Gaussian functions consist of two parameters: the center, which is located where
the associated model has the strongest validity; and the standard deviation, which
characterizes the influence of each local cell across the range of the PV. The validity
function for the kth local model is given as
[ ? ]n?
?1 (?j ? c
2
k,j)
w?k(?) = exp (2.12)
2 ?2
j=1 k,j
where ck is the center of the Gaussian function, and ?k is the standard deviation.
The Gaussian functions are often placed in the center of the cells, and the standard
deviation is a function of cell width such that ?k,j = 0.4?s(bk,j?ak,j), where ?s is the
smoothness factor that describes the influence of each cell, and [ak,j, bk,j] define the
range of the kth local cell across the jth PV. The scaling factor of 0.4 is an empirical
result from Ref. [52], such that a nominal smoothness factor of ?s = 1 can be used.
A partition of unity, which is introduced in Section 2.1, is used to ensure that
the weights from all local models add to 1 at every point in the input space. When
this is enforced through Gaussian validity function normalization, it can cause unin-
tended consequences by changing the shapes of the normalized Gaussian functions,
particularly for cells of largely varying widths. These side effects can include shift-
ing the maxima from the cell center, and reactivation, or multi-modal behavior [73].
60
The implications for the global model output can be significant if the output shows
waves as a result of the overlapping weights. To combat this, some approaches
use a hierarchical tree structure and sigmoidal weighting functions, which will be
introduced in the next section.
Finally, for simplicity the validity functions and the global input space are
typically defined in a normalized coordinate system from 0 to 1, so the validity
functions can be specified in this normalized space. For real-time applications, this
implies specifying the minimum and maximum expected values for each PV for
normalization.
2.4.2 Global LMN form: combination of local model
outputs vs. combination of local model parameters
Consistent across LMNs is that the global nonlinear model output is computed
as a weighted combination of the local models, but there are several ways that
this calculation can be formulated, and the chosen method will drive other LMN
properties. In the most common formulation, the global model output is expressed
as a linear combination of the local model outputs that are weighted according to
their validity functions, given in Eq. (2.13) for the ith data point, and shown in
Fig. 2.5.
?M ?M
y?(i) = y?k(i)wk(i) = x(i)??k(i)wk(i) (2.13)
k=1 k=1
61
Figure 2.5: Global model output as a weighted combination of local model outputs
(adapted from Ref. [74]).
A benefit of this formulation is that the global model can be constructed from a
collection of local models that are not constrained to have the same model structure.
Instead, the local models can each be expressed separately according to a mixture
of a priori information and data-driven methods. Even if the local models are
restricted to be polynomials, then this architecture allows the flexibility of various
model structures and orders in different parts of the operating envelope.
Another architecture can be used if all of the local models have the same
model form and model structure. In this case, the global model is a weighted
combination of each of the common parameters from the local models, and therefore
has the same model structure. This can be interpreted such that the global model
parameters are scheduled according to the PVs and change throughout the operating
62
envelope, but the model structure is constant, similar to a linear parameter varying
(LPV) model. Accordingly, the global model is much more transparent and can
be interpreted at each operating point using the scheduled parameters. However,
enforcing a constraint of a constant model structure across the entire PV input space
eliminates the customization of model structure at particular conditions. The LMN
output in this case is expressed in Eq. (2.14) for the ith data point, and shown in
Fig. 2.6.
?M
y?(i) = x(i) wk(i)?k (2.14)
k=1
Figure 2.6: Global model output as a weighted combination of parameters across local
models (adapted from Ref. [74]).
Although the cell structure is often developed in an iterative manner, as de-
63
picted as a tree in Fig. 2.2(a), the LMN for any given cell structure is considered
?flat? if the model output is dependent only on the terminal leaves and their validity
functions. A flat LMN output is straightforward to compute, and only the final cells
and validity functions need to be stored.
Another approach is to have a hierarchical cell structure where although only
the terminal leaf local models are used to compute the model output, the rest of the
tree structure is maintained to calculate the associated validity functions. These
hierarchical approaches tend to use sigmoidal functions for each binary split, which
can be optimized to allow axes-oblique splits. Although hierarchical models are more
complex to interpret, store, and compute, the nature of the binary sigmoidal validity
functions at each level mitigates the normalization side effects of classical Gaussian
validity functions. This approach has significant overlap with hinging hyperplane
methods [75-77].
2.5 A literature review of LMN construction algorithms
The previous sections introduced many of the components and properties of
LMNs to impart the depth and flexibility of this modeling approach. Accordingly,
there is a lot of research that has explored these properties and numerous tree
construction algorithms have been developed, each with a unique approach that
is characterized by a selection of attributes discussed previously (among others),
along with a splitting logic that guides the cell structure development process. This
section summarizes some of these individual approaches to tie together the various
64
aspects of LMNs that have been discussed previously, and to describe how they can
be implemented. Most work focuses on batch algorithms which operate offline, rely
on iterative processes, and for which all of the training data are available in advance.
A few techniques that address online adaptation are also introduced. Lastly, several
LMN algorithms that specifically address aerodynamic modeling are discussed.
As detailed in the previous sections, from the split dimensionality to the regres-
sion to the global model construction, there are inherent and reciprocal advantages
and drawbacks to different choices, and it is impossible to avoid all possible short-
comings in a single approach. Therefore, when choosing or developing a particular
approach, it is important to specify, understand, and balance the model priorities,
some of which are discussed in Section 1.2.1. One of the most common tradeoffs
is between model complexity vs. interpretability. For example, axes-oblique splits
may result in fewer cells, but the resulting model structure will be less transparent
than for a model with axes-orthogonal splits. Also, additional levels added to the
tree may improve the accuracy, but can reduce the robustness and predictive capa-
bilities. The tree construction algorithm stopping criteria are therefore important
factors that affects the model robustness, prediction, complexity, and overall per-
formance, with the ultimate goal of producing the smallest tree that provides the
desired accuracy, i.e. a parsimonious model. Splitting metrics and stopping crite-
ria can involve options such as minimum number of allowable data points in each
leaf, when a global cost function is not improving, statistical metrics such as RSS
or MSE, or hypothesis testing on the residuals. Many methods combine a ?grow-
ing? phase with an important ?pruning? phase to find the final cell structure. In
65
general, the splitting logic across different algorithms must balance computational
effort, effective stopping criteria, and performance of the resulting model. A real-
time requirement eliminates some of these possibilities, given current computational
capabilities. The following discussion of algorithms is intended to provide a broad
overview of related work.
2.5.1 Batch LMN construction algorithms
Classification And Regression Trees (CART) is a generic decision tree building
approach introduced by Breiman in 1984 that serves as a foundation for many deci-
sion tree algorithms [78]. At each level in the binary tree building process, all split
dimensions and locations are considered for the next split, and the ?best? candidate
is chosen through a quantitative metric. The basic CART approaches develop piece-
wise constant models and switch discretely between submodels, so the submodels
are discontinuous and there is no interpolation. Consequently, there is a sharp tran-
sition across cell boundaries and the global model is not smooth or differentiable.
CART methods strongly emphasize the step of pruning by greedily expanding the
tree size and then removing irrelevant leaves through cross-validation. MATLAB
also has classification and regression tree objects and software in the Statistics and
Machine Learning Toolbox that expand on CART theory [79]. RETIS [80] is an
early method that expands on CART to allow local linear models in the cells. It
also uses a Bayesian approach for the tree construction and the residual sum of
squares (RSS) as the comparative cell metric.
66
SUPPORT (Smoothed and Unsmoothed Piecewise-POlynomial Regression
Trees) [81] is an axes-orthogonal recursive partitioning algorithm that uses poly-
nomial local models of fixed order where partitioning is informed by analyzing the
distribution of residuals through t-tests. For a given leaf considered for splitting,
the distributions of positive and negative residuals are analyzed by testing for differ-
ences in the means and variances from each set of data points, with the underlying
premise that if the current model is sufficient, the residuals should not be signifi-
cantly different. Cross-validation is then performed to determine if the predicted
mean squared error (PMSE) exceeds a pre-defined threshold. If further splitting is
justified, the split location is chosen as the average of the mean values in each set
of residuals.
GUIDE (Generalized, Unbiased Interaction Detection and Estimation) [82]
expands on the SUPPORT split logic to allow detection of pairwise interactions
between variables. The data for each variable are divided into quartiles and used to
fill a 2 ? 4 contingency table where the rows are based on the sign of the residual.
The ?2 statistic and associated p-value are computed for each region in the curvature
test. Pairwise interaction between variables is explored by dividing the 2D space
into quadrants based on the sample median, building another 2 ? 4 contingency
table, and similarly computing the ?2 statistic and p-value for the interaction test.
The split variable is chosen based on comparing the p-values from each of the tests,
and the split location is the sample mean of the chosen variable. GUIDE also allows
the input variables to be chosen for regression, splitting, or both, and relies on a
pruning stage, similar to CART.
67
SECRET (Scalable EM and Classification based REgression Trees) [83] of-
fers axes-orthogonal or axes-oblique partitions for linear leaf models. It uses the
expectation-maximization (EM) algorithm to convert a regression problem into a
classification problem by finding two Gaussian clusters, and then assigning each
data point to the cluster to which it has the higher probability of belonging. Classi-
fication tree techniques are then used to determine the split location as a hyperplane
that minimizes a cost function.
MARS (Multivariate Adaptive Regression Splines) [84] is not considered a de-
cision tree method, but rather performs recursive partitioning using spline functions
in each cell. The approach emphasizes continuity, and the model developed through
MARS can only be interpreted globally, at the expense of local model interpretabil-
ity and transparency. MARS is a highly iterative, computationally expensive algo-
rithm that relies on a forward (growing) phase and a backwards (pruning) phase to
determine the number and locations of knot points. During each phase, numerous
candidate knot locations are considered for inclusion or removal, and the next model
iteration chosen is that which minimizes the residual sum of squares (RSS).
Model tree algorithms that have been developed by Nelles and others of-
fer a wide array of flexibility and user preferences. LOcal LInear MOdel Trees
(LOLIMOT) [22,71,85] is an axes-orthogonal algorithm that emphasizes simplicity
in model construction and interpretation over complex optimization. Beginning with
an initial model and cell structure, the local model that is performing most poorly
is identified as that with the largest cost function defined as the weighted squared
local model errors, and that local model is split next. Bisecting axes-orthogonal
68
splits are considered in each dimension, and a candidate model is calculated for
each possible split adding one additional cell, using weighted least squares regres-
sion with Gaussian validity functions. The best division for each iteration is chosen
from among the candidate models as that which most significantly improves a global
cost function. Finally, a convergence test is performed to determine if the model
meets a termination criterion to stop the additional splitting.
POLYnomial MOdel Trees (POLYMOT) [86, 87] builds on LOLIMOT such
that, in addition to considering candidate bisecting splits for the worst local model
during each iteration, the model could instead be modified by adding additional
(higher order) terms using weighted stepwise regression. This approach balances
the tradeoff of added complexity characterized by number of cells vs. number of
parameters within each cell using a specified tradeoff complexity factor.
Finally, HIerarchical LOcal MOdel Trees (HILOMOT) [88] is an axes-oblique
partitioning algorithm that uses a hierarchical tree structure, and can be considered
a hinging hyperplane tree approach. The tree construction logic is also similar to
LOLIMOT, except the worst local model can be split in an axes-oblique manner
by utilizing nonlinear optimization to determine the split location and direction
parameters of a sigmoidal splitting function.
2.5.2 Online LMN construction algorithms
Although less prevalent in the literature, there has been some work devoted to
online tree construction algorithms as well. Often, these methods are closely linked
69
to an offline batch approach with modifications to allow for real-time, recursive
operation. This can be useful for a time-varying model, an adaptive system, or for
when the data are not available in advance.
There are two main approaches to online adaptation: freezing the cell struc-
ture and validity functions and only optimizing the local model parameters with
new data in real time, or also allowing the cell structure to be grown as well. In
Ref. [22], LOLIMOT was extended to both forms of online adaptation. For parame-
ter optimization, recursive weighted least squares is used to develop the local model
in each cell with new information. Ordinarily, with normalized Gaussian validity
functions, every point theoretically activates every cell and is used to update every
local model?s parameters. To reduce the computational load in real time, an acti-
vation threshold can be implemented such that an individual point only adjusts the
parameters of the cell(s) that it most strongly impacts.
The LOLIMOT approach to online adaptation of the cell structure is similar to
the offline approach. For a given cell structure following a split, each cell initializes
two background bisecting models in each dimension that are updated along with
their active parent cell with the new information. If only the most active cell is
updated with the new data point, then 2n?+1 total cells will be updated, including
the parent cell and the background submodels. If the data in a cell indicate nonlinear
behavior, the partitioned submodels will perform better than the parent to justify a
split. The candidate splits in a certain dimension will replace the parent cell if the
sum of the local cost functions of the background models (multiplied by a factor) is
less than the parent cost function. However, the approach of background submodels
70
can be computationally expensive, and the choice of the improvement factor is very
difficult to specify in practice, and is not robust to noise variations.
Potts introduces two additional online axes-orthogonal approaches in an Incre-
mental Model Tree Induction (IMTI) algorithm [89,90]. The first method based on
residual differences (RD) is similar to online LOLIMOT, except instead of bisecting
background submodels, any number of evenly spaced submodels can be maintained
in each current cell. A Chow test is then performed using the RSS from the par-
ent cell and each of the two candidate submodels (left and right) to determine if
the submodels offer a statistically significant improvement in error. The second ap-
proach is based on residual analysis (RA) and builds on the approaches taken in
SUPPORT and GUIDE by performing t-tests on the means and variances of the
positive and negative residuals, which are recursively updated with new data. It
was shown that the RD method performed significantly better than the RA method
for the applications presented.
In Refs. [91], Hametner describes a batch axes-oblique approach for growing a
hierarchical logistic discriminant tree using the EM algorithm. In Ref. [92, 93], the
split logic is adapted to be compatible for an online, evolving LMN by assuming that
the residuals follow a ?2 distribution for Gaussian white noise given a proper model
structure, and splitting the model if the hypothesis test on the residual variance
fails under a pre-specified confidence. With its use of logistic sigmoid weighting
functions and a combined nonlinear least squares for both child models, this work
deviates from the other online LMN approaches.
71
2.5.3 Aerodynamic modeling LMN construction algorithms
Although the LMN approach to nonlinear modeling has been used across a
variety of disciplines, it has only had limited use within aircraft system identification.
Because the focus of this research is on aerodynamic modeling, two LMN-related
approaches that have been specifically applied to this problem will be introduced.
In Ref. [52], Seher-Weiss applies a ?classical? LMN algorithm to X-31 aerody-
namic data, which performs similarly to LOLIMOT with bisecting axes-orthogonal
splits and Gaussian validity functions. This work is then built upon by introduc-
ing two modified approaches known as extended LMNs, under the assumption that
classical LMNs allow only local linear models with EVs = PVs. In the nonlinear
LMN approach, the PVs are specified separately from EVs, and nonlinear coupled
terms are allowed in the local models as combinations of the EVs and PVs. The
structured LMN approach offers a unique parameter-centric alternative LMN archi-
tecture for a case where the model parameters have important physical meaning.
For the parametric model structure, a separate LMN is developed for each parame-
ter, the output of each LMN is multiplied by the corresponding modeling term, and
the collection of structured LMNs is summed to produce the global model output.
On the one hand, this approach provides vast flexibility such that each parameter
can be modeled individually, and splits made along PVs are localized only for the
parameter associated with that LMN. However, the cost functions across the LMNs
are interdependent, and so a change in one LMN causes the cost functions in the
other LMNs to change. Thus, in contrast to LOLIMOT?s more efficient algorithm
72
where only the worst LMN is modified during each iteration, in this case the entire
LMN must be updated during each iteration following an adjustment in a single
cell.
The second approach by Brandon and Morelli in Ref. [12] served as a foun-
dational motivation for the work developed in this dissertation. It is presented as
a Takagi-Sugeno fuzzy modeling method where the cell structure is iteratively de-
veloped through a search cycle. Although it was developed for real-time operation
and was tested onboard an MB-326M Impala aircraft, the cell structure must be
developed offline in an iterative search cycle using training data, and once the cell
structure is frozen, the local model parameters can be recursively updated in real
time. In this work, EVs = PVs, and ramp-based membership (validity) functions
were used to partition the input space. In contrast to many other LMN-related
work where the poorest performing local model is further partitioned with an addi-
tional (often bisecting) split during each iteration, in this approach the entire LMN
is reconstructed each time. A single evenly spaced partition is added to each of the
input variables one at a time, and the entire LMN is re-estimated with this new
cell structure using global optimization where all of the local model parameters are
estimated together. Because the total number of cells is the product of the number
of partitions for each input variable, more than one additional cell can be added
during each iteration. The best candidate model is chosen by comparing model fit
criteria such as coefficient of determination (R2). Several heuristic stopping criteria
are possible, including comparing model fit metrics from subsequent iterations to
ensure sufficient improvement.
73
2.6 Summary
The particular modeling goals that this work set forth to accomplish are dis-
cussed in Chapter 1, which led toward utilizing the LMN approach. Since the field of
LMNs and related work is so vast, this chapter was devoted toward further exploring
some of the properties of these approaches and related algorithms that apply the
methods. The cell structure can be developed in various ways including customizing
the split dimensions and locations; the leaf models can be adjusted to fit the data
in a particular cell through choosing the model variables, order, and structure; and
the global model architecture can reflect how the local models best fit together to
form the global model output.
Many offline and online methods employ the approach of candidate sub-models
where before a split is performed, several possibilities are considered, and the best
one is chosen from among the options based on a cost function. In this approach, the
resulting cell structure is guaranteed to perform better than the previous one through
the metric of that cost function. For online computation, however, candidate models
can be inefficient and require high computational and memory loads. Other methods
have explored the use of residual analysis to inform the splits, but these methods tend
to show inferior results compared to candidate models, and have not been explored
in depth for real-time applications, particularly for aerodynamic modeling. The
study performed throughout this chapter, both regarding the background material
as well as the literature review, led toward developing the new approach discussed
in Chapter 3.
74
Chapter 3
Smoothed Partitioning with LocalIzed Trees
in Real time (SPLITR)
3.1 Introduction
A novel approach to real-time global nonlinear modeling, known as Smoothed
Partitioning with LocalIzed Trees in Real time (SPLITR), is developed and pre-
sented in this chapter. As discussed in Chapter 1, this novel LMN method can be
applied to other systems outside of aircraft; however, the attributes of aerodynamic
modeling strongly influenced the approach itself within the context of the NASA
L2F concept, and so it will be described within the scope of aerodynamics. Some of
the specified modeling goals and priorities include an interpretable model that of-
fers physical insight, both local and global prediction capabilities, and compatibility
with real-time updates of both the cell structure and parameters.
Although the L2F concept generally assumes no a priori modeling information
is available, it is more typical that some information is obtained from first principles,
wind tunnel tests, CFD, prior flight testing, or other sources. Therefore, it is impor-
tant that the modeling approach be compatible with incorporating a range of depth
of prior insight. As discussed in Chapter 2, LMNs offer great flexibility through
various features that can be specified in advance or customized automatically, and
some of these properties will be discussed in the context of the SPLITR approach
and aerodynamic modeling throughout this chapter.
The SPLITR approach overview is shown in Fig. 3.1. The experiment design is
an important step where the goal is to efficiently collect test data that include high
information content within the bandwidth of the dynamics that are modeled. For
the flight test data in this work, the experiment design methods developed under
the NASA L2F concept and discussed in Section 1.2.2 were employed. The test data
are then used to recursively update the associated local model to which they belong.
The information about the model performance in each cell is used to inform the cell
structure determination, which will partition the model to a finer resolution where
the system is highly nonlinear. Finally, the global weighting ties the cells together
into the local model network.
Figure 3.1: Overview of SPLITR modeling approach.
The SPLITR approach is divided into three main sections in this chapter to
address aspects associated with the leaf models, which include the recursive param-
76
eter estimation process; the cell structure determination, which consists of the cell
splitting logic; and the global weighting. The chapter ends with a detailed summary
of the various user inputs and specifications associated with the SPLITR algorithm,
along with recommendations and insights for how to designate them.
SPLITR software developed through this work used parts of the software tool-
box called System IDentification Programs for AirCraft, or SIDPAC [21].
3.2 Properties and aspects of leaf models
Within each cell, a local model is estimated for each response variable y that
represents the data contained within the cell boundaries defined in terms of the PVs.
Most generally, the modeling data can be obtained from any singular data source,
or the data can be calculated analytically and/or processed based on several sources
of data for any variable to be modeled. The response variable is generally expressed
as
y = f(x,?) (3.1)
where x are the explanatory variables (EVs) used in each local model, and ? are
the partitioning variables (PVs) that partition the input space into local regions.
In this work, the estimated model is a linear expansion in terms of explanatory
variables that are measured. Other model forms can also be used, such as general
polynomials, state space representations, or transfer functions, among others.
77
3.2.1 Selection of model explanatory and partitioning
variables
In the SPLITR approach, the EVs can be specified separately from the PVs to
include all, some, or none of the same variables. Each response variable is modeled
separately and independently, so the EVs and PVs can be specified for each model
based on any a priori insight. This flexibility can allow experience to inform the
modeling process. In this work, the relevant terms through which to partition the
model are automatically chosen from a specified pool of possible PVs. However, in
the interest of limiting computational requirements for a large number of options,
this pool should be specified carefully and sparingly.
Particularly for conventional fixed-wing aerodynamic modeling, the EVs may
include air flow angles ?, ?, body-axes angular rates p, q, r, and control surface
deflections ?. The PVs are typically the variables through which an aerodynamic
database lookup table might be scheduled, such as Mach number, power setting, or
nominal angle of attack. For rotorcraft modeling, the wind and stability axes are not
defined in hover, so EVs ? and ? are typically replaced with body-axis translational
velocities u, v, w, and PVs may include V0 or q? as well.
In this work, for simplicity and interpretation of the simulation examples,
global EV values were used in the local models, so the EV data were not defined
relative to the reference point at the center of each cell.
For each test case shown throughout this work, the PV input space was nor-
malized for validity function evaluation, so the maximum and minimum expected
78
values for each PV were specified in advance. If the range is specified as too small,
the validity function weights will become saturated at the edges. If the range is
too large, then there may not be enough resolution for the validity functions to
adequately weight the range of actual data within the PV input space.
3.2.2 Specification of local model order
LMNs emphasize the idea that an arbitrary nonlinear function can be rep-
resented by a number of smaller linear components, so the examples discussed in
this dissertation are restricted to linear modeling terms. The parameter estimation
regression methods presented though are in no way restricted to linear terms.
For aerodynamic modeling from flight data in particular, the traditional ap-
proach involves linear approximations at many relevant operating points throughout
the flight envelope. Other approaches utilize global multivariate nonlinear polyno-
mials by selecting the relevant modeling terms through methods such as stepwise
regression or multivariate orthogonal functions. These two sets of approaches are
different methods to account for nonlinearities, i.e. by utilizing many linear models,
or employing a single global multivariate nonlinear model. LMNs can take advan-
tage of the fact that traditional aerodynamics tend to be linear throughout much of
the flight envelope by collecting a set of local linear models for the cells. However,
highly nonlinear stall, post-stall, or spin regimes may be captured better with a
single higher order polynomial model instead of many linear models.
79
3.2.3 Determination of model structure
In this work, a local linear model structure is assumed, and the EVs are
fully specified a priori, so model structure determination is not incorporated into
the automated modeling process. If higher order model terms are allowed, then
a model structure determination method would be necessary. Even with strictly
linear terms, model structure determination could be an important step to choose
the relevant EVs from a broader pool. However, LMNs already face the challenge
of adapting the cell structure in real time, so adapting the modeling terms as well
is an additional difficulty for onboard real-time computation. For this reason, the
local model structure is assumed linear, fixed, and uniform for all local models.
3.2.4 Parameter estimation process
The parameters for each model are estimated by minimizing the equation error
in the least-squares sense, with this process described in detail in Refs. [21,22]. The
regression equation can be expressed as
z = X? + ? (3.2)
where z ? RN is the response variable data to be modeled over N measurements,
X is the N ? nx matrix of explanatory variables, ? ? Rnx is the vector of model
parameters, and ? ? RN is the modeling error.
The cost function to solve for the best estimate ?? that minimizes the sum of
80
squared differences between the measured response variable and the model output
is
1
J(?) = (z ?X?)T (z ?X?) (3.3)
2
The solution that provides the optimal result is
?? = (XTX)?1XTz = DXTz = M?1XTz (3.4)
where M = XTX contains the information content in the regressor data, and
D = M?1 is the dispersion matrix.
The Cramer-Rao lower bound, which is an asymptotic estimate of the param-
eter covariance matrix, is a measure of uncertainty in the parameter estimates. If
the residuals are assumed to be white, it is defined as [21]
[ ]
?(??) = E (?? ? ?)(?? ? ?)T = ??2(XTX)?1 (3.5)
The model fit error variance is then approximated as
2 (z ? y?)T (z ? y?)?? = (3.6)
N ? nx
While the formulation defined above can be used to estimate a model with
batch data, ?? can also be updated recursively for real-time parameter estimation.
If the model is defined at the (i ? 1)th data point, then as new measurements z(i)
and x(i) are received, the parameter estimates can be updated through Eqs. (3.7
81
?3.9) [21].
D(i? 1)x(i)
K(i) = (3.7)
? Tff + x (i)D(i? 1)x(i)
D(i) = (1/? Tff )[I?K(i)x (i)]D(i? 1) (3.8)
??(i) = ??(i? 1) +K(i)[z(i)? xT (i)??(i? 1)] (3.9)
The forgetting factor ?ff is included to weight past information to allow the model to
update more quickly with new information, which is particularly useful in this work
to allow the cells to adjust more quickly to new information following splits. The
parameter covariance is updated with a recursive estimate of the fit error variance
as
( )
2 i? 1 1?? (i) = ??2(i? 1) + ?2(i) (3.10)
i i
?[??(i)] = ??2(i)D(i) (3.11)
After the modeling process has been completed, the effectiveness of the re-
sulting model can be described by several modeling metrics. In particular, the
coefficient of determination (R2), defined in Eq. (3.12), is a model fit quality mea-
sure that varies from 0 to 1 and describes how much of the variation in the data
82
about the mean value is captured by the model.
??N2 ? i=1[z(i)? y?(i)]2R = 1 (3.12)N 2
i=1[z(i)? z?]
The equation-error formulation enables recursive parameter updates as data
are obtained in real time, and it is performed in the time domain to allow points
that are not contiguous in time to be incorporated into the associated models in
each cell. For example, Fig. 3.2 shows F-16 simulation data that are divided into
three cells across the complete range of ?, where the data associated with the ?
range in each cell are incorporated into the respective local models of low-, mid-,
and high-? ranges.
Figure 3.2: F-16 simulation time history data with ? partitioning.
The parameter estimation can be performed using either a global or local
optimization process, as discussed in detail in Section 2.3.4, and which will be sum-
83
marized here. Prior work employs global learning where the parameters across all of
the cells are optimized simultaneously in a single global weighted regression prob-
lem [12]. This centralized approach can lead to an ill-conditioned regression problem
and interdependent parameter estimates with high uncertainties that cannot be in-
terpreted individually [23]. Other work utilizes local learning by posing a separate
weighted regression problem for each local model [52]. However, the individual cell
weights are a normalized function of the weights in all of the cells. So when a
new measurement or a new cell is added, the estimates from all other cells would
change, meaning the parameter estimates from all cells are still interdependent. In
this work, the local parameter estimation problem is completely decoupled from the
global modeling by first estimating the parameters in each cell, and then overlaying
the validity functions afterwards. This approach retains the interpretability of the
model in each cell, and allows new measurements to simply be incorporated into the
model to which they belong, irrespective of the other cells. It also offers flexibility
for validity function customization to blend the local models together.
3.3 Cell structure determination process
The cell structure is developed through a decision-making process that is used
to successively partition the data into relevant local regions. The guiding objective
is to improve the global model accuracy by accounting for nonlinearities, while
maintaining a parsimonious model with as few cells as possible.
84
3.3.1 Cell structure problem statement
This work constrains the splits to be axes-orthogonal because of the real-
time operation constraint, emphasis on interpretability of the cell structure, and
because the PV pool for aerodynamic modeling is typically low-dimensional, with
some of the relevant PVs discussed in Section 3.2.1. The common method of bi-
secting axes-orthogonal splits, however, can be highly restricting, in particular if
the nonlinearities are biased toward one side of the PV input space as is typical for
aerodynamics, and which can result in redundant cells and overly complex models
with less meaningful split locations. The SPLITR approach more generally allows
variable breakpoints that are informed by a new method of residual analysis, which
forgoes the advantages of batch processing and iterative optimization methods for
real-time compatibility.
Accordingly, the cell structure determination problem can be summarized by
posing the following questions:
Is a split needed? Given the current cell structure, the model is evaluated to deter-
mine if it is sufficient, or if the data in one or more cells are not characterized well
by the local models.
Which cell to split? Which local model is performing poorly? Is it due to noisy
data, or is it due to a nonlinearity that this local model is not sufficiently capturing?
The approach needs to keep track of estimates of noise in the data, as well as model
fit statistics.
85
Which dimension to split? Once a particular cell has been identified as having a poor
model due to unmodeled behavior, then allowable axes-orthogonal split dimensions
include those identified as PVs. If the relevant possible sources of nonlinearity are
known in advance, this pool can be more carefully specified to reduce computation.
Where to split along a dimension? Where along a certain dimension in a specified
cell should the next breakpoint be placed? If the developed cell structure is intended
to offer physical insight through the partition locations, then these breakpoints must
be carefully selected.
When to split? How much data or statistical evidence is needed to split the cell?
If new cells are added too rapidly, the result can be an overly complex model with
redundant cells, whereas if splits are performed too cautiously, then the model is
degraded in real time.
3.3.2 Preliminary guiding aspects and overview of cell
structure determination
Many past methods use an iterative, offline cell structure determination pro-
cess which is driven by improving a global model fit performance metric. During
each iteration, a binary axes-orthogonal split is typically proposed in all existing
cells with all possible dimensions, and the entire model is recomputed for each case,
resulting in a series of proposed models that all have one more cell than the previous
iteration. The proposed cell structure that improves the global performance metric
the most is chosen for the next model iteration [52, 71, 79]. With the advantage
86
of being offline, this iterative approach has access to all of the modeling data at
once, and estimates a new set of parameters for each proposed model. This is not
the situation for a real-time approach, where all past data cannot be practically
stored and used to iteratively recompute parameter estimates after a new cell is
added. Therefore, a different real-time decision-making procedure for cell structure
determination is necessary.
A challenge with the offline approaches, and one that is apparent in any sys-
tem identification problem, is choosing an ideal metric that adequately characterizes
the model fit quality, and serves as a reliable stopping criterion to preclude exces-
sive splits and ensure a parsimonious model. Common model fit statistics such as
the coefficient of determination (R2), Akaike information criterion (AIC), residual
sum of squares (RSS), etc., or a combination thereof, are used to characterize the
model fit during the model development process for these approaches. But while
they can be useful tools for model assessment, they are global measures of the fit,
and can provide misleading information about the localized model fit quality if not
interpreted correctly. Furthermore, a satisfactory threshold can be arbitrary across
different data sets with varied data quality and noise levels.
The SPLITR approach to cell structure determination, similar to that of pa-
rameter estimation, examines the performance of the local models instead of the
global model fit. This localized approach infers that if the local models perform well,
then the weighted combination that comprises the global model will provide a good
fit too. Instead of considering model fit statistics, SPLITR examines the essence
of the information contained in these metrics, which are the residuals, i.e. the dif-
87
ferences between the measured data and the estimated model output, as shown in
Eq. (3.13) for the ith measurement.
??(i) = z(i)? y?(i) (3.13)
The residuals represent samples of the modeling error, and they can be used to
assess model adequacy and reveal model deficiencies through a residual character-
ization procedure, which will be described in the data processing sequence next.
Furthermore, the individual local model residuals are discrete samples that describe
the errors locally, thus providing information regarding the magnitude of the model
error as well as the location within the ranges of measurements. Ideally, the resid-
uals should have constant variance, and be mutually uncorrelated. In flight data,
the residual variance magnitude is often a function of certain flight variables, but
should still not display deterministic character. Figure 3.3(a) shows the residuals
as a function of ? from a SPLITR model using the F-16 data set. Notice that the
variance tends to increase at higher ?, which is consistent with the larger ampli-
tude of noise in that region. Figure 3.3(b), on the other hand, shows residuals with
a deterministic character that indicates poorly modeled aerodynamics, and which
results from a model with only a single linear cell for the same F-16 data. Ideally,
the whiteness of the residuals in each cell would be evaluated through the autocor-
relation, but since the data are not necessarily contiguous in time in each cell, this
method is not feasible, and another approach must be formulated.
88
(a) SPLITR nonlinear model (b) Linear model
Figure 3.3: Residuals for F-16 models of CL = f(?).
The SPLITR process can operate under two separate update rates, with the
process overview shown in Fig. 3.4. At a faster rate, each new measurement is
processed immediately so that all of the useful information it contains is extracted,
and the model parameters of the associated cell are recursively updated. Since the
cell structure is not expected to change rapidly, the cell splitting procedure operates
at a slower rate, which is chosen to be outside of the expected range of dynamics
that are being modeled, so that the cell structure may be modified at least as fast as
the dynamics. The global model blending, which is a function of the cell structure, is
updated along with the cell structure changes. These update rates can be modified
for other systems and data rates. A detailed depiction of the data processing and
cell splitting procedures is presented in Fig. 3.5 and will be descried in detail in the
next section.
89
Figure 3.4: SPLITR process overview.
(a) Data processing procedure (b) Cell splitting procedure
Figure 3.5: Data processing and cell splitting procedures for SPLITR algorithm.
90
3.3.3 Data processing procedure
The data processing sequence is shown in Fig. 3.5(a), and each step will be
summarized below. The ith measurement supplied to the model consists of the
response variable z(i), the associated vector of explanatory variables x(i), and the
vector of specified partitioning variables ?(i).
1. Identify host cell: For a given cell structure, each response variable measure-
ment is parametrized by its associated PVs and belongs to a single cell.
2. Update residual threshold: Residual characterization is the method used to
distinguish between random and deterministic properties in the residuals, the
latter of which would be attributed to unmodeled dynamics, or nonlinearities
in the data that are not captured by the local model. It is important not to
mistake noisy data for a poor model fit. Therefore, an acceptable threshold
is needed to characterize the residuals such that those that lie within the
bound represent reasonable model error that includes noise influences, and
those that lie outside represent deterministic model deficiency. To compute
the threshold, each response variable to be modeled is passed through a real-
time discrete high-pass filter to obtain an estimate of the noise. Note that
this estimate is a wide-sense classification of noise that includes any other
content that will not be modeled [12]. A running root-mean-square (RMS)
estimate of the noise is computed for the response variable data in each cell,
and it is multiplied by a factor ?? , to account for statistical variations and to
91
describe an acceptable residual threshold based on the noise level. These noise
estimates are binned across the range of each PV, as will be detailed later in
this section, so different cells across the ranges of PVs would be expected to
have varied thresholds based on the noise levels in the data they contain.
3. Calculate local model updates: The parameter estimation scheme discussed in
Section 3.2.4 is used to recursively calculate the updated local cell parameters
with the information contained in each new measurement. The local model
output for the ith measurement incorporated into the kth cell is
y?k(i) = x(i)??k(i) (3.14)
The associated local residual for that point is
??k(i) = z(i)? y?k(i) (3.15)
4. Characterize residual: Using the calculated residual threshold, the absolute
value of each residual is characterized in one of three ways. An unrestricted
residual is from a measurement that is unconditionally allowed into the model.
This can be at the beginning of the modeling process as it waits to receive
enough information for a reliable initial model, or as will be discussed later in
this section, immediately following a split so the child cell models can adjust.
92
A measurement associated with an unrestricted residual is consequently not
regarded for splitting purposes. An acceptable residual is one that is not con-
sidered an unrestricted residual, that is below the calculated residual threshold
for that cell, and is regarded as supported by the local model accounting for
noise. An unacceptable residual is one that is also not characterized as unre-
stricted, that is outside the residual threshold, and is therefore considered an
indication of a poor model fit in the associated cell.
5. Process residual: To provide discretized error information within each cell that
will inform a possible split location, the absolute values of the residuals are
binned across the range of each PV based on a prescribed resolution that is the
minimum acceptable width of a cell. The bins are designated at the beginning
of the modeling process, and as the cell structure is modified, the cell to which
they belong changes accordingly with the cell boundaries. Determining the
split location is thereby reduced from a continuous problem across the range
of a cell to a discrete choice along one of the pre-defined bin boundaries, which
is more conducive to real-time operation and decision making.
In real-time operation, the residuals in each bin must be processed in a
way that their influence toward a possible split can be retained and recursively
updated. The properties of the absolute values of the residuals in each bin are
therefore represented by the mean and variance, which are given for N data
points as
93
? ?
|?1|+ |?2|+ ...+ |? NN | i=1 |?i|
N 2
? = = , ?2 = i=1
(|?i| ? ?N)
N (3.16)
N N N N
As new measurements are received, the mean and variance calculations of the
associated bins are recursively updated as
? N + |? | 2N N+1 2 ?NN (|? 2N+1| ? ?N+1)?N+1 = , ?
N + 1 N+1
= + (3.17)
N + 1 N
Each bin contains a count of the different types of residuals, and a set of four
recursively updated associated statistics: the mean and standard deviation of
the acceptable residuals, ?A, ?A, and those of both the acceptable and unac-
ceptable residuals together, ?B, ?B. Both types of residuals are combined in
?B, ?B as a measure of the total mean and standard deviation in the resid-
uals. Note that the unrestricted residuals are not incorporated into the bin
statistics.
Since the parameter estimates in each cell are changing over time as new
data are obtained, the real-time residuals that are calculated and processed
here are obsolete as soon as the parameters are next updated. Theoretically,
if all of the past data were saved then the residuals could be recomputed each
time the parameters are updated. However, this is considered an artifact of
the constraints of real-time operation.
If the residual is characterized as unrestricted or acceptable, the parameter
94
updates computed in step 3 above are accepted, and the local model is updated
accordingly. If the residual is considered unacceptable, then the local model is
returned to its prior state, and the associated measurement is stored. Note that
although some data are subsequently stored and later accessed, this amount
of data is small compared to the total amount of data, and is considered
reasonable for onboard computation. Alternatively, this data storage step
could be skipped, which would effectively discard the measurements associated
with the unacceptable residuals and rely solely on future data, as discussed in
step 5 of the cell splitting sequence below.
3.3.4 Cell splitting procedure
The cell splitting procedure is shown in Fig. 3.5(b), and is described next.
During this sequence, the residuals are evaluated to determine if the cell structure
is adequate, or if a cell should be split. This procedure is detailed below.
1. Identify active cells: Because the cell structure determination is directly asso-
ciated with residuals characterized as unacceptable, only those cells that have
received measurements whose residuals are characterized as such since the last
cell structure check need to be examined. Additionally, only the active bins
within the cell, or the bins that actually contain data, are analyzed.
2. Combine bins: A user-defined minimum cell width may be conservatively small
in order to capture higher order nonlinearity in certain ranges. However, a
fine resolution may not be necessary if data are obtained quickly across a wide
95
range, and larger cell widths may therefore be considered first. Furthermore,
with such a fine initial resolution, it might take a while for each bin to collect
enough information to produce useful, informative bin statistics. Therefore, a
maximum number of bins across each cell is also pre-specified, so that early
on in the cell division process only larger splits will occur, and later on, the
finer resolution can be considered.
This can be visualized through an example in Fig. 3.6 which shows a single
cell with 5 bins across the range of the PV where the residual information
is stored. The cell boundaries are shown as thick red lines. There are 10
minimum-width bins spaced throughout the cell and shown as gray dashed
lines. If a maximum number of cell bins is specified as 5, these bins are
combined as shown by the bin boundaries outlined in blue.
Figure 3.6: Sample cell with 5 bins.
The minimum bin width cannot be changed in real time since the associ-
96
ated past residuals are not accessible in order to compute new bin statistics.
Therefore, from the beginning the statistics are computed in the minimum bin
widths, and then those statistics from adjacent bins are combined during the
cell structure checks up to the maximum number of bins. Equations (3.18 ?
3.19) show how the means and variances, respectively, are combined for two
bins using their individual values as well as the number of data points in each,
designated as N1 and N2.
?N1N1 + ?N N2?N1+N2 =
2 (3.18)
N1 +N2
2 2
2 ?N N1 + ?N N2 ?
2
N N1 + ?
2
N N2 ? (N +N )?2? = 1 2 + 1 2 1 2 N1+N2N +N (3.19)1 2 N1 +N2 N1 +N2
3. Evaluate bin quality: A pass/fail status of each (combined) bin is then deter-
mined to describe if the bin contains an acceptable amount of error, or if the
indication of deterministic error is strong. If the mean of both the acceptable
and unacceptable residuals is pulled beyond a specified number ??, of standard
deviations from the mean of the acceptable residuals, as shown in Eq. (3.20),
then the bin is considered failed. This can be visualized through a depiction
in Fig. 3.7.
?B > ?A + ???
A (3.20)
97
Figure 3.7: A bin fails if the mean of both types of residuals is pulled beyond a specified
number of standard deviations from the mean of acceptable residuals.
A bin may contain both acceptable and unacceptable residuals, so Eq. (3.20)
ensures not only that there are unacceptable residuals in the bin, but that their
magnitude ?outweighs? the acceptable residuals found there, according to this
specified comparative metric. Furthermore, for a bin to fail the status check,
it must contain a minimum number of residuals to ensure that sufficient in-
formation is obtained. If a bin fails the status check, the normalized severity
of the unacceptable residuals is computed as in Eq. (3.21) to represent how
many standard deviations the mean of both types of residuals is drawn away
from that of the acceptable residuals, and is also saturated at 1.
?B ? ?A
bin severity = (3.21)
?A??,norm
The bin severities from adjacent failed bins are summed to obtain a group
severity, which is then compared to a prescribed maximum total severity. If
98
it surpasses the limit, then that bin group is a candidate to inform the next
cell split. Within a single cell, the group of bins with the largest total severity
beyond the limit is chosen to inform the split. The above analysis can be
performed in parallel across multiple PVs to choose both the split dimension
and location along that dimension in a single cell through the same decision
process. However, note that the discretized binning and calculations will need
to be performed across all PV dimensions (n? times), so it is advisable to limit
the number of partitioning options within the understanding of the system
behavior and possible sources of nonlinearity.
4. Split cell: If a split has been justified, then using the selected set of bins, two
split locations are considered. The cell may be split at the right boundary of
the right-most bin or at the left boundary of the left-most bin. The chosen
split location is that which partitions the failed bins toward the side with the
smallest distance to the edge of the active PV input space, which is often the
side with the fewest additional bins. This is in order to leave intact the largest
region that may already be performing well. In the depiction in Fig. 3.8, the
left boundary of bin 3 would be chosen.
99
Figure 3.8: The split location is chosen as the right or left boundary of the collection of
failed bins.
The parent cell is then divided into two child cells, with one on each side of
the specified split location. The bin counts and statistics discussed in step 5
of the data processing sequence are zeroed for the child cells.
5. Initialize child cells: When a parent cell is divided into two child cells, the
presumption is that the parent model was inadequate at least across a certain
range of the data it included. However, its parameters were estimated using
data across the ranges of both child cells, so when a split is performed, the
parameter estimates associated with the parent model no longer equally rep-
resent both sides of the split. Four possible approaches to initializing the child
cell models immediately following a split are discussed.
First, ideally if all past modeling data are stored and accessible, then when
a cell is split, the parameter estimates for the child cells can be recomputed
in batch with all of the past data that belong in each cell. This approach,
100
however, is not practical in real time.
Second, all possible configurations of cell structures can be posed a priori,
and as the data are received in real time, all possible cell models will be
recursively updated, while only the models in the current cell structure will
be activated. Note that in this approach, each measurement can belong to
more than one cell. Then when a split is performed, the respective latent
models will be activated. While this approach would also provide a model for
the exact data in each cell, this would only be practical with a very coarse
resolution of possible splits and few split dimensions. If there are multiple split
dimensions and a fine resolution, the curse of dimensionality implies not only a
large number of minimum resolution cells, but also an astronomical number of
intermediate cell structures as the model is successively partitioned. If there is
prior knowledge of the system behavior that can inform a reasonable number
of possible cell structures, then this approach may be more practical.
Third, an approach that would be feasible in real time is to zero the model
in each of the child cells and begin from scratch to estimate a new model
with the future data obtained in the cell. Again, this would provide a model
with the correct data; however, it is inefficient to discard all of the past data
collected in the region of the parent cell, particularly because much of it could
still be relevant to the child models.
Finally, a practical approach for real-time model initialization accepts the
fact that the parent model was built using data contained in both child cells,
101
and despite that, each child cell?s model parameters are initialized with those
of the parent cell. Since the information matrix cannot be recomputed in real
time with past data, the child cell is initialized with a percentage of the parent
cell?s information matrix, M = XTX, that is proportional to the number
of past measurements in the parent cell that are within its range. These
measurement counts are binned similar to the residuals and are summed across
the bins that belong to each child cell. Note that the inherent assumption
behind this approach is that each measurement contributes equally to the
information matrix in terms of information content, which is not necessarily
true, but which provides a good starting guess within the real-time constraints.
The dispersion matrix, D = M?1, is then used to continue the recursive
parameter updates for the child cell, as in Eqs. (3.7 ?3.9). An additional cell
initialization proportion factor may also be used to further reduce the reliance
on parent model information if the child cells are expected to differ significantly
from the parent models. Lastly, the stored measurements associated with the
unacceptable residuals from each side of the split are incorporated into the
respective child models so that they are adjusted instantaneously to include
the data that were deemed not fit well by the parent model. Following the
child cell initialization, a specified number of new unrestricted measurements
are also automatically allowed into the model to allow it to adjust to the new
region it is defined over, before residuals are further characterized. The child
cell?s residual threshold can be initialized as well, based on the RMS of the
noise estimates of the response variable data recorded in the bins belonging to
102
the cell.
3.4 Global weighting and combination of local models
Up until this point in the SPLITR modeling technique, the parameter estima-
tion and cell structure determination processes within the LMN architecture have
been decoupled from the model weighting process, and instead have focused on the
local cells and preserving their accuracy and interpretability. The global nonlin-
ear model, which is the output of the LMN, is then a weighted combination of the
piecewise local models multiplied by the validity functions. In this work, the validity
functions were chosen as normalized Gaussian weighting functions that are scaled
based on the cell width, as discussed in Section 2.4.1. The weighting function shown
in Eq. (2.12), as well as the associated scaling factor, was used in this work. The
global model output over M cells for the ith point is then given in Eq. (3.22), where
the global model output is expressed as a weighted combination of the local model
outputs, as discussed in Section 2.4.2.
?M ?M
y?(i) = y?k(i)wk(i) = x(i)??k(i)wk(i) (3.22)
k=1 k=1
Note that since the validity function is not directly incorporated into the re-
gression and is simply overlaid on the local models, the resulting weighted nonlinear
model is no longer optimal in the least-squares sense. The role of the validity func-
tions, however, is primarily to ensure smooth transitions over the cell boundaries
without strongly impacting the parameter estimates in the individual cells.
103
3.5 SPLITR user specifications and inputs
Chapter 3 so far has described in detail the SPLITR algorithm, along with
several user-specified parameters that can be customized for a particular application
or data set. This is intended to allow the user to incorporate any physical or other
insight to tailor the method to provide the best results. For systems with little to
no prior insight, these parameters can still be specified more generally. Some of the
basic user specifications include the EVs associated with each response variable, a
PV pool to be considered for partitioning each response variable, and the expected
minimum and maximum value for each PV for normalization. Consistent across
many modeling approaches, it is important that any user insight into the system
to be modeled is used to inform these and other specifications. This section will
summarize the particular SPLITR input parameters and offer guidance for how to
designate them. The relative importance and impact of each parameter will also be
highlighted to point out those that impact the results more strongly.
A list of the user-specified parameters for the SPLITR algorithm is given in
Table 3.1, along with the baseline specifications used throughout this work, where
applicable, and each will be described below.
104
Model Complexity
Minimum cell width ?
Maximum # of cell bins 10
Maximum # of allowable cells Inf
Residual Threshold
High-pass filter order 4
High-pass filter cutoff frequency 3 Hz
Residual threshold factor, ?? 2 or 4
Filter window 100 points
Bin Status
Standard deviation factor, ?? 0.75
# of standard deviations to normalize, ??,norm 1
Total bin severity threshold 2
Minimum # of points in bin for failure 20
Unrestricted Residuals
# of unrestricted points before initial model 250
# of unrestricted points following a split 150
Update Rate
Parameter update rate 50 Hz
Cell splitting update rate 5 Hz
Global Model
Smoothness factor, ?s 1
Model Information
Forgetting factor, ?ff 0.995
Cell initialization proportion 0.2 or 1
Table 3.1: SPLITR algorithm parameters.
The first set of parameters are related to controlling the model complexity by
supplying the resolution required to partition properly, while not allowing the model
to become too complex.
? Minimum cell width: This specifies the minimum width for the cells in
each dimension, as well as informs the number of bins in which the residual
information is recorded. These minimum-width bins can then be combined up
until the maximum number of cell bins (see below). This parameter is speci-
fied individually for each of the PVs. The split locations can only occur along
105
the discretized boundaries of these minimum-width bins, so this specification
strongly impacts the eventual partitions, but is also linked to the maximum
number of cell bins, which collectively will determine the choices for split lo-
cations at a given time and range of PV data. In many cases, the smaller
the minimum cell width, the greater the flexibility for high-resolution cells.
Nonetheless, the cell widths may actually never be defined as quite so narrow
if wider cells are sufficient. On the contrary, the higher the resolution, the
more bins and more computational power and memory required to store and
process the information. Additionally, if there is moderate to substantial noise
in the data, a high resolution may be unnecessary if a better split location is
hidden by noise. If the partitions are too small, there may also not be enough
data points or a high enough SNR to capture sufficient modeling information.
If the minimum cell width is too large, then the resolution may simply not
exist to capture certain nonlinearities that lie in the data, and the local lin-
ear model assumption is stretched. Currently, the cell resolution is constant
across a given PV, whereas future work could allow for a variable resolution
to reflect expectations of nonlinearity. For example, a larger cell width could
be specified for low-? aerodynamics which tend to follow linear trends, and
a smaller width at high-? to allow the model to be split in accordance with
physical expectations and insight.
? Maximum # of cell bins: This parameter specifies the maximum number
of combined bins that are considered in each cell for splitting purposes at a
106
given point in time. Even if a relatively small minimum cell width is specified,
it is possible that early on in the model development, wider splits will be
desirable and/or preferable to allow the model to capture larger regions first
before proceeding to capture smaller nonlinear regions. It will also increase the
likelihood that there are sufficient data points contained in the larger region
through which to inform the splits. The minimum-width bins are used to
store the residual statistical information, and then the means and standard
deviations of adjacent bins are combined for analysis purposes. The examples
in this work used a maximum of 10 cell bins.
? Maximum # of allowable cells: This will specify the maximum number of
cells that are allowed to partition a single model, across all PV dimensions. It
is meant as a failsafe to ensure the data are not overfit, or to allow the user to
specify a reasonable limit. Since the SPLITR model development in real time
is not an iterative process, overfitting is less likely to occur, but can still be
a concern if e.g. the residual threshold factor is too low. An upper limit does
not need to be specified.
The next set of specifications describes how the residual threshold is computed
for each cell using an estimate of the noise content in the data.
? High-pass filter order: The high-pass filter order will designate how sharply
the Butterworth filter Bode plot drops off at the specified cutoff frequency. In
the examples shown throughout this work, a 4th order filter was specified.
107
? High-pass filter cutoff frequency: The high-pass filter cutoff frequency
is an important specification to indicate the cutoff below which the expected
modal frequencies to be modeled lie, and above which frequency content should
be considered noise or other unmodeled dynamics. A scaling factor was applied
to typical general aviation aircraft flight dynamics to estimate a 3 Hz cutoff
frequency to use in this work, which is outside of the expected range of scaled
rigid-body dynamics that are modeled in the examples in this work. For other
test data, estimates can be performed using sample data to determine the
cutoff frequency. For the simulated test cases in this work, a 3 Hz cutoff
frequency was used.
? Residual threshold factor, ??: The residual threshold factor can be consid-
ered the most impactful SPLITR input parameter that influences the splitting
results, and ultimately the entire model. This factor multiplies the noise esti-
mate in each cell, and specifies how many standard deviations beyond the noise
levels the splitting threshold should lie. It determines how easily a new split
could be made, or how ?trigger-friendly? the method is to allowing further
splits. If ?? is too low, then the residuals will too easily be characterized as
unacceptable, which can result in over-splitting. If ?? is too high, then splits
will be very difficult to obtain, and will require exceptionally large residual
magnitudes. For the basic simulated test data discussed in the next chap-
ter with relatively low magnitude Gaussian white noise added to the data, a
smaller residual factor of 2 was sufficient. However, for experimental flight
108
test data that include other unmodeled dynamics and noise content, distur-
bances, and other effects, a higher factor of 4 was used. If any test data are
available in advance, they can be used to help inform the specification of this
factor. For cases with no prior insight, these factors of 2 and 4 can be guiding
estimates as a starting point. The risks of too many or too few splits may also
be taken into account. If model simplicity is desirable, then fewer splits may
be preferable. In general, additional splits should not deteriorate the model
fit when there is a high split initialization proportion, in which case there may
just be redundant cells. Future work can investigate adjusting this important
parameter automatically.
? Filter window: An RMS of the high-pass filtered modeling data is used to
inform the residual threshold over a specified window of data points. If that
window is too wide, then it will require a significant amount of new data in a
given cell to adjust the noise estimates of that cell. If the window is too small,
then the noise estimate will not converge well. Prior data can be tested for
noise convergence, but this work used a 2 second (100 data points) window
for cases with variable noise content. If the noise is expected (or known) to be
constant, then a much larger allowable window can be used for a more stable
noise estimate.
The pass/fail status of each bin is used to determine if and where to split, and
the following parameters influence the bin status.
? Standard deviation factor, ??: This factor arises in Eq. (3.20) to determine
109
how far the mean of both types of residuals needs to be drawn beyond the
acceptable mean for a bin to fail. If it is specified as too low, then bins will
fail and be conducive to splitting too easily, while if it is too high, bins will
not fail easily enough and splits will be inhibited. In this work, ?? = 0.75 was
used.
? Number of standard deviations to normalize, ??,norm: This factor
in Eq. (3.21) is used to determine the degree to which a bin fails, for the
purpose of comparing the relative failure with other groups of failed bins to
choose the group that is most relevant to inform the split. This is considered
the normalized severity that describes how many standard deviations of the
acceptable residuals the mean of both types of residuals is pulled beyond the
mean of the acceptable residuals. In this work, ??,norm = 1 was used.
? Total bin severity threshold: To choose the group of failed bins that will
be used to inform the split, the grouped severity must surpass a minimum
bin severity threshold. Since the bin severity is saturated at 1 to account
for cells that have only unacceptable residuals, this specification extends to
determining the minimum number of grouped failed bins to split. In this work,
this was set to 2 which requires at least 2 maximally failed bins, or otherwise
greater than 2 failed bins.
? Minimum # of points in bin for failure: This specifies the minimum total
number of points in a bin (regardless of residual characterization) for a bin to
be considered failed. This will ensure that a bin with only few unacceptable
110
residuals is not used to inform a split, and that minimum data are obtained
to make a determination on the bin status. In this work, this was set to 20
points. If this is too low, a bin may fail too quickly with too little information,
whereas if this is too high, then it will have to wait for a large number of data
points to inform a split.
Unrestricted residuals are an important part of the SPLITR logic to allow the
local models to adjust prior to residual characterization that could lead to splitting.
The associated user inputs are summarized below.
? Number of unrestricted points before initial model: This specifies how
many points are allowed into the model as unrestricted to develop the initial
global model with 1 cell, before points can be characterized. In this work, this
was set to 250 points.
? Number of unrestricted points following a split: This specifies how
many unrestricted points are allowed into the child cell models following a
split before residuals are characterized as acceptable or unacceptable. If this
is too low, then the model may not be given enough new data to adjust to the
new region it is defined over, and further (unnecessary) splitting may result
too quickly. If it is too large, then the splitting will take place too slowly. In
this work, this was set to 150 points.
The SPLITR model update rates can vary between the parameter estimates
and the split decisions, and these rates are described below.
111
? Parameter update rate: The parameter estimates in the associated local
model are updated as each new data point is received. This update rate was
set at 50 Hz for this work, as that is the update rate of the onboard flight
control computer used for the experimental flight data.
? Cell splitting update rate: This specifies how often an additional split
is considered. Depending on the dynamics of the system, it is usually not
necessary to split the cells to reflect changing nonlinear dynamics as quickly
as the parameter update rate. Additionally, there is a computational load
associated with considering cell splits, so it is recommended to limit these
checks. This specification should be set to a frequency near the maximum
expected range of modal frequencies to be modeled to ensure that splitting
does not occur slower than the dynamics of the system may change. In this
work, it was set to 5 Hz, but could be much slower to mitigate computational
requirements for the examples shown.
The global model is constructed as a weighted combination of the local models,
and the smoothness applied can be varied.
? Smoothness factor, ?s: This determines the width of the Gaussian weight-
ing functions, as discussed in Section 2.4.1. A large smoothness factor can
be used for a system in which the cells have a significant amount of influence
on each other, whereas a small factor is used for a system where the localized
models in each cell should remain largely local, and the weighting functions are
primarily to smooth through the cell boundaries. The individual cell Gaussian
112
standard deviations are also a function of cell width. This factor is set to 1 by
default.
The final set of user inputs describe the balance required in remembering
vs. forgetting information as the model is developed in real time.
? Forgetting factor, ?ff : As discussed in Section 3.3, there is a constant
tradeoff between passing along too much parent information, or effectively too
much certainty that is hard to overcome with new data, versus not enough in-
formation and essentially throwing away past data. If ?ff is too high (i.e. close
to 1), then it will require a substantial amount of new data to overcome the
parent cell?s information matrix, and further unnecessary splits may occur as
a result. This may be preferable though if the system is expected to be mostly
linear, and the child model will not differ much from the parent. However, for
a highly nonlinear system, if the child cells are expected to vary significantly
from the parent, then a forgetting factor may expedite the convergence by
?forgetting? more of the parent model?s information. This forgetting factor
can be applied only while the unacceptable residuals are incorporated into the
child cells during the splits to expedite the convergence, or it can be applied
continuously to allow the models to adjust as new data are received. Note
that if a continuous forgetting factor is too low, splits may not occur at all
because there will effectively be a single model that rapidly changes in time
to adapt to the current conditions. In this work, a constant forgetting factor
of 0.995 was used.
113
? Cell initialization proportion: This factor is an additional proportion used
to multiply the parent information matrix to initialize the child cell informa-
tion matrix. Note that this matrix is already proportioned based on the num-
ber of parent data points belonging to each child cell. To pass along more
information, this can be set to 1. To enable the child cells to adjust more
quickly, it can be specified as less than 1. In this work, a value of 0.2 was
used for a piecewise linear test case, while a value of 1 was used for all other
examples.
The preceding discussion on SPLITR user inputs, as well as the recommended
initial settings and specifications, was based upon the simulation testing and sensi-
tivity investigations discussed in Chapter 4 and flight data examples in Chapter 5.
Note that several inputs, particularly the number of minimum points in a bin for
failure, the number of unrestricted points before the initial model, and the number
of unrestricted points following a split, do not directly reflect the performance or
convergence properties within each bin or cell. Future work can aim to automate
these specifications (and others) based on further analysis and statistical metrics.
The SPLITR user inputs summarized in this section contain many heuristic
specifications that may be tuned with additional insight into a particular application
and data set to improve the results, and future work can look at automating many of
them. However, the examples shown throughout the next two chapters demonstrate
that most of these parameters do not need to be modified at all from the baseline
specifications to use the SPLITR algorithm well. Additionally, small changes in
114
many of them only have a minor impact on the modeling results. For the parameters
that play a larger role in driving the model development, these explanations can
provide insight for initial values which can be modified if necessary once data have
been obtained.
3.6 Summary
This chapter presented a novel approach to automated global nonlinear mod-
eling using local model networks in real time. Although this method can be gen-
erally applied to any data set, this research was conducted within the scope of
aerodynamic modeling under the NASA L2F concept, which strongly impacted the
technique development. In this approach, local linear models were used with a con-
stant user-specified local model structure, but these can be generalized in future
work to allow local nonlinear models and/or automated local model structure de-
termination. Additionally, the splitting logic developed in this chapter was limited
to adding additional cells based on a user-specified maximum resolution, while fu-
ture work can consider model ?pruning,? or effectively removing past splits that are
deemed unnecessary.
In this work, recursive equation-error least-squares parameter estimation was
used to update the local models in each cell using data in the time domain. It is
recursive to allow the local models to be updated with new data as they are received
in real time. Equation-error least squares is a straightforward non-iterative modeling
method that is compatible with recursive updates. Time domain data were used so
115
that the data in each cell do not need to be contiguous in time for each local model.
The cell structure determination process was uniquely developed for the chal-
lenges associated with real-time onboard cell adaptation. A residual analysis proce-
dure was described as a way to determine split locations based on residual charac-
terization in a set of bins across the range of the PVs. Residuals are characterized
as acceptable to indicate that they are fit well by the associated current local model,
unacceptable to indicate that they are not captured well by the current local model
and are not attributed to noise, or unrestricted to allow the local model to develop.
The residual characterization takes into account the noise content in the data in
each cell to adjust the expectations of model fit. Finally, the SPLITR algorithm
user inputs described throughout this chapter are summarized and explanations of
the impact of each input are offered in a concise overview of the algorithm specifi-
cations.
The SPLITR method developed in this chapter will be further explored and
results will be shown using simulated test data in Chapter 4 and flight data in
Chapter 5.
116
Chapter 4
SPLITR Case Studies and Sensitivity
Investigation with Simulated Test Data
SPLITR was introduced in Chapter 3 as a novel real-time automated global
nonlinear modeling approach that was motivated by aerodynamic modeling in the
context of the NASA L2F concept. This chapter will test the SPLITR approach and
evaluate the results using simulated test data from simple specified functions before
moving on to application-based test data in Chapter 5. This allows the SPLITR
method to be presented, visualized, and understood better with known functions
before expanding the problem complexity. The sensitivity of SPLITR modeling
results to some of the user specifications discussed at the end of Chapter 3 will also
be investigated in this chapter.
There are several descriptive figures shown throughout Chapters 4 and 5 that
aim to clarify how the SPLITR method operates and to understand and visualize
the results. They will be explained in detail only for the first example through
which they are shown. Early test cases will also describe the entire model develop-
ment process in detail, while subsequent examples will serve as demonstrations that
highlight only particular features of the results, and/or explore various aspects such
as sensitivity to user specifications. Unless otherwise specified, all of the default
specifications described in Section 3.5 were used throughout this chapter.
4.1 1D piecewise linear data
Because the SPLITR method operates by determining local linear regions
through which to partition a nonlinear function, the first simplified test case looks
at a piecewise continuous linear model that is defined in Eq. (4.1) for 0 ? x ? 1.
?
??????????10x 0 ? x < 0.2???
??2x 0.2 ? x < 0.6y = ????????
(4.1)
?10x 0.6 ? x < 0.8????2x 0.8 ? x ? 1
This function will be used to demonstrate the SPLITR model development in detail
through a baseline test case, as well as explore the sensitivity of the results to the
minimum cell width parameter and to additive Gaussian white noise in the modeling
data. In each example, two passes through the input space were simulated, with
2500 evenly spaced data points from x = 0 to x = 1, and then an additional 2500
points from x = 1 to x = 0. A 50 Hz sample rate was assumed, with 100 seconds of
test data. The child cell initialization proportion was set to 0.2 since the child cells
are expected to vary significantly from the parent models. The minimum cell width
118
was specified as 0.05.
4.1.1 Baseline case
In this example, the piecewise linear function in Eq. (4.1) was used to demon-
strate how the SPLITR algorithm operates based on the detailed explanation given
in Chapter 3.
The time history of the modeling data generated from this function is shown
in Fig. 4.1. Gaussian white noise was added to the output data with a standard
deviation of 0.075 to obtain an SNR of 20. As shown in Fig. 4.1, the splits occur
in time during the first pass of the input space, shortly after each breakpoint in
the piecewise linear data when the algorithm recognizes nonlinear behavior outside
of the linear representation of the previous segment. The second pass of the input
space repeats the full range of x-values to allow the parameters in each cell to adjust
with new data following the splits.
The evolution of the cell structure in Fig. 4.2 shows how the cells are succes-
sively partitioned and new ones are added across the range of x during each split.
Initially, there is one cell defined across the entire range of x. The first split par-
titions the piecewise linear data in 0 ? x < 0.2 from the rest of the input space,
resulting in cells 2 and 3. Accordingly, the second split partitions the second linear
segment defined over 0.2 ? x < 0.6 from the rest of cell 3, resulting in a structure
of cells 2, 4, and 5. Finally, the last split partitions the third segment defined over
0.6 ? x < 0.8, resulting in final cells 2, 4, 6, and 7, with the cell boundaries placed
119
at the piecewise linear breakpoints. This cell numbering scheme is used throughout
the results shown in this work, and the cell structure evolution depictions can be
used as a source for other figures and discussions that reference the cell numbers.
The final cells are shown in Fig. 4.3 across the range of x, along with the
modeling data and the piecewise linear models. The bin locations, and correspond-
ingly all possible split locations, are shown as light gray dashed lines across x, and
indicate that the ?correct? split locations were chosen. The local models in each cell
are shown to represent the data in that region well, with the residuals confirming a
good model fit.
Figure 4.1: Time history of response variable for piecewise linear baseline case.
120
Figure 4.2: Cell structure evolution for piecewise linear baseline case.
Figure 4.3: Local model fit and residuals for piecewise linear baseline case.
The split logic and decision making that resulted in the final cell structure and
model parameters are based on the residual characterization procedure discussed
121
in Chapter 3. A closer look through a depiction of the cell structure decision-
making process will clarify how the cells evolve over time based on the available
data. Figure 4.4 displays the binned local residuals just prior to each split, i.e. it
shows a characterization of the local model fit quality that triggers a split. The last
plot shows the final cell structure once all of the data are obtained. The residuals
are plotted across the complete range of x in each plot, and the thick vertical dashed
red lines mark the boundaries of the cells. The light gray lines indicate the minimum
cell width bins across x. Although these bins are constant across all cell structures,
the cell to which each of them belongs may change. The bin boundaries are outlined
in blue to indicate active grouped bins within the range of new data in a given cell,
and are also assigned an active bin number shown at the center and top of each
bin. The numbers for the bins that are considered failed at the time of the split are
also colored red. The active range of bins that contain new data in each cell is also
outlined on each side with the bin boundaries shown in red. Note that since the
maximum number of cell bins is 10, minimum-width bins may be combined to form
the active bins (although that is not necessary in this case).
The acceptable residuals are shown as black dots, the unacceptable residuals
as red ??s, and the unrestricted residuals as cyan circles. The mean and standard
deviation of the residuals in each bin are shown in green, red, and blue, for the ac-
ceptable residuals, the unacceptable ones, and both of the two together, respectively.
Only residual data obtained in each cell following a prior split are shown, such that
data from the parent cell are excluded. The exception to this is for the stored parent
unacceptable residuals that are incorporated into each child cell following a split.
122
Recall that those stored and inherited data points are considered unrestricted, and
are not used to define active bins.
Beginning with the first linear model developed for 0 ? x < 0.2, notice that
although there are several residuals that are characterized as unacceptable, the
mean of both types of residuals is still small, and so the bins pass the status check.
Beyond the breakpoint at x = 0.2, however, the unacceptable residuals outweigh the
acceptable ones as the linear model with a constant slope can no longer represent all
of the data adequately. Bins 3?4 are considered failed, so the first split is performed
at the left boundary of bin 3. Correspondingly, the second split partitions the data
in 0.2 ? x < 0.6 with the split location at the left boundary of bin 8 in cell 3.
Similarly, the last split occurs at the left boundary of bin 4 in cell 5 to obtain the
four final piecewise linear models across the range of x. Most of the residuals shown
in the last plot for the final cell structure were obtained during the second pass of the
data, and they exhibit a good fit in each cell. There are some unacceptable residuals
shown throughout the cells, but there are no failed bins since most of the residuals
are acceptable. Many of the unrestricted residuals in cyan are designated as such
due to the minimum number of allowable points following a split. In this simplified
example, 150 points may be more than necessary, but it was used consistently across
all examples.
123
Figure 4.4: Binned local residual characterization for piecewise linear baseline case.
The residual characterization is used to inform the cell splits, and this pro-
124
cedure can be better understood by describing how the residual threshold is com-
puted, since that is what determines the cutoff between acceptable and unacceptable
residuals. The residual threshold distinguishes between residuals associated with
acceptable modeling error due to noise, and those that are indicative of behavior
inconsistent with the current local model. The modeling data have constant mag-
nitude Gaussian white noise across the range of x as shown in the high-pass filtered
modeling data in Fig. 4.5(a). The RMS of the noise in each cell is shown with a
black dashed line, with a global RMS of 0.075. A moving window of the RMS of
the noise in each cell, multiplied by the residual threshold factor of 2, is shown in
Fig. 4.5(b). Given the constant magnitude noise estimate, the residual threshold
across all cells is shown to be around 0.15. Variations in the residual thresholds are
due to the relatively small noise RMS window of 2 seconds. The residual threshold
is a time-varying value in each cell that is dependent on the noise estimates for that
individual cell. Throughout time, if a residual point lies above the current residual
threshold for the cell that it belongs to, it is characterized as unacceptable; other-
wise, it is considered acceptable. Since each measurement belongs only to a single
cell, only the threshold for at most one cell is updated at each point in time, with
these points connected with a dashed line.
125
(a) High-pass filtered modeling data (b) Residual threshold
Figure 4.5: Residual threshold characteristics for piecewise linear baseline case.
Finally, once the cell structure decision making is understood, the final models
are presented. The time histories of bias y0 and slope yx parameters across all cells
are shown in Fig. 4.6. Notice the sharp jumps following the splits when the child cell
parameters are initialized with high uncertainty on the parent cell information, as
well as by incorporating the parent cell?s unacceptable residuals immediately. The
initialized uncertainty in the dispersion matrix is also responsible for the relatively
quick convergence of each child cell model?s parameters. Since each data point only
belongs to a single cell at each point in time, a dashed line is used to extend the
estimates from other cells while they are not being updated. The final parameter
estimates are shown in Fig. 4.7 for the local linear models, along with 95% confidence
intervals. The true parameter values from Eq. (4.1) are also shown, indicating
acceptable fits within the uncertainty.
126
Figure 4.6: Time history of parameter estimates across all cells for piecewise linear baseline
case.
Figure 4.7: Local model parameter estimates across all cells for piecewise linear baseline
case.
4.1.2 Minimum cell width
This example demonstrates how the minimum cell width specification can
impact the resulting cell split locations. The piecewise linear function was used to
run 25 simulation test cases with a minimum cell width of 0.05, and the resulting
split locations and model fits are shown for all cases in Fig. 4.8. All possible split
127
locations are shown as dashed gray vertical lines. Every test case was successful in
determining the correct split locations at x = 0.2, x = 0.6, and x = 0.8, as well as
providing good local model fits to the data.
Figure 4.8: Local model fit for piecewise linear model with 25 simulation cases and a
minimum cell width of 0.05.
When the minimum cell width was reduced to 0.025, the resulting cell struc-
tures are shown in Fig. 4.9 for 25 simulations, with the chosen splits indicated, along
with how many runs resulted in each split location. Although the local model fits are
still acceptable in each cell, the increased allowable split resolution makes it more
difficult to distinguish the true split locations through the noise content in the data.
If split location precision is highly prioritized, it is important that the minimum cell
width is wide enough so that the discrete locations are not hidden by noise. It is
also important that the proposed locations lie at those true split points, which may
not be known in advance. Specifying a reasonably low maximum number of cell
bins can help ensure that initially wider split locations are considered. However,
this is an extreme example of a piecewise linear function, whereas for more practical
128
higher order data sets, the ?true? split location is less straightforward.
Figure 4.9: Local model fit for piecewise linear model with 25 simulation cases and a
minimum cell width of 0.025.
4.1.3 Modeling data noise sensitivity
As demonstrated in the previous example, the noise content in the data may
have a small influence on the resulting split locations, but the SPLITR method
results are not overly susceptible to overfitting the noise in the data, considering
that a sufficient estimate of the noise is obtained. This is because the residual
threshold, which determines the splits, is based on an estimate of the noise content
in the data. When the SNR for the modeling data was reduced to 4, the resulting
split locations and piecewise linear model fits are shown in Fig. 4.10(a), with the
global smoothed model in Fig. 4.10(b). Although there is a significant increase in
noise magnitude compared to the baseline case, and the residuals are accordingly
much larger, there are still only 4 cells placed in reasonable locations. In this case,
the global smoothed model plays a role in outputting a more useful final model.
129
The high-pass filtered modeling data are shown in Fig. 4.11(a) with an RMS of
0.386, and the residual threshold over time is shown in Fig. 4.11(b) with ?? = 1.5.
Compared to the baseline cases of noise and residual threshold shown in Fig. 4.5(a)
and Fig. 4.5(b), respectively, notice that it was more difficult to trigger a split in
this case given the low SNR, and correspondingly large residual threshold. This is
an important characteristic of the SPLITR method to prevent excessive splits to
overfit data with significant noise.
(a) Local model fits (b) Global smoothed model fit
Figure 4.10: Local and global model fits for piecewise linear data noise sensitivity test
case with SNR = 4.
130
(a) High-pass filtered modeling data (b) Residual threshold
Figure 4.11: Residual threshold characteristics for piecewise linear data noise sensitivity
test case with SNR = 4.
4.2 1D quadratic data
The next set of examples looks at 1D quadratic data to demonstrate how
higher order functions may be modeled as a collection of linear components, while
relying on the global smoothing to tie the local models together. The quadratic test
data are generated through the function shown in Eq. (4.2) with ?5 ? x ? 5, and
a minimum cell width of 0.25.
y = 0.5x2 (4.2)
This section shows test data results for a baseline test case, as well as to
demonstrate the sensitivity of the results to the filter window, the smoothness fac-
tor, the residual threshold factor, varying noise magnitude, varying noise through
multiple simulations, and model adaptation.
In each example, 2 passes through the input space were completed, with 2500
131
evenly spaced points during each pass. In contrast to the baseline piecewise linear
test case in which each piecewise segment changes drastically from the previous
one, a local linear model would be constantly varying for quadratic data, with the
child cell models not significantly different from the parent models. Therefore, this
example used a cell initialization proportion of 1, as discussed in Section 3.5.
4.2.1 Baseline case
This baseline quadratic test case shows how the SPLITR method may be used
to model higher order test data. Figure 4.12 displays the time history of the modeling
data. Gaussian white noise with an RMS of 0.21 was added to the output data for
an SNR of 18. Similar to the piecewise linear case, the splits occur during the first
pass of the x-data, while the second pass is devoted to updating the parameters.
The cell structure development is shown in Fig. 4.13 with relatively evenly
spaced partitions to characterize this higher order function with linear segments.
The final split locations and local linear models are presented in Fig. 4.14(a), with
the global weighted model in Fig. 4.14(b). The Gaussian validity functions that
provide the global weighting are shown in Fig. 4.15(a), with the normalized validity
functions in Fig. 4.15(b). Recall that the Gaussian weighting functions are centered
in each cell, with a standard deviation that is proportional to the cell width. As
a result, the cells with smaller widths have reduced validity as indicated by the
magnitude of w and the more limited reach of those functions, whereas the larger
cells have a stronger influence on the global model output. The high-pass filtered
132
output data in Fig. 4.16(a) show the constant magnitude Gaussian white noise in
the output data. This is used to inform the residual threshold across the cells, as
presented in Fig. 4.16(b) with ?? = 2.
Figure 4.12: Time history of response variable for quadratic baseline case.
Figure 4.13: Cell structure evolution for quadratic baseline case.
133
(a) Local model fits (b) Global smoothed model fit
Figure 4.14: Local and global model fits for quadratic baseline case.
(a) Validity functions (b) Normalized validity functions
Figure 4.15: Validity functions for quadratic baseline case.
134
(a) High-pass filtered modeling data (b) Residual threshold
Figure 4.16: Residual threshold characteristics for quadratic baseline case.
The parameter estimates for the final cells are shown in Fig. 4.17, along with
the true estimates obtained from analytical derivatives. Each local model parameter
matches the true value within the estimated uncertainty, except for the parameters
in cell 15, where the true values lie just outside the 95% confidence intervals. While
the forgetting factor is useful to allow the child cells to more rapidly adjust from
the parent models, it may also cause information to be unnecessarily lost once the
initial adjustment has occurred.
The time history evolution of the parameters is also shown in Fig. 4.18. It is
apparent that, in contrast to the piecewise linear example that exhibited very sharp
transitions in the parameters between cell models, for this higher order data the
child cell models require less of an adjustment from the parent models.
Finally, Fig. 4.19 shows the global parameter estimates for each of the pa-
rameters that are weighted by the normalized Gaussian validity functions. Since all
of the local models have the same model structure, the effective global parameter
estimates can be visualized individually as the weighted combination of each of the
135
local parameter estimates. While the true parameter estimates for y0 and yx across
x should be parabolic and a straight line, respectively, the weighting introduces
oscillatory behavior between cells. This is also apparent in the global weighted
model output in Fig. 4.14(b), which is obtained as a weighted combination of the
local model outputs. This characteristic could be improved with a more appropriate
weighting that is customized for a given data set and cell structure.
Figure 4.17: Local model parameter estimates across all cells for quadratic baseline case.
Figure 4.18: Time history of parameter estimates across all cells for quadratic baseline
case.
136
Figure 4.19: Local and global parameter estimates for quadratic baseline case.
4.2.2 Filter window
This example shows the effect of the filter window specification. In the baseline
quadratic test case, the filter window was set to 2 seconds, and as a result, the
residual threshold factor shown in Fig. 4.16(b) fluctuates significantly about the
true point. If instead the filter window was set to infinite time, the resulting cell
structure development is shown in Fig. 4.20, with the model fits in Fig. 4.21, and
with the results not differing significantly from the baseline case. The high-pass
filtered noise data are displayed in Fig. 4.22(a), and the residual threshold is shown
in Fig. 4.22(b). This example exhibits a much more stable estimate of where the
residual threshold lies across all cells, given the constant magnitude Gaussian white
noise added to the data. If the noise characteristics are expected to be constant
across the entire data set, a larger filter window will ensure a more stable residual
threshold, and therefore a more consistent characterization of residuals across the
cells.
137
Figure 4.20: Cell structure evolution for quadratic model with filter window variation.
(a) Local model fits (b) Global smoothed model
Figure 4.21: Local and global model fits for quadratic model with filter window variation.
138
(a) High-pass filtered modeling data (b) Residual threshold
Figure 4.22: Residual threshold characteristics for quadratic model with filter window
variation.
4.2.3 Smoothness factor
This example shows the impact of the global weighting smoothness factor
?s, which influences how the local models are combined. For the baseline case in
Fig. 4.14(b) with ?s = 1, notice that there are apparent waves in the global model
fit, particularly in the regions crossing over the cell boundaries between the local
models.
If ?s = 0.25 instead for the same baseline test data, the global smoothed
model is shown in Fig. 4.23 with even more sharp variations in slope across the
cell boundaries. The validity functions in Fig. 4.24 are also much more narrow
compared to the baseline validity functions in Fig. 4.15, and appear almost as boxcar
functions. This could be used for a case in which the model is expected to exhibit
highly localized behavior and where a smooth transition is less critical, such as in
the piecewise linear case in Section 4.1.1.
139
When instead ?s = 1.5, the resulting global model is shown in Fig. 4.25. While
the global output is much smoother in this case, there is added error particularly
around x = 0 where the local models far outside this region are influencing the
global model output at that point. This is confirmed in Fig. 4.26 which shows the
significant overlap in validity functions across the cells, with the inner cells affected
most strongly. This characteristic could be highly desirable if, for example, smooth
derivatives of the model are required for purposes such as control law design. The
unique composition of the SPLITR model as a combination of smaller parts also
indicates that the best representation of the data may not be a model with the
minimum error if there are other considerations such as smoothness. There may be
tradeoffs between obtaining a good local model fit and a continuous global smoothed
model. One advantage of the SPLITR method in this context is that the local models
are estimated independently of the adjacent cells and of the validity functions, which
offers flexibility in designing and customizing the weighting functions for a specific
application.
140
Figure 4.23: Global smoothed model fit for quadratic test case with smoothness factor
?s = 0.25.
(a) Validity functions (b) Normalized validity functions
Figure 4.24: Validity functions for quadratic test case with smoothness factor ?s = 0.25.
141
Figure 4.25: Global smoothed model fit for quadratic test case with smoothness factor
?s = 1.5.
(a) Validity functions (b) Normalized validity functions
Figure 4.26: Validity functions for quadratic test case with smoothness factor ?s = 1.5.
4.2.4 Residual threshold factor
The next example demonstrates the impact of changing the residual threshold
factor, ?? . For quadratic test data generated using the same specifications as in
142
the baseline case, ?? was raised from 2 to 3, and the resulting local and global
model fits are shown in Fig. 4.27. Only 2 splits were performed, resulting in an
altogether poor model fit. Despite the RMS of the noise in this case in Fig. 4.28(a)
exhibiting similar character to the baseline quadratic test case shown in Fig. 4.16(a),
the residual threshold for this case in Fig. 4.28(b) is larger than that shown for the
baseline case in Fig. 4.16(b). As a result, the residual characterization in Fig. 4.29
shows residuals characterized as acceptable for a wider range of nonlinear behavior,
thus resulting in fewer splits allowed.
The residual threshold factor can be considered the most impactful SPLITR
algorithm user specification due to its direct influence on the residual threshold,
and therefore on the split determinations. If ?? is too small, then excessive splits
may result, there may not be sufficient data in each cell to accurately estimate the
parameters, and the resulting global smoothed model may be overly choppy. If ??
is too large, then necessary splits may not be recognized. For cases with known
Gaussian white noise such as in this chapter, ?? = 2, i.e. 2? is shown to provide a
good tradeoff between these extremes. For cases where the noise content is unknown,
and/or contains nonuniform frequency distribution and other unmodeled influencing
factors such as what is typically found in experimental data, ?? must be larger. For
the experimental flight test data discussed in the next chapter, ?? = 4 was used.
This was chosen based on studying the NASA E1 flight data, and was also applied
to flight data from another aircraft. Future work may automatically adjust ?? based
on further analysis of the noise content.
143
(a) Local model fit (b) Global smoothed model fit
Figure 4.27: Local and global model fits for quadratic model with residual threshold factor
variation.
(a) High-pass filtered modeling data (b) Residual threshold
Figure 4.28: Residual threshold characteristics for quadratic model with residual threshold
factor variation.
144
Figure 4.29: Binned local residual characterization for quadratic model with residual
threshold factor variation.
145
4.2.5 Varying noise magnitude
The next test case shows the sensitivity of the SPLITR results to varying
magnitude Gaussian white noise in the modeling data that is a function of y. This
scenario, while undesirable from a statistical standpoint, is very common in practice,
such that the noise magnitude is a function of one or more variables. In aerodynamic
modeling, for example, there may be increased noise at higher angles of attack where
there are unsteady effects and additional structural vibrations. Since the noise
content estimates are customized for each cell in the SPLITR model development,
variations in noise magnitude can be accounted for in a localized manner.
In this example, Gaussian white noise was added to the data with the magni-
tude proportional to y. The time history of modeling data is shown in Fig. 4.30, with
2 passes through the x-data. Note that the splits only occur during the periods of
data with reduced noise. The cell structure development is shown in Fig. 4.31, and
the local and global model fits in Fig. 4.32. Although there is a significant increase
in noise at larger y values, there are in fact fewer splits in those regions as in the
less noisy area of y < 4 because the residual threshold was adjusted accordingly in
an automated way. The plot of the high-pass filtered modeling data in Fig. 4.33(a)
shows the proportional noise content, and Fig. 4.33(b) shows the resulting residual
threshold that varies significantly in each cell. As a result, a residual that is char-
acterized as unacceptable in cell 6, for example, may be considered acceptable in
cell 2, resulting in fewer splits allowed in the regions with higher magnitude noise
to prevent overfitting.
146
Note also that since there is a relatively small RMS window of 2 seconds, the
estimate of noise content even in a single cell changes over time. At around t = 35
seconds in Fig. 4.30, the final split is performed and the data obtained for the next
30 seconds or so lie in cell 9. However, Fig. 4.33(a) shows that the noise magnitude
in cell 9 in fact varies significantly across the range of x. This is reflected in the
residual threshold in Fig. 4.33(b) where initially the threshold is smaller, and then
increases around t = 50 seconds to reflect the largest magnitude noise around x = 5,
and then reduces once again. A similar trend can be seen in the threshold for cells
1 and 2.
Finally, it is worth acknowledging that the global model fit in Fig. 4.32 is not
particularly good around x = 0 compared to the local model fits. This is due to
the validity functions, shown in Fig. 4.34, which cause the local models in the wider
cells 2 and 9 to influence the global model output in the narrower cells toward the
middle.
Figure 4.30: Time history of response variable for quadratic model with varying noise
magnitude.
147
Figure 4.31: Cell structure evolution for quadratic model with varying noise magnitude.
(a) Local model fit (b) Global smoothed model fit
Figure 4.32: Local and global model fits for quadratic model with varying noise magnitude.
148
(a) High-pass filtered modeling data (b) Residual threshold
Figure 4.33: Residual threshold characteristics for quadratic model with varying noise
magnitude.
(a) Validity functions (b) Normalized validity functions
Figure 4.34: Validity functions for quadratic model with varying noise magnitude.
4.2.6 Multiple simulations
The quadratic baseline case was run 25 times with constant magnitude Gaus-
sian white noise added to the data to show the reliability of the SPLITR results
across variations in the noise content of a given data set. The smoothed global
149
model fits for all cases are overlaid on all of the test data in Fig. 4.35, and show
good quadratic fits with variations in the waves seen across the cell boundaries.
Figure 4.36 shows the local slope parameter estimates for all test cases. Since there
was a relatively small minimum cell width of 0.25 and there is no ?true? breakpoint
location for fitting a straight line to a curved one, each test case placed the local
models at slightly different points based on the noise in each simulation. Nonethe-
less, the parameter estimates lie on or near the true values, and the groupings of
colors show the clustering of split locations across the different simulations.
This example demonstrates the reliability of the SPLITR modeling results
despite variations in the constant magnitude random noise added to the data. In
particular, it shows that despite the split locations being chosen at slightly different
points along the range of x, the slope estimate in each local cell was adjusted to
model the data contained in that region. While this method focuses on obtaining
sufficient local model estimates as shown in Fig. 4.36, it also demonstrates that the
global smoothed models in Fig. 4.35 are able to consistently combine the piecewise
models to obtain the quadratic shape.
150
Figure 4.35: Global model fits for multiple simulations with quadratic data.
Figure 4.36: Local parameter estimates for multiple simulations with quadratic data.
4.2.7 Model adaptation
This example shows how the SPLITR model can adapt to a function that
is modified during the model development. An advantage of the SPLITR model
is that if the system were to adapt or sustain damage that would affect only a
151
small part of the input space, then only the relevant localized models would need
to be adjusted, whereas the rest of the models could remain intact. This example,
however, investigates a significant global model change, as if the system were to
become unstable.
The initial baseline configuration is a quadratic model as given in Eq. (4.2).
Data were generated with an SNR of around 18 in this case. Two passes of the x-data
were completed from x = ?5 to x = 5 with all of the baseline user specifications,
and the resulting local and global models are shown in Fig. 4.37.
A sudden system adaptation is then simulated by generating the next set of
data from a new function as shown in Eq. (4.3).
y = 25? 0.5x2 (4.3)
The only other change required during the adaptation phase was that the
child cells are initialized with a much larger level of uncertainty to allow the models
to adapt more quickly from the no longer relevant parent cell parameter initializa-
tions. Otherwise, it would require a significant amount of new data to overcome
the certainty from the parent models. This modified specification of initializing
the dispersion matrices with high uncertainty was performed throughout the entire
simulation, not just during the adaptation.
Two additional passes of the x-data were then completed with data from the
new quadratic function. The final local and global model fits are shown in Fig. 4.38,
where the model effectively ?forgot? about the initial model and fully adapted to the
152
new function. This was accomplished through the ability to add partitions where
the existing model is not performing well. Note that splitting the model is the only
tool that the SPLITR method has to use if the existing local model is not performing
well. As a result, for cases of large adaptation such as in this example, it is difficult
to balance the tradeoffs between forgetting too much prior information and adapting
quickly to new information. For this reason, in this example the minimum cell width
was set to 1, instead of 0.25 as it was for the previous quadratic cases. This was to
ensure that the model is not overfit with excessive partitions that may occur during
the adaptation due to the poor initial model fit, and that the cells are wide enough
to obtain data with a high enough SNR. Furthermore, the initial resolution of 0.25
was to enable the model to choose the split locations from among a wider range of
possibilities, but not intending to allow a split at every 0.25 increment. Nevertheless,
when this simulation was run with a minimum width of 0.25, the results were still
reasonable.
The cell structure evolution throughout the combined data is shown in Fig. 4.39,
with the combined time history of modeling data in Fig. 4.40. Although the com-
plexity of the quadratic function did not change for the new configuration, new cells
were still added to capture the changes. If the adapted configuration was instead
the baseline, only the 4 original cells would be needed for the SPLITR model. The
additional splits as utilized in this example are meant to be a tool that can be used
to hasten the adaptation. For the final cell structure, methods such as pruning may
be appropriate in future work to effectively combine adjacent cells for which the
parameters are not sufficiently different from each other, to obtain a simpler final
153
model. The SPLITR logic would need to be further developed to more effectively
and autonomously detect a fault or model adaptation, and to adapt the modeling
accordingly.
(a) Local model fit (b) Global smoothed model fit
Figure 4.37: Local and global model fits for quadratic model with model adaptation using
the first set of data.
(a) Local model fit (b) Global smoothed model fit
Figure 4.38: Local and global model fits for quadratic model with model adaptation using
the full set of data.
154
Figure 4.39: Cell structure evolution for quadratic model with model adaptation.
Figure 4.40: Time history of response variable for quadratic model with model adaptation.
4.3 1D cubic data
The final set of 1D test cases looks at data generated from a cubic function
defined in Eq. (4.4) from x = ?5 to x = 3.
155
y = 0.25(x3 + 3x2 ? 6x? 8) (4.4)
A baseline test case will be shown for data simulated with constant magnitude
Gaussian white noise added to the data, and then a second example with varying
magnitude noise. Data were generated with 2 passes of the input space and 2500
evenly spaced points across the range of x in each pass.
4.3.1 Baseline case
For the baseline cubic test case, Gaussian white noise with an RMS of 0.277
was added to the data, resulting in an SNR of 10. The final split locations and local
and global models are shown in Fig. 4.41, with the high-pass filtered modeling data
in Fig. 4.42. Aside from several sharp points in the smoothed model that result
from the Gaussian global weighting, the SPLITR model captures this higher order
data well with piecewise local linear models by offering increased resolution around
the highly curved regions. As discussed in Section 4.2.3, additional smoothing for
this higher order case could improve the global model fit and reduce the sharpness
of the breakpoints.
156
(a) Local model fit (b) Global smoothed model fit
Figure 4.41: Local and global model fits for cubic baseline case.
Figure 4.42: High-pass filtered modeling data for cubic baseline case.
4.3.2 Varying noise magnitude
In this example, the magnitude of the Gaussian white noise added to the data
was varied linearly across x, with minimum magnitude at x = ?5 and maximum
157
magnitude at x = 3. The resulting local and global model fits are shown in Fig. 4.43,
with the high-pass filtered modeling data in Fig. 4.44. This example demonstrates
that despite the varying noise magnitude compared to the previous case, and the
fact that the residuals will be larger toward higher x in this case, the resulting split
locations and models are very similar, indicating the insensitivity of the splitting
logic to the random noise variation. If the noise magnitude was significantly larger
in a particular region, it is possible that splits may not occur there, as the residual
threshold would be modified accordingly.
(a) Local model fit (b) Global smoothed model fit
Figure 4.43: Local and global model fits for cubic test case with varying noise magnitude.
158
Figure 4.44: High-pass filtered modeling data for cubic test case with varying noise mag-
nitude.
4.4 2D test data
This section shows SPLITR modeling results for test data that is a function of
two dimensions, z = f(x, y). The SPLITR splitting logic, as discussed in Chapter 3,
stores the residual information across each specified PV dimension, and can therefore
choose both the relevant split location and dimension based on the residual evidence
contained in the bins. This will be demonstrated through three test cases presented.
4.4.1 2D quadratic data, case 1
This test case uses data generated from the function in Eq. (4.5), and looks
at a quadratic contribution from only the y dimension.
159
z = x? (y + 20)2 (4.5)
The input space was defined over ?100 ? x ? 0, and ?100 ? y ? 0. There
were 41 evenly-spaced x and y data points generated, and then a mesh was formed,
resulting in 1681 points throughout the two-dimensional input space. These points
were organized into an array to traverse the input space by holding the y-dimension
constant and traveling back and forth across the range of x-data, with the resulting
time history shown in Fig. 4.45. Given the relatively small increment of EV data, a
second pass through the input space to update the parameters was not performed,
nor necessary. Only a very small amount of Gaussian white noise was added to
the output data in this case, which resulted in a large number of splits because
the residual threshold was relatively low. Note that since the SPLITR model is
developed in real time as the data are received, the manner in which the input
space is traversed may have a significant impact on the resulting split locations, and
by extension, the model fit. The baseline residual threshold factor of 2 was used for
this case.
Although allowable PVs included both x and y, the SPLITR model resulted in
several splits across only the y-dimension, as shown in Fig. 4.46. This is consistent
with the presence of nonlinear behavior in the output data as a function of y. Figure
4.47 shows the modeling data along with the SPLITR model fit, as well as the split
locations that divide the quadratic y dependency into many cells. Finally, Fig. 4.48
shows validation data at x = ?100, as well as the SPLITR model fit and split
160
locations. With R2 ? 1, the SPLITR model captures the data extremely well for
this test case.
Figure 4.45: Time history of input data for 2D quadratic case 1.
Figure 4.46: Cell structure evolution for 2D quadratic case 1.
161
Figure 4.47: Global model fit for 2D quadratic case 1.
Figure 4.48: Validation test data and model fit for 2D quadratic case 1.
4.4.2 2D quadratic data, case 2
This second 2D test case looks at test data from the function in Eq. (4.6), with
?5 ? x ? 5, and ?5 ? y ? 5. This case looks at modeling the interaction effect
between x and y.
162
z = xy ? 10x (4.6)
Gaussian white noise was added to the output data, with standard deviation
equal to two-percent of the RMS of the response variable. The input space of 1872
data points at 0.5 increments was covered twice in both a forward and backward pass
as shown in the time history in Fig. 4.49. In contrast to the previous example, in this
case the input space was traversed in a box-like formation as shown in Fig. 4.50. The
outer box was first drawn by holding y constant and covering the x range along the
bottom of the plot, and then proceeding up the right side, across the top, and down
the left side. The remaining boxes are then filled in, beginning with the bottom left
corner.
The cell structure evolution is depicted in Fig. 4.51, which shows each cell
structure from the initial configuration through the fourth split. The final cell
structure has 3 splits in the x-dimension and 1 in the y-dimension.
The SPLITR model is pictured in Fig. 4.52, showing the partition planes,
model surface, and modeling data. Note that there are some waves in the surface
that result from the Gaussian weighting, and that the fit is poorer at the edges of
the input space. The model fit can also be more easily visualized in Fig. 4.53 which
shows the modeling data and model output, with R2 = 0.996.
163
Figure 4.49: Time history of input data for 2D quadratic case 2.
Figure 4.50: Test data for 2D quadratic case 2.
164
Figure 4.51: Cell structure evolution for 2D quadratic case 2.
Figure 4.52: Modeling data and SPLITR model fit for 2D quadratic case 2.
165
Figure 4.53: Test data and model fit for 2D quadratic case 2.
4.4.3 2D quadratic data, case 3
This final 2D quadratic test case builds a model using data generated from
the function in Eq. (4.7), with ?5 ? x ? 5, and ?5 ? y ? 5. This case looks at a
quadratic influence in the x dimension, as well as an interaction effect.
z = xy ? 0.05x2 (4.7)
Two-percent Gaussian white noise was added to the output data, similar to
the previous case. Due to the large amount of curvature in this case, the input space
of 1872 data points at 0.5 increments was covered eight times through a combination
of forward and backward passes as shown in the time history in Fig. 4.54. The input
space was traversed in the same box-like formation as in the previous example.
This example is a case where the modeling data show significant curvature,
and a piecewise linear composition of highly nonlinear data results in a large number
of splits, as shown in Fig. 4.55. The model fit using validation data spanning the full
166
range of the input space is presented in Fig. 4.56, which shows the highly nonlinear
nature of the modeling data across the y-dimension, the resulting model surface,
and the split locations. The waves in the model surface associated with the global
weighting are also apparent, as well as a relatively poor fit at the corners. Finally,
Fig. 4.57 shows the model fit across only 2 passes of the input space, with a total
R2 of 0.99. It is clear there too that the model fit is poorest at the peaks, or edges.
While axes-oblique partitioning methods might be more conducive to modeling
highly nonlinear functions, this example demonstrates that a large number of axes-
orthogonal partitions can still provide a reasonable model fit.
Figure 4.54: Time history of input data for 2D quadratic case 3.
167
Figure 4.55: Cell structure evolution for 2D quadratic case 3.
168
Figure 4.56: Global model fit with validation data for 2D quadratic case 3.
Figure 4.57: Validation test data and model fit for 2D quadratic case 3.
169
4.5 Summary
While Chapter 3 presented the theory behind the novel SPLITR modeling
approach, Chapter 4 provided an in-depth study and illustration through a series of
simple examples that both described the algorithm logic, and explored the sensitivity
to specified parameters or data characteristics.
The 1D piecewise linear examples offered straightforward data sets to describe
in detail the model development process. The minimum cell width parameter was
shown to impact the precise allowable split locations. In practice, this means that
the user-specified minimum cell width must offer sufficient resolution to capture
discrete changes in the model at particular locations. For the piecewise linear test
data, this could be extremely important. However, most systems to be modeled
exhibit higher order behavior, and the ensuing examples showed that the precise
location of the splits was less important, as there are many possible breakpoints
that could still produce a sufficient global model. The SPLITR method robustness
to high-magnitude Gaussian white noise was also demonstrated.
The 1D quadratic test cases showed the SPLITR modeling approach effectively
applied to higher order data. The filter window size was shown to play a role in
the convergence of the residual threshold, but not in the resulting model fit. The
smoothness parameter ?s influenced the overlap between the local models, while
the residual threshold factor ?? directly impacted the residual characterization, and
therefore the tendency to split. The SPLITR method?s robustness to varying mag-
nitude Gaussian white noise was also demonstrated. Consistent acceptable global
170
model fits were shown for multiple simulations with random noise. A final 1D
quadratic test case demonstrated how the SPLITR model can adapt to a modified
function, which can be useful for an adaptive system with changing behavior.
The 1D cubic data examples were used to show results for a function with
increased curvature compared to the quadratic cases, and the SPLITR models were
able to capture the data with similar results when both constant magnitude and
varying magnitude Gaussian white noise were added.
Finally, results were shown with 2D quadratic test data to demonstrate how
the SPLITR method can be used to correctly identify the single dimension with non-
linear dependencies, as well as offer 2D splits when the nonlinear function requires it.
The first test case showed 1D splits in quadratic data where the nonlinear dependen-
cies were restricted to one dimension. The second case showed a more complex 2D
example using data generated from a function with an interaction term and a linear
term, and the final example showed data from a function with an interaction term
and a quadratic term. These cases resulted in a large number of axes-orthogonal
splits, and model fits that were altogether acceptable, yet with notable deficiencies
in certain areas of the input space.
Each test case presented in this chapter was devoted to showcasing how SPLITR
can be used to model data with specified dependencies, such as linear, quadratic, cu-
bic, and interaction effects. Nonlinear polynomial models for physical systems tend
to include a combination of one or more of those sets of modeling terms. Collec-
tively, these examples show how SPLITR could be used to capture the nonlinearities
associated with the higher order and multivariate terms by partitioning the input
171
space into smaller components that can be approximated as linear. This simple
representation can be applied even to complex functions, although the number of
cells required may be large.
With these example test cases offering clarity and insight into the SPLITR
method, the next chapter will explore applying SPLITR to more complex physical
systems for aerodynamic modeling. Applications include using simulated test data
of the F-16, and experimental data from the NASA E1 and T-2 aircraft.
172
Chapter 5
SPLITR Applied to Aerodynamic
Modeling with Simulated and
Experimental Flight Data
After the SPLITR modeling approach was introduced in Chapter 3, Chapter
4 demonstrated how the method operates, as well as the sensitivity of the results to
various input parameters, through a series of test cases using generic simulated test
data. It also showed how higher order nonlinear dependencies and interaction terms
could be captured with several local linear models. While the SPLITR method is
practical for any data set, it was motivated by, and developed within, the context
of aerodynamic modeling for the NASA L2F concept. This chapter will therefore
expand the test case complexity from that which was presented in Chapter 4 to
applied examples of aerodynamic modeling using both simulated and experimental
flight data.
First, the modeling process generally discussed in Chapters 2 and 3 will be
specifically applied to aerodynamic modeling for the aircraft applications discussed
in this chapter. Then an F-16 simulation will be introduced, and SPLITR modeling
results will be shown using simulated flight data. The NASA E1 and T-2 Generic
Transport Model (GTM) test aircraft will then be described, and the SPLITR re-
sults will be presented for the experimental flight test data obtained from each of
these test vehicles. These conventional aircraft will be useful to allow for interpreta-
tion and validation of the results together with experience-based expectations. Each
example highlights various practical aspects and benefits of the SPLITR modeling
capabilities, while collectively the test cases described in this chapter showcase the
utility and versatility of the SPLITR modeling method for experimental applica-
tions. In particular, this chapter demonstrates how SPLITR can be practically,
easily, and successfully used to develop aerodynamic models using real experimental
data sets that have low SNRs.
5.1 Aerodynamic modeling problem setup
The least-squares parameter estimation process used in this work is described
in Sections 2.3.4 and 3.2.4. Up until this point, the parameter estimation has been
presented in a generalized way that is agnostic to a particular function or system to
be modeled. This section shows how the response variables are calculated for the
aircraft modeling examples in this chapter.
For conventional fixed-wing aerodynamic modeling, the nondimensional aero-
dynamic force and moment coefficients are calculated using the following equations,
174
which are defined in the aircraft body axes and retain the full nonlinear aircraft
motion, while ignoring propulsion gyroscopic terms [21].
max ? T
CX = (5.1)
q?S
may
CY = (5.2)
q?S
maz
CZ = (5.3)
q?S
[ ]
Ix ? Ixz (Iz ? Iy)Cl = p? (pq + r?) + qr (5.4)
q?Sb Ix Ix
[ ]
Iy (Ix ? Iz) Ixz
Cm = q? + pr + (p
2 ? r2) (5.5)
q?Sc? Iy Iy
[ ]
Iz ? Ixz ? (Iy ? Ix)Cn = r? (p? qr) + pq (5.6)
q?Sb Iz Iz
The force equations for CX and CZ can alternatively be expressed in the wind axes
as
CL = ?CZ cos? + CX sin? (5.7)
CD = ?CX cos?? CZ sin? (5.8)
175
These dependent variables can be calculated in flight using body-axis measure-
ments of angular rates and translational accelerations, along with dynamic pressure
measurements and geometry and mass properties. For the E1 flight test data in
this chapter, a smoothed local differentiation method was used to compute the
angular accelerations in flight using a fixed-lag smoother with a two time sample
delay [14, 21]. When the thrust in Eq. (5.1) is not measured, CX is commonly
estimated to include the lumped body x-force as well as the thrust force, as was
done in this work [14]. Note that Eqs. (5.1 ? 5.6) are the typical representation
for rigid-body fixed-wing aerodynamics, but some of the modeling considerations in
Section 1.2.1 could require modified representations. For example, in rotorcraft sys-
tem identification the aerodynamic models are more often expressed using body-axis
dimensional forces and moments, so Eqs. (5.1 ? 5.6) are modified as in Eq. (1.3).
Or if there are significant and coupled aeroelastic properties, then additional sensor
data would be required to estimate both the aerodynamic and structural modes.
The response variables are expressed in terms of explanatory variables mea-
sured in flight, such as air flow angles (?, ?), body-axis angular rates (p, q, r), control
surface deflections (?), and an explanatory variable for thrust, such as advance ratio
(J). Section 3.2.1 describes in more detail the model variables, both the EVs and
the PVs, for aerodynamic modeling applications.
With the emphasis on model interpretability in this work, the estimated pa-
rameters can be considered as stability and control derivatives that describe the
local aerodynamics in each cell, and if each cell has the same model structure, these
influence coefficients can be compared across the global flight envelope.
176
5.2 F-16 simulation data
This section demonstrates the SPLITR modeling approach through a simpli-
fied aircraft example using data from an F-16 simulation found in SIDPAC [21],
and draws together several sample plots used for clarification throughout Chapter
3. The nominal geometry and mass properties of the F-16 aircraft used in the non-
linear simulation are summarized in Table 5.1. The static and dynamic data used to
build the aerodynamic database to support the simulation were obtained from wind
tunnel tests. A detailed description of the simulation can be found in Ref. [21].
Symbol Value Unit
c? 11.32 ft
b 30 ft
S 300 ft2
m 647.2 slug
Ix 9,496 slug-ft
2
Iy 55,814 slug-ft
2
Iz 63,100 slug-ft
2
Ixz 982 slug-ft
2
Table 5.1: F-16 geometry and mass properties.
The simulated flight data consisted of a maneuver that began at a trimmed
angle of attack of 4 deg at 25,000 ft. The nominal elevator input was slowly varied
along with simultaneous manual perturbations intended to excite the longitudinal
dynamics across a wide range of ?, as shown in Fig. 3.2. Effectively, two passes of the
? space are performed across the duration of the maneuver. Gaussian white noise
was added to the EVs with standard deviation set at two-percent of the magnitude
in the data. As a result, the noise content in the response variable CL increases with
177
? in a way that is similar to what is seen in flight, since there tends to be higher
noise or unmodeled dynamics at higher ?.
Using this simulation data, a single linear model was estimated for CL = f(?),
and the resulting residuals in Fig. 3.3(b) clearly show that deterministic modeling
information is not captured, which is the nonlinear behavior of CL at high ?. When
the SPLITR approach was used to estimate a model with ? = [?], ? was divided into
three cells, as shown in Fig. 3.2, and the corresponding residuals in Fig. 3.3(a) are
improved. The associated piecewise linear models shown in Fig. 5.1(a) represent the
modeling data in each cell well, and the weighted global nonlinear model is shown
in Fig. 5.1(b), with R2 = 0.98.
This simplified 1D example demonstrates how nonlinear flight data can be
successfully modeled with several piecewise linear functions using the SPLITR ap-
proach, that the resulting model captures the local and global aerodynamics well,
and that the partitions placed along the familiar lift curve are consistent with a
physical interpretation of the various ? regimes. However, aerodynamic forces and
moments are generally functions of multiple explanatory variables, and they are
more difficult to estimate from flight data that contain unknown errors, including
instrument noise, additional vehicle dynamics, and other disturbances. This F-16
example provides a useful visualization of the model fits, whereas the next section
will apply the SPLITR approach to experimental flight data obtained from the E1
aircraft.
178
(a) Local model fits (b) Global smoothed model fit
Figure 5.1: SPLITR model fits of CL = f(?) for F-16 data.
5.3 Practical aspects of modeling flight data
The SPLITR algorithm is intended to be an automated modeling method
that can be applied to various aircraft, so this section will summarize the algorithm
parameters and other specifications that still must be user-prescribed.
It is important to note that aside from ?? , all of the baseline user input speci-
fications and guidance described in Section 3.5, and used for the baseline simulation
cases in Chapter 4, were used for modeling the E1 and T-2 test data. Although
SPLITR contains several user inputs to allow increased flexibility where more in-
sight is available into the system, in general the modeling process was found to be
insensitive to small variations across the input specifications and throughout differ-
ent sets of flight data. The most impactful parameter was found to be the residual
threshold factor ?? , which is related to the cutoff between acceptable and unac-
179
ceptable residuals, and influences the split determinations. A value too large will
prevent any splits from occurring, and a value too small will result in an overly
complex model. In the simulated test data examples in Chapter 4, ?? = 2 was suf-
ficient for many data sets with small amplitude Gaussian noise. For flight test data
that contain unknown noise and other unmodeled information content, the thresh-
old needs to be raised to ensure that excessive splitting does not occur. For these
experimental data examples, ?? was set to 4, which effectively places the residual
threshold at 4?. Since the E1 and T-2 are conventional aircraft, the expectations are
that the aerodynamics are linear throughout much of the flight envelope. Therefore,
the models across neighboring cells are not expected to be significantly different, and
so the cell initialization proportion was set to 1, similar as for the quadratic test case
in Chapter 4, which will still allow the flexibility to account for nonlinear regions
such as stall. This is in contrast to the piecewise linear case where the child cell
models were expected to be drastically different from the parent cell models. In
these results, the maximum number of cells was not limited in order to demonstrate
the effective model complexity management.
For scaled aircraft, the dynamic modes are scaled according to the geometry
properties as
faircraft
fmodel = ? (5.9)
s
where s is the scale factor of the model [21]. General aviation aircraft typically
exhibit rigid-body dynamics of less than 1 Hz, so for the scaled aircraft used in this
180
work, the dynamics were estimated to be less than 2 Hz. Therefore, a fourth-order
high-pass Butterworth filter was used with a cutoff frequency of 3 Hz to estimate
the noise, similar to what was used in Chapter 4 [14].
Other information that must be defined in advance for the modeling approach
discussed in Section 5.1 includes the aircraft mass, geometry, and inertia proper-
ties to nondimensionalize the forces and moments. For each model, the regressors,
available PVs, and expected ranges of PVs must also be specified.
In this work, complex nonlinear behavior in the response variables is addressed
by partitioning the flight data to capture the simpler aerodynamics locally. However,
this approach can be viewed more generally such that any deterministic behavior
in the residuals as a function of the PVs outside of the residual threshold will be
treated the same by triggering a split. The regressors are specified in advance,
separate from the cell structure determination. So if an inadequate regressor pool is
chosen, either one that contains irrelevant or strongly correlated regressors, or one
that is missing important terms, then the algorithm may try to compensate for the
insufficient modeling information by splitting unnecessarily. These successive splits
may therefore not actually improve the model fit.
As discussed in Section 3.2.2, only linear regressors were chosen based on the
flight data, which results in an LMN consisting of local linear models. Higher-order
and multivariate terms can be included as well in each cell, which would yield a
network of local nonlinear models. This would likely reduce the model complexity
by resulting in fewer cells, but could also pose identifiability problems for each cell,
unless local model structure determination techniques were applied.
181
In these results, the modeling process was not performed onboard the aircraft,
but rather using data obtained from the test flights that were post processed in a
real-time simulation on a laptop computer. Since the flight computers recorded data
at 50 Hz, the data processing sequence operated at 50 Hz as well. The cell splitting
procedure ran at 5 Hz, which was beyond the expected range of rigid body dynamics
for the test aircraft. It could also have been set to run much slower to reduce the
computational load, since the splits do not occur frequently. The flight data used
in these results were obtained from several flights during which the pilot performed
low-? flight operations as well as high-? stall maneuvers. The pilot inputs were
overlaid with excitation inputs added to the control surface commands. The goal
was to obtain global data with high information content across a wide range of flight
variables.
5.4 E1 flight test data
First, the E1 test aircraft will be described, and then the SPLITR results will
be shown through in-depth case studies that describe the model development and
results for Cm and CX .
5.4.1 E1 test aircraft
The test vehicle used in this work was a 40% scale Extra 330SC radio-controlled
battery-powered aircraft, designated as E1 and shown in Fig. 5.2. The cruise speed
for E1 was approximately 50 kts, and the nominal altitude throughout the flights
182
used in these results was 600 ft above ground level. The control surfaces are ailerons,
flaps, elevator, and rudder. The geometry and mass properties for the flight vehicle
are summarized in Table 5.2, and were determined through ground-based tests. The
aircraft was instrumented with an inertial navigation system (INS) that provided
3-axis translational accelerations, angular rates, Euler angles, GPS position, and
velocity. An air flow angle vane mounted on a boom on the right wingtip measured
the angle of attack and sideslip angle, a pitot-static tube provided static and dynamic
pressure measurements, encoders measured each of the control surface deflections,
and a Hall-effect sensor measured the propeller speed. These measurements were
used to calculate the aerodynamic forces and moments given in Eqs. (5.1 ?5.6), as
well as for the explanatory variables used for modeling. In particular, the calculation
for advance ratio is given as
V
J = (5.10)
?d
where V is the airspeed, ? is the propeller angular rate, and d is the propeller
diameter. Additional information about the E1 test vehicle can be found in Refs. [6,
13].
183
Figure 5.2: E1 test aircraft (credit: NASA Langley Research Center).
Symbol Value Unit
c? 1.97 ft
b 10.17 ft
S 19.26 ft2
m 1.910 slug
Ix 2.964 slug-ft
2
I 8.776 slug-ft2y
Iz 11.716 slug-ft
2
Ixz 0.750 slug-ft
2
Table 5.2: E1 geometry and mass properties.
5.4.2 Case study of Cm model
For the case of modeling Cm, the model structure was defined as
qc?
Cm = Cm0 + Cm?? + Cmq + Cm ??e e + Cm J (5.11)2V J0
The PV was specified as ? with a minimum cell width of 1 deg, since it is expected
that the nonlinearities in Cm are based on ?.
Figure 5.3 shows the time history of the response variable Cm and the corre-
sponding PV ? over a period of flight test data used for modeling. The modeling
184
data of Cm were estimated to have an SNR of 5.2. Note that the nonlinear depen-
dency of Cm on ? is evident as the splits occur during the high-? maneuvers as
the flight envelope is expanded and new local models are required. There were 4
large-amplitude stall maneuvers performed throughout the duration of the modeling
data, with 2 splits triggered. The remaining flight data were used to update the
resulting cells.
In flight test data, particularly when the PTIs are active across all axes simulta-
neously, there are variations in all of the flight variables at once. Visually discerning
where the nonlinear breakpoints may be placed as a function of the known PVs can
therefore be unclear, as shown by the plot of Cm vs. ? in Fig. 5.4. This is in contrast
to the simplified example using the F-16 data, and implies that the algorithm logic
must be relied upon to determine the appropriate breakpoints.
Figure 5.3: Time histories of response variable and PV modeling data for E1 Cm model.
185
Figure 5.4: Cm vs. ? modeling data for E1 Cm model.
The cell structure evolution for Cm parametrized across the range of ? is shown
in the hierarchical depiction in Fig. 5.5. The first split partitions the high-? from
the low-? aerodynamics, while the second split partitions what appears to be the
stall region from the post-stall region. The final cell structure shown with three cells
is consistent with a physical interpretation of an individual linear model across a
confined range of low-, mid-, and high-? data.
186
Figure 5.5: Cell structure evolution for E1 Cm model.
A closer look through a depiction of the cell structure decision-making process
in Fig. 5.6 clarifies how the cells evolve over time based on the available data. The
minimum bin widths of 1 deg used to discretize the possible bin boundaries are
shown in light gray.
Beginning with one linear model across the range of ?, notice that at low ?,
although there are numerous residuals that are characterized as unacceptable, their
relative magnitude is still small compared to the acceptable residuals, and so the
bins pass the status check. At mid ?, however, the unacceptable residuals begin to
outweigh the acceptable ones, and bins 8-9 are considered failed, so the first split is
performed at the left boundary of bin 8. Note that there are not yet enough data
points in bin 10 for it to fail. Correspondingly, the second split partitions the mid-?
data from the highest-? region before bin 4 of cell 3. Although bins 7?10 in cell 3
187
also contain numerous unacceptable residuals, the total number of residuals in each
bin is below the specified minimum amount, and so these bins are not considered
failed yet. After the splits are performed and additional data are gathered across
all cells, the model fit in cell 2 appears reasonable, and while there are still large
magnitude residuals in cells 4 and 5, no further splits were made. Note also, for
example, that the magnitude of the acceptable residuals in cell 5 is much larger
than the magnitude for those allowed in cell 2 because the residual threshold is
dependent on the noise levels. It is possible that further high-? maneuvers would
provide additional information to perform another split in that region, but the model
fit at high-? is not necessarily anticipated to be of high quality due to the noise and
other unmodeled content.
188
Figure 5.6: Binned local residual characterization for E1 Cm model.
To illustrate the dependency of the split decision making on the residual
threshold more clearly, Fig. 5.7(b) shows the residual threshold over time for all
189
cells. Notice that the noise estimates increase across ? as expected, with the high-
pass filtered modeling data reflecting this in Fig. 5.7(a). Since the filter window was
only specified to contain 2 seconds of data, the residual threshold fluctuates sub-
stantially, which is particularly noticeable in cell 2 which contains a large amount
of low-? data. However, this small window provides the localized resolution to be
able to capture the rapidly changing noise content during the stall maneuver that
occurred around 75 seconds, and after which no split was performed. The small
window size enabled the residual threshold to quickly adapt to the current condi-
tions. The residual threshold, based on the current assessment of noise content in
the data, is the gateway toward both allowing further splits, but also preventing
them unnecessarily to avoid splitting due to poor model fit brought about by noise.
One of the pervasive challenges throughout this work has been shown to be
balancing the tradeoff between remembering past information and adapting quickly
to new data. In particular, the noise content in experimental data can be expected
to vary as a function of one or more variables, so it is advantageous for the SPLITR
method to adjust the model fit expectations based on rapid adaptability to the
current conditions. This will ensure that excessive splitting or overfitting does not
occur where the residuals exhibit poor model fit due to increased noise content, and
not due to a poor characterization of deterministic modeling information.
190
(a) High-pass filtered modeling data (b) Residual threshold
Figure 5.7: Residual threshold characteristics for E1 Cm model.
Figure 5.8 shows the final parameter estimates for each cell, along with their
associated 95% (2?) uncertainty bounds, plotted as a function of ? at the center
of the associated cells. The model fit error variance in Eq. (3.6) was computed in
batch using only the measurements that were actually contained within the range
of each cell, as it was done throughout this work. This plot offers further valuable
physical insight into the aerodynamics. The static longitudinal stability Cm? is
appropriately negative for a statically stable aircraft, and the magnitude gets larger
across the cells, indicating a stronger restoring moment as ? is increased. The
pitch damping derivative Cmq describes the pitching moment that is caused by
pitch rate. It is related to the added lift generated on the horizontal tail due to
aircraft rotation about the center of mass, and is negative to indicate dynamic pitch
stability. It is largest in magnitude at low ? and decreases across the range of ? as the
horizontal tail effectiveness is reduced. Accordingly, the elevator effectiveness Cm is?e
maximized at low ? and reduced at higher ?. Advance ratio did not have an apparent
impact on pitching moment at low ?, but the dependency increased at higher ?,
191
corresponding to when power was added during the stall maneuvers. Note that the
uncertainty tends to be larger for the estimates in the higher-? cells where less data
are obtained, where noise levels are larger, and where the aerodynamics are more
difficult to model, with unsteady and coupled lateral-directional behavior possible.
The physical insight offered by the stability and control derivatives estimated across
the range of ? was consistent with engineering expectations in this case. For future
test cases of vehicles with unknown aerodynamics, these parameter estimates can
offer valuable insight into localized aerodynamics across different parts of the flight
envelope. The local aerodynamic models can also be useful for local linear control
law design in particular flight regimes.
Figure 5.8: Local model parameter estimates across all cells for E1 Cm model.
Figure 5.9 shows an example of the time history of the local parameter estimate
Cm? through all of the splits. Note the definitive discontinuities in the estimates
following each split, as this is when the unacceptable residuals are incorporated into
the child cell models instantaneously. The parameter estimate for cell 2 appears to
192
converge due to much data, whereas those for the other final cells are only updated
intermittently during the remaining high-? maneuvers following the last split. Note
also the perturbations in the estimates for cells 4 and 5 at around 184 seconds and
286 seconds, as that is when the latter two stalls occur in the modeling data.
Figure 5.9: Time history of Cm? estimates across all cells for E1 Cm model.
As discussed in Section 3.2.4, each cell?s model is locally approximated irre-
spective of the adjacent cells or of the global weighting, which retains the physical-
ity and interpretability of the resulting parameter estimates. The global nonlinear
model is then calculated using the validity functions shown in Fig. 5.10 that apply
weights to the local models in each cell and describe the influence of each model
across the range of the PV. For example, Fig. 5.11 shows the weighted global pa-
rameter estimate for Cm? across all cells, along with the local estimates relevant
most strongly at the center of the respective cells. The waves in the global estimate
across the cell boundaries are a result of the normalization of Gaussian validity
193
functions that is required to enforce a partition of unity, and which can potentially
cause significant and unintentional deformation of the weighting functions for many
cells of varying widths [73]. This effect could be mitigated with a wider Gaussian
standard deviation that is specified with a larger ?s.
(a) Validity functions (b) Normalized validity functions
Figure 5.10: Validity functions for E1 Cm model.
Figure 5.11: Local and weighted global parameter estimates of Cm? for E1 Cm model.
194
Conventional aircraft exhibit linear aerodynamics about an equilibrium point,
and a linear model could be sufficient at each point to describe most of the operating
envelope. For the purposes of global modeling, a nonlinear model is shown to offer
improvements over a single linear model defined across the entire data set. The error
from the SPLITR model with 3 cells was compared to that of a batch linear model
developed using all of the same data. The linear comparison model can be thought
of as having the initial cell structure shown in Fig. 5.5, with only 1 cell across the
entire range of ?. The SPLITR model recognized the inherent nonlinearities in Cm,
and this model with 3 local cells provides a better fit with R2 = 0.69, compared to
the linear model, which has R2 = 0.58 for the modeling data.
Figure 5.12 further shows the RMS of the global residuals across 15 bins for
each model. The SPLITR model shows a reduction in error at low ? and high ?
compared to the linear model, but little to no improvement around where stall is
expected to occur. Stall behavior is a complex and nonlinear phenomenon with
coupled and unsteady effects that are not accounted for in these simplified local,
linear models. Note in Fig. 5.6 that there is scarce information obtained in the
region of the right-most bin shown here in cell 5, and the error is also largest there.
Sources of modeling error can also include measurement noise, model structure
errors resulting from local nonlinearity or an inadequate regressor pool, in-flight
disturbances, neglected dynamics such as unsteady, structural, or aeroelastic effects,
and violations of the quasi-steady assumption inherent in the model structure. It
is important for model predictive capabilities that the noise content is not overfit
by the model, and SPLITR has been shown to successfully partition the model and
195
provide a good model fit. Validation data, presented next, will confirm that the
model provides adequate prediction as well.
The basis of the cell structure determination process is to reduce the RMS
magnitude of the residuals as a function of the PVs that are associated with the
nonlinearities in the data. For this conventional E1 vehicle, the aerodynamics are
fairly linear throughout much of the flight envelope, which is why the linear model
also performed well, but the logic behind this method is also applicable for less
conventional vehicles. Still, improvements were seen in this case, and the SPLITR
model succeeded in reducing the RMS magnitude of the global residuals as a function
of ? compared to the linear model, as shown in Fig. 5.13.
Figure 5.12: RMS of binned global residuals for modeling data for E1 Cm model.
196
(a) SPLITR nonlinear model (b) Linear model
Figure 5.13: Global residuals vs. PV for modeling data for E1 Cm model.
A separate set of flight data that was not part of the modeling process was used
for model validation to assess the model?s predictive capabilities. The prediction
error for the SPLITR model was also compared to that of a batch linear model.
The time histories of Cm and ? for this data set are shown in Fig. 5.14. The
validation data included additional low-? maneuvers and 3 stalls to ensure a wide
range of ? was represented so that the full extent of the model could be tested.
The linear model achieved an R2 of 0.63 for the validation data, while the SPLITR
model produced an R2 of 0.74. Figure 5.15 shows the RMS of the binned residuals
across the cells, and also indicates improvements compared to the linear model
across most of the ? range. Figure 5.16 shows how the residuals were effectively
reduced in magnitude compared to the linear model, particularly at high ?.
197
Figure 5.14: Time histories of response variable and PV validation data for E1 Cm model.
Figure 5.15: RMS of binned global residuals for validation data for E1 Cm model.
198
(a) SPLITR nonlinear model (b) Linear model
Figure 5.16: Global residuals vs. PV for validation data for E1 Cm model.
The localized regions and associated models identified through SPLITR can
provide further insight into where additional modeling data need to be gathered, and
can potentially be used to inform an autonomous envelope expansion and model de-
velopment algorithm that drives the experiment design. Each local model can be
evaluated based on its fit error, the noise levels of the data it contains, its parameter
estimate uncertainties, and the number of measurements, among other metrics. The
regions with poor local model fits due to lack of data can then be more easily identi-
fied, and these regions can be returned to in order to obtain additional information.
It is important to note that a crossplot of the data coverage of certain variables
alone, or another form of data density depiction, does not provide this information
since the relative number of data points in a certain area of the flight envelope is
not a definitive indicator of the model quality in that region. A model fit can be
sufficient with few data points if the dynamics are linear and well behaved, and
199
conversely, a poor fit can result even in a highly dense region if the model structure
is not appropriate and/or the experiment is not performed well. Additionally, larger
noise content in particular regions can lead to a poorer fit, regardless of the amount
of data.
Figure 5.17 illustrates this with a histogram of the ? modeling data that is
colored with the binned RMS of the modeling error of the SPLITR model, which
is shown quantitatively in Fig. 5.12. As expected here, the error is in fact inversely
correlated with N at high ?, but note that the model performs well in the lowest-?
region where relatively few measurements also lie.
Figure 5.17: Histogram of PV modeling data with RMS of error for E1 Cm model.
5.4.3 Case study of CX model
This section discusses the SPLITR CX model developed using the same E1
flight data as for the Cm case study in the previous section. The model structure
200
for CX is defined as
qc?
CX = CX0 + CX?? + CXq + CX ?e + CX J (5.12)2V ?e J0
The time history of the CX response variable data is shown in Fig. 5.18 with
an estimated SNR of 5.4. CX is generally difficult to represent without a separate
propulsion model, and since thrust was not measured directly for E1, the CX calcula-
tion in Eq. (5.1) includes both the thrust and aerodynamic forces. As a result, there
is a strong nonlinear dependency of CX on J , as shown in Fig. 5.19, and so J was
chosen as the PV for this case, with a minimum cell width of 0.05. In this case, this
insight allowed for the explicit specification of J as the sole PV. In other cases where
insight is not apparent, a wider pool can be specified, and the primary indicators
of nonlinearity can be automatically chosen. However, for computational efficiency
and to avoid the possibility of unexpected splits, the PVs should be specified as
carefully as possible.
All of the same SPLITR user specifications as for the Cm example were used for
this case as well, and so no further SPLITR user specification tuning was required.
Note that power was increased during many of the stall maneuvers, which is apparent
by comparing the plot of J in Fig. 5.18 to that of ? in Fig. 5.3.
201
Figure 5.18: Time histories of response variable and PV modeling data for E1 CX model.
Figure 5.19: CX vs. J modeling data for E1 CX model.
The SPLITR model for CX resulted in 5 cells across the range of J , as shown
in Fig. 5.20. Note that the clustering of partitions lies in the regions of large slope
change across the modeling data in Fig. 5.19. It is possible that fewer cells may
have been required if the cell initialization proportion was set lower since there is
202
stronger nonlinearity apparent in this case compared to Cm.
Figure 5.20: Cell structure evolution for E1 CX model.
The final cell parameters for the SPLITR model are presented in Fig. 5.21.
Consistent with the dependency of CX on J shown in Fig. 5.19, CX0 and CX areJ
maximized in magnitude in the lowest-J cell, and shift toward 0 at higher J . The
CXq parameter increases across higher J , while CX? and CX do not appear to show?e
a clear trend associated with J . Recall that much of the higher-J data occur during
the stall maneuvers, so the parameter estimates in the higher-J cells may also be
influenced by the high-? stall behavior there.
Figure 5.22 shows how the validity functions weight the global CX parameterJ
across all of the cells, with the constant and near zero slope value in cell 7 reflecting
the behavior in the modeling data in Fig. 5.19. The corresponding validity functions
across all cells are presented in Fig. 5.23.
203
Figure 5.21: Local model parameter estimates across all cells for E1 CX model.
Figure 5.22: Local and weighted global parameter estimates of CX for E1 C model.J X
204
(a) Validity functions (b) Normalized validity functions
Figure 5.23: Validity functions for E1 CX model.
Similar to the case for Cm, the SPLITR model for CX was compared to a
single linear model developed using the entire data set. In this case, the strongly
nonlinear dependency of CX on J is much better captured using the SPLITR model
which automatically accounts for the nonlinear effects. The RMS of the error using
the modeling data is displayed in Fig. 5.24, which shows significant improvement
for the SPLITR model with 5 cells compared to the linear model. Additionally,
the error is shown to be relatively constant across the range of J for the SPLITR
model, indicating a consistent model fit in each cell. The SPLITR model has an R2
of 0.92, while the linear model had an R2 of 0.66. Figure 5.25 also shows how the
residuals are effectively more constant in RMS magnitude for the SPLITR model
results compared to the linear model results.
205
Figure 5.24: RMS of binned global residuals for modeling data for E1 CX model.
(a) SPLITR nonlinear model (b) Linear model
Figure 5.25: Global residuals vs. PV for modeling data for E1 CX model.
Validation data were obtained from the same data set as in the Cm example,
and the validation data time histories of CX and J are given in Fig. 5.26. The
SPLITR validation results were also compared to that of the linear model, with
206
an R2 of 0.89 and 0.60, respectively. The RMS of the residuals in Fig. 5.27 shows
improvement in error for the SPLITR model compared to the linear model, as well
as displays similar RMS(?) characteristics compared to the modeling data. Figure
5.28 also shows the more uniform magnitude of the residuals of the SPLITR model
compared to the linear model.
This CX example demonstrates both how significant nonlinearities can arise
in practical flight test data, and how they can still be well-captured with local linear
models using the SPLITR technique to automatically partition the aerodynamics in
real time. Additionally, maintaining the flexibility of real-time updates allows the
model to adapt quickly and automatically throughout the course of the envelope
expansion.
Figure 5.26: Time histories of response variable and PV validation data for E1 CX model.
207
Figure 5.27: RMS of binned global residuals for validation data for E1 CX model.
(a) SPLITR nonlinear model (b) Linear model
Figure 5.28: Global residuals vs. PV for validation data for E1 CX model.
5.5 T-2 Generic Transport Model flight test data
This section will show a final test case to develop a SPLITR model for CL
using flight data from the NASA T-2 Generic Transport Model (GTM) test aircraft.
208
This example uses all of the same SPLITR user specifications as the E1 cases,
and is intended to show the consistency and versatility of SPLITR across multiple
experimental platforms without the need to tune algorithm parameters.
5.5.1 T-2 test aircraft
The T-2 GTM is a 5.5% dynamically scaled twin-turbine powered test air-
craft that represents a generic transport aircraft, and which was used as a research
testbed at NASA Langley to explore modeling and control capabilities for aviation
safety. It was flown by a remote pilot with real-time telemetry using the NASA
Airborne Subscale Transport Aircraft Research (AirSTAR) capability. The T-2 is
shown in Fig. 5.29, with the mass properties summarized in Table 5.3. Further
information and details on the T-2 vehicle development and functionality can be
found in Refs. [94,95].
Figure 5.29: T-2 GTM test aircraft (credit: NASA Langley Research Center).
209
Symbol Value Unit
c? 0.915 ft
b 6.849 ft
S 5.902 ft2
m 1.585 slug
Ix 1.179 slug-ft
2
Iy 4.520 slug-ft
2
Iz 5.527 slug-ft
2
Ixz 0.211 slug-ft
2
Table 5.3: T-2 geometry and mass properties.
5.5.2 Case study of CL model
For the case of modeling CL, the model structure was specified as in Eq.(5.13),
with a PV of ? and a 1-deg minimum cell width.
qc?
CL = CL0 + CL?? + CLq + CL ?e (5.13)2V ?e0
The T-2 flight data used for modeling in this test case were obtained at a
nominal altitude of 1300 ft, at a nominal speed of 140 ft/s, and with the PTIs active
on the control surfaces to provide dynamic excitation.
Figure 5.30 shows the time history data of CL across the duration of the flight
data used for modeling, with an SNR of 8. The data contain several periods of
low-? maneuvers, as well as 3 very high amplitude stalls across 5.5 minutes of flight
time. Two splits were performed during the first and last stalls. Ideally, additional
flight data containing stall maneuvers would be obtained following the final split
to update the cells, but this example shows reasonable results even without a large
amount of further data required.
210
Figure 5.30: Time histories of response variable and PV modeling data for T-2 CL model.
Figure 5.31 shows the response variable of CL as a function of the PV ?, which
indicates linear aerodynamics in the low-? range, a stall region, and a post-stall
region. The unsteady effects in the data cannot be captured with non-contiguous
data that are sorted into each cell, but this behavior can be approximated with a
varying CLq term in each cell. The cell structure evolution for the CL SPLITR model
is shown in Fig. 5.32. The first split partitions what appears to be the post-stall
region from the lower-? data, and the second split offers a finer resolution around
the onset of stall, ending with low-, mid-, and high-? cells that are consistent with
the physical interpretation of the crossplot, and with the partitioning pattern for
the E1 Cm case and F-16 CL case.
211
Figure 5.31: CL vs. ? modeling data for T-2 CL model.
Figure 5.32: Cell structure evolution for T-2 CL model.
The final parameter estimates across each of these cells are presented in Fig. 5.33,
and they can provide physical insight into the aerodynamic behavior in each cell. As
indicated in Fig. 5.31, there is a strong nonlinear dependency of CL on ? throughout
212
the data. The lift curve slope CL? is shown as positive, largest in magnitude in the
low-? cell 4, and then much smaller in the stall and post-stall regions. The effect of
positive pitch rate q is to create a larger effective angle of attack on the horizontal
tail, which offers additional lift. This is reflected in the positive CLq parameters,
which also show an increased effect at higher ?, but then a drop off in the post-stall
region. The elevator term CL is positive to show increased lift with a positive?e
elevator deflection. The effectiveness also appears to decrease with ? as the elevator
becomes blanketed in the wing?s flow wake. Because the last split was performed
close to the end of the duration of the flight data used for modeling in this example,
with no further stalls performed afterwards, only a limited amount of additional
data were obtained after the last split to update the final child cell parameters. If
those child cell models were initialized with no parent cell information inherited,
then the CL? parameter in cell 3 was actually larger than that in cell 5 to reflect
the increase in the lift curve slope in Fig. 5.31. Nevertheless, this example shows
that acceptable local models were obtained even without a large amount of data
following the final split. Recall also that global EV values were used throughout the
modeling work for local model parameter estimation, so the bias term is not defined
as the reference value at the center of each cell.
This physical insight into the aerodynamics can be used for explanatory pur-
poses, local linear control law design, envelope expansion, or other purposes. Since
it is being provided and updated in real time, it also offers significant time-saving
advantages and model-based vehicle autonomy facilitation. The modularity also
provides robustness to poor modeling data in certain parts of the flight envelope, as
213
well as localized insight into model performance and uncertainty.
Figure 5.33: Local model parameter estimates across all cells for T-2 CL model.
The SPLITR model was compared to a single linear model that was developed
using all of the flight data, and the RMS of the error is shown in Fig. 5.34. The
SPLITR model offers significantly reduced error across the range of ?, although
with the largest error still in the highest-? region, as it is difficult to model well. It
is possible though that further flight data in the highest-? regime or another way of
accounting for nonlinear unsteady effects could further improve the model fit there.
The SPLITR model provided an R2 of 0.972, while the linear model had an R2 of
0.945. Figure 5.35 shows significant reduction in magnitude of the residuals from
the SPLITR model compared to the linear model.
214
Figure 5.34: RMS of binned global residuals for modeling data for T-2 CL model.
(a) SPLITR nonlinear model (b) Linear model
Figure 5.35: Global residuals vs. PV for modeling data for T-2 CL model.
A separate set of flight data not used in the modeling process, and from a
separate flight, was used for model validation. Figure 5.36 shows the time history of
the validation data, which includes many high-? and rapid stall maneuvers, as well
215
as data across the full range of ? that was modeled. The RMS of the global residuals
is shown in Fig. 5.37, which shows significant reduction in error across the full range
of ? compared to the linear model. The R2 for the validation data was 0.935 for
the SPLITR model, and 0.810 for the linear model. The plots of the residuals as a
function of the PV ? in Fig. 5.38 show reduction in magnitude of the residuals in
the SPLITR model, similar to the plot in Fig. 5.35 for the modeling data.
Figure 5.36: Time histories of response variable and PV validation data for T-2 CL model.
216
Figure 5.37: RMS of binned global residuals for validation data for T-2 CL model.
(a) SPLITR nonlinear model (b) Linear model
Figure 5.38: Global residuals vs. PV for validation data for T-2 CL model.
The test cases presented in this dissertation were conducted through a real-
time simulation on a laptop computer by streaming the data one point at a time.
The computation time for each simulation was significantly shorter than the total
217
duration of data for the 50 Hz data rates assumed and/or used throughout this
work. For the T-2 case, which had the largest data set consisting of 5.5 minutes
of data, the computation time after the algorithm initialization was around 20 sec-
onds using MATLAB on a Dell Precision 7510 laptop computer. This timing was
achieved by removing much of the data logging and other extraneous computation
that was used to provide the figures presented for each example, while retaining
only the relevant information to evaluate the model. The code was not optimized
for improved timing, but future work on improving the efficiency of the code could
reduce the computational load even more. Additionally, the code can be compiled
and converted to C to improve the algorithm speed. Further testing and validation
would be required to ensure it can operate onboard the flight computers used to
achieve the flight data for the test cases in this chapter.
5.6 Summary
Following the introduction of the SPLITR modeling method in Chapter 3,
and exploring its utility and versatility using simplified test data in Chapter 4, this
chapter culminated in applying the new SPLITR approach to aerodynamic model-
ing using simulated and experimental flight data. First the modeling process was
discussed, particularly in the context of conventional fixed-wing aerodynamic model-
ing, to describe how the modeling data are obtained. Then the F-16 simulation was
used as a straightforward initial aircraft example to demonstrate how the SPLITR
method partitions the familiar lift curve, and to tie the results to previous related
218
plots shown in Chapter 3.
The E1 flight test vehicle was then introduced, and two case studies were
presented to show the SPLITR modeling results for the Cm and CX flight data. The
results from the SPLITR models were compared to those from a single linear model
to demonstrate the improvements in the model fit achieved by the localization. Even
with only linear regressors and one-dimensional splits, SPLITR produced models
with improved accuracy and valuable insight. Finally, the T-2 test aircraft was
described, and SPLITR modeling results were discussed for the CL flight data. These
conventional aircraft examples enabled the results to be validated and interpreted
in conjunction with aerodynamic expectations. They also showed the robustness of
the SPLITR method to experimental data, and the ease of application in congruence
with the simplified examples in Chapter 4.
Each of these flight test examples demonstrated that SPLITR can successfully
and effectively model nonlinear aerodynamics with real-time partitioning, provide
valuable physical insight into the aerodynamics through the partitions and the local
parameter estimates, and offer good predictive capabilities, consistent with the goals
that were specified at the beginning of this dissertation in Section 1.5. Furthermore,
these goals were realized without the need to tune the SPLITR algorithm user
specifications across different test cases throughout both Chapter 4 and Chapter
5, and with the use of real experimental flight data that contain significant and
unknown noise content.
These examples, and the visualization of the results, are used to emphasize
the need for a nonlinear global aerodynamic model, and the ability to capture the
219
nonlinear effects successfully with several strategically and automatically placed
linear models.
220
Chapter 6
Conclusions
This chapter will summarize the dissertation, discuss the capabilities, advan-
tages, and limitations of the SPLITR modeling method, specifically highlight the
novel contributions of this work, and address directions for future research.
6.1 Dissertation summary
The rapidly changing air vehicle environment prompted by novel UAV capa-
bilities introduces a demand for new, innovative, and efficient modeling and control
development tools. The NASA L2F concept offers a paradigm for real-time onboard
aerodynamic modeling tools that addresses these challenges. Model identification
was recognized as the central component of aircraft system development, and a de-
tailed background on system identification tools and processes was then presented.
A literature review of aerodynamic modeling methods, outlined in Chapter 1, sum-
marized many current methods, as well as their advantages and limitations, includ-
ing global nonlinear polynomial modeling, data partitioning, and other approaches
such as neural networks. The specific modeling goals that this work set forth to
accomplish included developing a model that offers physical insight into the system,
prioritizing both good local and global prediction capabilities, and abiding by the
constraints of real-time operation and recursive updates. The field of LMNs was
identified as offering an amenable framework for the particular specified modeling
goals, but needed to be further developed from the current practices of manual
post-processed data partitioning in aerodynamic modeling.
Chapter 2 then explored the underlying theory of LMNs, provided background
on various properties, and described many existing LMN construction algorithms.
Across the leaf model properties, the global architecture, and the particular con-
struction logic, a wide breadth and flexibility within LMNs was portrayed. The ad-
vantages and limitations associated with certain customizations and design choices
related to the LMN formulation were detailed and evaluated for the application of
real-time and global nonlinear modeling. These discussions motivated and informed
the final structure applied in the SPLITR framework.
The novel SPLITR modeling method that was developed and explored in this
dissertation was then described in Chapter 3. The particular chosen LMN properties
were discussed, from the leaf model attributes through the global weighting method.
The unique SPLITR real-time cell structure determination process was developed
and described in detail, including the recursive data processing and cell splitting
procedures. A detailed explanation of the SPLITR user inputs was then provided,
along with insights to inform the specifications.
In Chapter 4, the SPLITR method was explored through a series of simple
222
simulated test cases that were chosen to convey important SPLITR properties and
sensitivities. These results were enhanced by informative visualizations and expla-
nations that clarified how the SPLITR model is developed, and the logic that drives
the split decision making. The test data complexity was gradually increased across
piecewise linear, quadratic, and cubic test data across one and two dimensions to
demonstrate the applicability of SPLITR across these varying model complexities.
The sensitivity of the SPLITR results to several user inputs was also demonstrated
to provide insight into designating those specifications.
Finally, in Chapter 5, SPLITR was applied to simulated and experimental
flight test data for conventional aircraft test vehicles. This chapter expanded the
complexity from Chapter 4 by testing SPLITR on data that represent true physical
systems with real experimental flight data. Two additional aspects of the results
provided by those test cases are the confirmation that SPLITR can effectively pro-
cess experimental data with unknown and large noise content, and the verification
and validation of the results to be consistent with physical engineering expectations.
In addition to providing a model with improved predictive capabilities compared to
a single linear model, SPLITR also supplied a cell structure that offered valuable
physical insight into the local linear aerodynamic regions. The results from Chapter
5 confirmed that the modeling goals specified in Chapter 1 were fulfilled through
the SPLITR modeling method.
223
6.2 SPLITR summary, advantages, and limitations
This dissertation presented a novel approach to automated global nonlinear
modeling using local model networks in real time. SPLITR successively partitions
complex nonlinear behavior into local regions to develop a global model composed of
weighted local models that also accurately captures the local behavior, and provides
valuable physical insight and interpretability. This method uses a unique residual
characterization scheme based on the noise content in the data. The statistical in-
formation of the residuals is binned across the ranges of specified PVs to recursively
preserve the information in real time. The result of the partitioning is that the resid-
uals are reduced in magnitude as a function of the PVs. SPLITR provides three sets
of informative results: the cell structure, the local models, and the global nonlinear
model. The decision-making process inherent in the cell structure evolution can be
visualized and understood, and the resulting model can also be analyzed and vali-
dated, both globally and locally. The cell structure and model composition offer a
modularity that provides robustness to poor modeling data in certain parts of the
flight envelope, as well as localized insight into model performance and uncertainty.
The model complexity, as characterized by the number of cells, is also inherently
restricted due to the reliance on noise estimates to prevent overfitting.
During the model development stage a global model is often desirable, and
the examples shown throughout Chapter 5 demonstrated significant nonlinear effects
throughout the global envelope, where a single linear model is inadequate. For these
cases, strategically and automatically placed partitions can be successfully used to
224
split the model into meaningful local, linear regions. For unconventional vehicles
and other strongly nonlinear systems, the benefits offered by the SPLITR model
may be even more appreciable than for these conventional cases.
Model quality is often assessed with global model fit statistics, such as R2 and
RSS. While these metrics offer a general assessment of the model quality, they often
fail to provide localized insight into particular regions of the input space where the
model may be performing at different quality levels. For example, a model with an
R2 of 0.98 may be accepted as satisfactory, but it is possible there is a large amount
of data in one region that is fit well, and little data in another region that are fit
poorly. The localization offered by the SPLITR models readily provides localized
fit statistics across the input space as a way to gauge the model fit across particular
regions.
There were several technical challenges that were faced in this work. Most
importantly, the real-time compatibility constraint implied that all of the past data
could not be saved to be processed in batch during the model development. As a
result, all of the computation and method logic relied on instantaneous measure-
ments and recursive updates to a small number of stored variables. This introduced
a challenge that was repeatedly discussed throughout this work, namely balancing
the tradeoffs between remembering vs. forgetting information. This became relevant
with regards to initializing the child cell model with the parent cell model according
to a factor of certainty based on how similar they were expected to be, as well as re-
garding the forgetting factor and the filter window. SPLITR was initially tested on
conventional fixed-wing aircraft, but the L2F concept demanded the aerodynamic
225
modeling methods to be compatible with other air vehicles. Additionally, this work
sought to develop a method that could be agnostic to the particular testbed, and
with potential to extend beyond the scope of aerospace.
The current limitations of the SPLITR approach are summarized throughout
the remainder of this section. The residual characterization procedure identifies
deterministic properties in the residuals, and responds by splitting cells. If the EV
and PV pools are inadequate and lack important modeling information, multiple
splits can occur without model improvement. Note, however, that all modeling
methods are subject to the quality of the input variables, so this is not unique to
SPLITR. Ultimately, this modeling approach is designed to be automated, but any
a priori knowledge of the system behavior can be useful to specifying the algorithm
parameters and EV and PV pools to lead to more efficient computation. The data
density and data information requirements are also more strict for a partitioning
strategy where sufficient information is required, not only in a global sense, but
particularly in each local region, to ensure a reasonable local model fit.
Additionally, while the residual information in each bin is recursively updated
and used for split decision making, the residuals are technically a function of the
current model parameters, which are updated with every new measurement. As
a result, the residuals from each prior point in time are ?outdated? when the pa-
rameters are updated next. This is an unavoidable artifact of recursive real-time
parameter estimation in general. Since real-time applications are restricted to a
limited amount of information at a given time, and must be able to quickly adapt to
new information, these methods must balance the tradeoffs of increasing the amount
226
of data saved, and ensuring quick adaptation to new information. This aspect could
also be mitigated by updating the model parameters at a lower rate than the data
are received.
The child cell parameter initialization proved to be a significant challenge
in real time that led to limitations in the SPLITR logic. Without access to past
data that can be used to update parameter estimates, the input space needs to be
repeatedly covered to obtain new information and ensure each new local model can
be updated following a split. From an experiment design standpoint, this would
require additional test data and test time to obtain new data throughout the same
regions of the input space.
Although the model structure is transparent and offers improved physical in-
sight into the system with axes-orthogonal splits, strongly coupled nonlinearities
can be better resolved by allowing multidimensional axes-oblique splits. However,
partitioning through multiple dimensions often requires a nonlinear optimization,
which would be impractical to compute in a real time method.
The local parameter estimation is treated independently from the global model
weighting process, and as a result, the validity functions are simply overlaid on
the local models, offering a global model with parameters that are not optimized.
Furthermore, Gaussian normalization for many cells of varying widths can cause
undesirable side effects to the shapes of the normalized validity functions.
Finally, the real-time constraint enforced in this work strongly affects the mod-
eling results, and while foregoing the advantages of batch processes and optimization
procedures, it is possible that the SPLITR modeling results could be inferior to batch
227
nonlinear modeling results from the same data. However, since SPLITR is uniquely
designed for recursive updates, it would be difficult to compare the results to those
from another modeling method that has the advantage of iterative processing.
Despite the significant constraints of real-time operation, which limited the
choice of tools and denied access to most past data, SPLITR was developed and
demonstrated as a successful nonlinear modeling technique using methods that are
compatible with real-time applications.
6.3 Contributions of this work
The central contribution of this work is the development of a novel real-time
modeling approach known as SPLITR. This method extended current data parti-
tioning approaches from offline to online applications, and enabled automated data-
based partitioning and model development without requiring extensive data analysis.
SPLITR relies on a unique new cell split decision making logic that was developed
based on automated residual analysis. It also expands upon existing methods by
accounting for unknown noise content in the data to set modeling expectations and
prevent overfitting.
The benefits of the SPLITR method were described and demonstrated through-
out this work, and summarized in the previous section. Test data were simulated
and used to explore the SPLITR method, as well as the sensitivity of the results to
the user specifications. Finally, using the same user specifications as for the simple
simulated test data, the SPLITR method was successfully tested using experimental
228
flight data that contain an unknown underlying model and noise content.
This work strongly emphasized the benefits of transparency applied to both
the final model, as well as the model development. Effective visualization of the
model development, decision making, and results were therefore important tools
to convey the complete modeling process. Numerous unique figures and diagrams
were designed to provide visualizations and understanding of the entire process and
results. Some of the depictions of the model development included a time history
plot of the response variables with specified split timing to show when a split was
triggered; a cell structure evolution plot to show how the input space was successively
partitioned; a residual characterization plot to display the residual-based decision
making; and a plot of the residual threshold over time in each cell to show how
it varied, and ultimately affected the split decision making. The final model was
expressed through plots of each parameter estimate across all of the cells to show,
for example, how the stability and control derivatives varied across the PV input
space; and a single parameter estimate over time to show how the estimates change
as a function of time, among others. For the flight test examples, the error from the
SPLITR model was also compared to that of a batch linear model through both a
plot of the RMS of the error, as well as a depiction of the residuals across the PV.
This showed the reduction of the SPLITR model error compared to the linear model.
Each of these figures conveyed useful information about the modeling process.
The effectiveness of the SPLITR approach was demonstrated using conven-
tional aircraft, for which traditional understanding of aircraft aerodynamics could
be used for validation. This real-time modeling capability has the potential to
229
improve the efficiency of the aircraft modeling process and enable novel aircraft
configurations to be developed and tested more rapidly and flown more reliably.
Although SPLITR was motivated by and developed for the purpose of aerodynamic
modeling, its potential applications and impact can also extend far beyond the scope
of aircraft.
The original contributions of this research are summarized below.
? Developed Smoothed Partitioning with LocalIzed Trees in Real time (SPLITR)
? a novel approach to global nonlinear modeling using automated LMNs in
real time.
? Extended current data partitioning approaches applied to aerodynamic data
to allow automated data-based partitioning and model development without
requiring extensive analysis.
? Developed a new approach to data partitioning with variable split locations
based on residual analysis that can be used for online applications.
? Described and demonstrated the benefits of SPLITR, which are summarized
as follows: This method can successively partition complex nonlinear behav-
ior into local regions to develop a weighted global model that also accurately
captures the local behavior, and provides valuable physical insight and inter-
pretability. It provides three sets of informative results: the cell structure, the
local models, and the global nonlinear model. The decision making inherent in
the cell structure evolution process can be visualized and understood, and the
230
resulting model can also be analyzed and validated, both globally and locally.
Furthermore, the model complexity, as characterized by the number of cells, is
restricted based on noise estimates to prevent overfitting and ensure a model
that is not overly complex. It also estimates the noise content in the data to
set those modeling expectations.
? Showed results of applying SPLITR to simulated test data to demonstrate and
visualize the modeling process and to explore the sensitivities of the modeling
results to user specifications.
? Developed unique and informative visualizations of the model development
and results to convey useful information and offer transparency.
? Identified aerodynamic models using SPLITR for experimental flight test data
from the NASA E1 and T-2 test aircraft, validated the results using separate
flight data, and showed that the split locations and estimated stability and
control derivatives offered physical insight.
6.4 Future directions and applications for SPLITR
The directions for future work are divided into three parts in this section.
First, several possible extensions to the SPLITR method are discussed, including
automation for user specifications, modifications to the SPLITR logic, expansion of
LMN properties, and improvements in validity functions. Second, several additional
and alternative uses of the SPLITR model are mentioned, in addition to offering
231
insight into the physical system. Last, additional applications and testbeds are
discussed.
Although SPLITR was developed to operate automatically with minimal a
priori knowledge needed, there are still several user specifications that are required.
These parameters could also potentially be better specified with some prior insight.
Many of these user inputs, particularly those that have a strong impact on the
results and that may be difficult to predict with no prior data, may be automated
in real time based on the data. In particular, the residual threshold factor could be
modulated based on a real-time assessment of the residuals and the impact of a split.
The split initialization could also be improved by automating the cell initialization
proportion through comparing the parent data with the child data to determine how
much information to inherit. Each of these improvements, however, would need to
be studied to determine what information can be extracted and used in real time.
The SPLITR logic was shown to have the capability of placing the splits in
meaningful and effective locations to partition the input space and improve the
model fit. There are other possibilities for ways to improve the logic with additional
sources of information. For example, instead of relying on bin failure solely for the
split decision and location, the decision could also be made based on parameter
convergence in each cell. However, this could also curtail splits since a large amount
of data in each cell may be required, which could be difficult to obtain in hard-
to-reach regions of the input space. A split could also be placed ?on hold? until
new data are obtained to update the child cell models, although this could require
additional memory and complexity.
232
The simplified LMN properties used in this work could also be modified to
allow for greater flexibility. Local nonlinear models could be allowed to reduce the
number of cells required for highly nonlinear systems. An automated model struc-
ture determination process in each cell can also improve the local modeling capa-
bilities, even if the regressor pool is still restricted to include only the linear terms,
but particularly for the inclusion of nonlinear terms. Real-time or post-process
pruning methods can also be explored to remove redundant cells that are deemed
unnecessary based on comparing the prediction with that of adjacent cells, or based
on recognizing similar parameter estimates within the parameter uncertainties. If
this is performed in real time, there is an additional challenge of pruning cells with
limited available information. Alternatively, this technique may be used after the
model has already been developed, if all of the data are stored and available for
batch post-processing.
The global weighting technique used in SPLITR overlaid the validity functions
on the identified local linear models, thereby providing a global nonlinear model
that is not guaranteed to be optimal. The validity functions determine how much
of each local model contributes towards the global model at each point in the PV
input space. Only Gaussian validity functions were considered in this work, and the
standard deviations were determined based solely on the cell width. These Gaussian
functions can also be customized based on the predictive capability of each cell and
the adjacent neighbors. Alternative weighting functions can also be explored and
customized based on the real-time quality assessment of each local model. This
approach could weight the local models that are considered more reliable with higher
233
validity, based on uncertainty metrics computed from the data.
The partitioning offered by SPLITR can also be used for other applications
such as input space (or envelope) expansion through model-based experiment design.
As the input space is partitioned in real time, an assessment of the model quality in
each cell, as well as data density, could be used to either confirm that enough data are
obtained in that region, or to inform the experiment design or test guidance to return
to obtain additional data. This could have significant impact during wind tunnel
tests as well to improve the test efficiency. The linear regions of the input space
that do not require high resolution test points could be quickly traversed, while the
insight provided by the partitioning would allow the rig to be automatically directed
to the regions that require more data to improve the prediction.
The LMN structure and the real-time development are also particularly amen-
able to control applications. The SPLITR model provides local linear models, which
are applicable to gain scheduling-based control development. Parallel distributed
compensation (PDC) can be used to automatically design a local linear control law
for the local model in each cell, and then weight the control inputs according to
the cell validity functions. If the SPLITR model is combined with a PDC method
applied in real time, it could take advantage of the cell structure to offer both a
modeling and control strategy. More generally, the adaptive nature of the modeling
can provide updates to an adaptive control law, and the control gains can be modified
in real time accordingly.
The scope of this dissertation included applying SPLITR to simple simulated
test data and conventional aircraft in post-flight real-time simulations, but alterna-
234
tive applications and testbeds can also be considered.
Although SPLITR is a real-time modeling method, the unique logical decision-
making processes may also be useful for batch modeling. If the real-time constraints
are removed, then SPLITR could be adapted to operate in a more flexible, iterative
manner. However, there exist a large number of batch LMN modeling algorithms
that in their original formulation and development, were not restricted to real-time
operation and limited information. These techniques may therefore be more suited
for batch mode operation.
SPLITR was developed under the constraints of real-time operation that would
be compatible with onboard model development, but future work could validate this
assertion by compiling the software and testing onboard the aircraft.
This dissertation focused on applications to fixed-wing vehicles, but SPLITR
has a much wider applicability, and future work could apply the SPLITR method
to additional air vehicle testbeds, such as rotorcraft, lift+cruise, tiltwing, and other
novel air vehicle configurations. This real-time modeling approach could potentially
be used in the fast-paced context of Urban Air Mobility (UAM) to obtain an aero-
dynamic model efficiently, to mitigate the need for many ground-based tests, to
provide much-needed physical insight into these new and unconventional vehicles,
and to provide additional safety through adaptable autonomous onboard model-
based systems.
Finally, although this work was motivated by the NASA L2F concept for the
purpose of aerodynamic modeling, future work could expand the test cases and
applications to other physical systems with similar modeling goals and priorities.
235
Bibliography
[1] Advancing Aerial Mobility: A National Blueprint. National Academies of Sci-
ences, Engineering, and Medicine, National Academies Press, 2020.
[2] ?Mobility on demand.? https://webcms.pima.gov/UserFiles/
Servers/Server_6/File/Government/Transportation/
ATDM-ActiveTransportationDemandMgmt/200127%20USDOT%
20MobilityonDemand%20Fact%20sheet.pdf, US Department of Trans-
portation. Accessed: 2021-04-06.
[3] M. D. Moore, ?Personal Air Vehicles: A Rural/Regional and Intra-Urban On-
Demand Transportation System,? in AIAA International Air and Space Sym-
posium and Exposition, AIAA 2003-2646, 2003.
[4] A. Bacchini and E. Cestino, ?Electric VTOL Configurations Comparison,? in
Aerospace, Multidisciplinary Digital Publishing Institute, vol. 6, no. 3, 2019.
doi: 10.3390/aerospace6030026.
[5] E. H. Heim, E. M. Viken, J. M. Brandon, and M. A. Croom, ?NASA?s Learn-
to-Fly Project Overview,? in AIAA Atmospheric Flight Mechanics Conference,
AIAA Paper 2018-3307, Atlanta, GA, June 2018. doi: 10.2514/6.2018-3307.
[6] S. E. Riddick, ?An Overview of NASA?s Learn-to-Fly Technology Develop-
ment,? in AIAA Atmospheric Flight Mechanics Conference, AIAA Paper 2020-
0760, Orlando, FL, January 2020. doi: 10.2514/6.2020-0760.
[7] E. A. Morelli, ?Real-Time Global Nonlinear Aerodynamic Modeling for Learn-
to-Fly,? in AIAA Atmospheric Flight Mechanics Conference, AIAA Paper 2016-
2010, San Diego, California, January 2016. doi: 10.2514/6.2016-2010.
[8] ?Vortex Lattice Utilization,? NASA Langley Research Center, NASA SP-405,
May 1976.
[9] P. Parikh, W. Engelund, S. Armand, and R. Bittner, ?Evaluation of a CFD
Method for Aerodynamic Database Development using the Hyper-X Stack Con-
figuration,? in 22nd Applied Aerodynamics Conference and Exhibit, AIAA Pa-
per 2004-5385, Providence, Rhode Island, August 2004. doi: 10.2514/6.2004-
5385.
[10] J. Chambers, Modeling Flight: The Role of Dynamically Scaled Free Flight
Models in Support of NASA?s Aerospace Programs. NASA Langley Research
Center, NASA SP 2009-575, 2009.
[11] J. M. Brandon and E. A. Morelli, ?Nonlinear Aerodynamic Modeling From
Flight Data Using Advanced Piloted Maneuvers and Fuzzy Logic,? NASA Lan-
gley Research Center, NASA TM 2012-217778, October 2012.
[12] J. M. Brandon and E. A. Morelli, ?Real-Time Onboard Global Nonlinear
Aerodynamic Modeling from Flight Data,? Journal of Aircraft, vol. 53, no. 5,
pp. 1261?1297, September 2016. doi: 10.2514/1.C033133.
[13] S. E. Riddick, R. C. Busan, D. E. Cox, and S. A. Laughter, ?Learn-to-Fly
Test Setup and Concept of Operations,? in AIAA Atmospheric Flight Me-
chanics Conference, AIAA Paper 2018-3308, Atlanta, GA, June 2018. doi:
10.2514/6.2018-3308.
[14] E. A. Morelli, ?Practical Aspects of Real-Time Modeling for the Learn-to-Fly
Concept,? in AIAA Atmospheric Flight Mechanics Conference, AIAA Paper
2018-3309, Atlanta, GA, June 2018. doi: 10.2514/6.2018-3309.
[15] J. V. Foster, ?Autonomous Guidance Algorithms for NASA Learn-to-Fly Tech-
nology Development,? in AIAA Atmospheric Flight Mechanics Conference,
AIAA Paper 2018-3310, Atlanta, GA, June 2018. doi: 10.2514/6.2018-3310.
[16] S. M. Snyder, B. J. Bacon, and E. A. Morelli, ?Online Control Design for Learn-
to-Fly,? in AIAA Atmospheric Flight Mechanics Conference, AIAA Paper 2018-
3311, Atlanta, GA, June 2018. doi: 10.2514/6.2018-3311.
[17] E. A. Morelli, ?Determining Aircraft Moments of Inertia from Flight Test
Data,? in AIAA Atmospheric Flight Mechanics Conference, AIAA Paper 2021-
1642, January 2021. doi: 10.2514/6.2021-1642.
[18] E. A. Morelli, ?Autonomous Real-Time Global Aerodynamic Modeling in
the Frequency Domain,? in AIAA Atmospheric Flight Mechanics Conference,
AIAA Paper 2020-0761, Orlando, FL, January 2020. doi: 10.2514/6.2020-0761.
[19] R. Weinstein and J. E. Hubbard, ?Global Aerodynamic Modeling Using Au-
tomated Local Model Networks in Real Time,? in AIAA Atmospheric Flight
Mechanics Conference, AIAA Paper 2020-0762, Orlando, FL, January 2020.
doi: 10.2514/6.2020-0762.
[20] S. M. Snyder, ?Autopilot Design with Learn-to-Fly,? in AIAA Atmospheric
Flight Mechanics Conference, AIAA Paper 2020-0763, Orlando, FL, January
2020. doi: 10.2514/6.2020-07632.
[21] E. A. Morelli and V. Klein, Aircraft System Identification: Theory and Practice.
Williamsburg, VA: Sunflyte Enterprises, 2nd ed., 2016.
237
[22] O. Nelles, Nonlinear System Identification: From Classical Approaches to Neu-
ral Networks and Fuzzy Models. Springer-Verlag, Berlin, Germany, 2002. doi:
10.1007/978-3-662-04323-3.
[23] R. Murray-Smith, A Local Model Network Approach to Nonlinear Modeling.
PhD thesis, University of Strathclyde, Glasgow, U.K, 1994.
[24] E. A. Morelli and V. Klein, ?Application of system identification to aircraft at
NASA Langley Research Center,? Journal of Aircraft, vol. 42, no. 1, pp. 12?25,
2005. doi: 10.2514/1.3648.
[25] R. Jategaonkar, D. Fischenberg, and W. von Gruenhagen, ?Aerodynamic Mod-
eling and System Identification from Flight Data?Recent Applications at DLR,?
Journal of Aircraft, vol. 41, no. 4, pp. 681?691, 2004. doi: 10.2514/1.3165.
[26] E. A. Morelli, ?Multiple Input Design for Real-Time Parameter Estimation in
the Frequency Domain,? IFAC Proceedings Volumes, vol. 36, no. 16, pp. 639?
644, 2003. doi: 10.1016/S1474-6670(17)34833-4.
[27] E. A. Morelli, ?Flight Test Experiment Design for Characterizing Stability and
Control of Hypersonic Vehicles,? Journal of Guidance, Control, and Dynamics,
vol. 32, no. 3, pp. 949?959, May 2009. doi: 10.2514/1.37092.
[28] E. A. Morelli, ?Flight Test Maneuvers For Efficient Aerodynamic Modeling,?
in AIAA Atmospheric Flight Mechanics Conference, AIAA Paper 2011-6672,
Portland, Oregon, August 2011. doi: 10.2514/6.2011-6672.
[29] E. A. Morelli, ?Efficient Global Aerodynamic Modeling from Flight Data,?
in 50th AIAA Aerospace Sciences Meeting, AIAA Paper 2012-1050, Nashville,
Tennessee, January 2012. doi: 10.2514/6.2012-1050.
[30] M. B. Tischler and R. K. Remple, Aircraft and Rotorcraft System Identification.
Reston, VA: AIAA, 2nd ed., 2012.
[31] L. Ljung, ?Frequency Domain Versus Time Domain Methods in System
Identification?Revisited,? in Control of Uncertain Systems: Modelling, Approx-
imation, and Design, pp. 277?291, Springer, 2006.
[32] L. Ljung and K. Glover, ?Frequency Domain Versus Time Domain Methods
in System Identification,? Automatica, vol. 17, no. 1, pp. 71?86, 1981. doi:
10.1016/0005-1098(81)90085-6.
[33] R. E. Maine and K. W. Iliff, ?Application of Parameter Estimation to Aircraft
Stability and Control: The Output-Error Approach,? NASA, NASA RP-1168,
1986.
[34] J. A. Grauer and M. J. Boucher, ?Aircraft System Identification from Multisine
Inputs and Frequency Responses,? Journal of Guidance, Control, and Dynam-
ics, vol. 43, no. 12, pp. 2391?2398, December 2020. doi: 10.2514/1.G005131.
238
[35] S. Seher-Weiss, User?s Guide FITLAB - Parameter Estimation Using MATLAB
- Version 2.3. DLR Internal Report. DLR-IB 111-2014/07, 73 S, 2014.
[36] R. V. Jategaonkar, Flight Vehicle System Identification: A Time-Domain
Methodology. Reston, VA: AIAA Progress in Aeronautics and Astronautics,
vol. 247, 2nd ed., 2015. doi: 10.2514/4.102790.
[37] K. Hui, J. Ricciardi, K. Ellis, and D. Tuomey, ?Beechjet Flight Test Data Gath-
ering and Level-D Simulator Aerodynamic Mathematical Model Development,?
in AIAA Atmospheric Flight Mechanics Conference, AIAA Paper 2001-4012,
Montreal, Canada, August 2001. doi: 10.2514/6.2001-4012.
[38] R. V. Jategaonkar, W. Mo?nnich, D. Fishenberg, and B. Krag, ?Identification
of C-160 Simulator Data Base from Flight Data,? in Proceedings of the 10th
IFAC Symposium on System Identification, pp. 1031?1038, Elsevier Sciences,
Oxford, 1994. doi: 10.1016/S1474-6670(17)47844-X.
[39] M. Millidere, U. Cakin, and T. Yigit, ?Generating Aerodynamic Database from
Closed Loop Simulated Flight Data,? in 8th European Conference for Aeronau-
tics and Space Sciences (EUCASS), July 2019. doi: 10.13009/EUCASS2019-
742.
[40] M. B. Tischler and E. L. Tobias, ?A Model Stitching Architecture for Contin-
uous Full Flight-Envelope Simulation of Fixed-Wing Aircraft and Rotorcraft
from Discrete Point Linear Models,? tech. rep., US Army Aviation and Missile
Research, Development and Engineering Center, Redstone, Alabama, 2016.
[41] E. Tobias, M. Tischler, T. Berger, and S. G. Hagerott, ?Full Flight-Envelope
Simulation and Piloted Fidelity Assessment of a Business Jet Using a Model
Stitching Architecture,? in AIAA Modeling and Simulation Technologies Con-
ference, AIAA Paper 2015-1594, Kissimmee, Florida, January 2015. doi:
10.2514/6.2015-1594.
[42] L. T. Nguyen, M. E. Ogburn, W. P. Gilbert, K. S. Kibler, P. W. Brown, and
P. L. Deal, ?Simulator Study of Stall/Post-Stall Characteristics of a Fighter
Airplane With Relaxed Longitudinal Static Stability,? NASA Langley Research
Center, NASA TP 1538, 1979.
[43] M. S. Selig, ?Real-Time Flight Simulation of Highly Maneuverable Unmanned
Aerial Vehicles,? Journal of Aircraft, vol. 51, no. 6, pp. 1705?1725, 2014. doi:
10.2514/1.C032370.
[44] C. C. de Visser, J. A. Mulder, and Q. P. Chu, ?Global Aerodynamic Model-
ing with Multivariate Splines,? in AIAA Modeling and Simulation Technolo-
gies Conference, AIAA Paper 2008-7500, Honolulu, Hawaii, August 2008. doi:
10.2514/6.2008-7500.
239
[45] J. G. Batterson and V. Klein, ?Partitioning of Flight Data for Aerodynamic
Modeling of Aircraft at High Angles of Attack,? Journal of Aircraft, vol. 26,
no. 4, pp. 334?339, April 1989. doi: 10.2514/3.45765.
[46] J. G. Batterson, Estimation of Airplane Stability and Control Derivatives From
Large Amplitude Longitudinal Maneuvers. NASA Langley Research Center,
NASA TM 83185, 1981.
[47] V. Klein, J. G. Batterson, and P. C. Murphy, ?Determination of Airplane Model
Structure From Flight Data by Using Modified Stepwise Regression,? NASA
Langley Research Center, NASA TP 1916, 1981.
[48] E. A. Morelli, ?Global Nonlinear Aerodynamic Modeling Using Multivariate
Orthogonal Functions,? Journal of Aircraft, vol. 32, no. 2, pp. 270?277, 1995.
doi: 10.2514/3.46712.
[49] J. A. Grauer and E. A. Morelli, ?Generic Global Aerodynamic Model
for Aircraft,? Journal of Aircraft, vol. 52, no. 1, pp. 13?20, 2015. doi:
10.2514/1.C032888.
[50] V. Klein and J. G. Batterson, ?Determination of Airplane Model Structure
From Flight Data Using Splines and Stepwise Regression,? NASA TP-2126,
March, 1983.
[51] C. C. de Visser, J. A. Mulder, and Q. P. Chu, ?A Multidimensional Spline-Based
Global Nonlinear Aerodynamic Model for the Cessna Citation II,? in AIAA
Atmospheric Flight Mechanics Conference, AIAA Paper 2010-7950, Toronto,
Ontario, Canada, August 2010. doi: 10.2514/6.2010-7950.
[52] S. Seher-Weiss, ?Identification of Nonlinear Aerodynamic Derivatives using
Classical and Extended Local Model Networks,? Aerospace Science and Tech-
nology, vol. 15, pp. 33?44, 2011. doi: 10.1016/j.ast.2010.06.002.
[53] D. Rohlf, ?Global Model Approach for X-31 VECTOR System Identification,?
Journal of Aircraft, vol. 42, no. 1, pp. 54?62, 2005. doi: 10.2514/1.6147.
[54] S. Weiss, H. Friehmelt, E. Plaetschke, and D. Rohlf, ?X-31A System Identifi-
cation Using Single-Surface Excitation at High Angles of Attack,? Journal of
Aircraft, vol. 33, no. 3, pp. 485?490, 1996. doi: 10.2514/3.46970.
[55] S. Weiss and F. Thielecke, ?Aerodynamic Model Identification Using Local
Model Networks,? in AIAA Atmospheric Flight Mechanics Conference, AIAA
Paper 2000-4098, Denver, Colorado, August 2000. doi: 10.2514/6.2000-4098.
[56] S. Weiss, D. Rohlf, and E. Plaetschke, ?Parameter Identification for X-31A
at High Angles of Attack,? in Fourth High Alpha Conference, vol. 1, NASA
Dryden Research Center, 1994.
240
[57] R. Hess, ?On the Use of Back Propagation with Feed-Forward Neural Net-
works for the Aerodynamic Estimation Problem,? in Flight Simulation and
Technologies, AIAA Paper 1993-3638, Monterey, California, August 1993. doi:
10.2514/6.1993-3638.
[58] T. Rajkumar and J. E. Bardina, ?Prediction of Aerodynamic Coefficients Using
Neural Networks for Sparse Data,? in FLAIRS Conference, pp. 242?246, May
2002.
[59] D. J. Linse and R. F. Stengel, ?Identification of Aerodynamic Coefficients Using
Computational Neural Networks,? Journal of Guidance, Control, and Dynam-
ics, vol. 16, no. 6, pp. 1018?1025, November 1993. doi: 10.2514/3.21122.
[60] M. Larsson, P. De Raedt, and M. Hedlund, Aerodynamic Identification Using
Neural Networks. Linko?ping University Electronic Press, 1997.
[61] A. Ghosh, S. Raisinghani, and S. Khubchandani, ?Estimation of Aircraft
Lateral-Directional Parameters Using Neural Networks,? Journal of Aircraft,
vol. 35, no. 6, pp. 876?881, 1998. doi: 10.2514/2.2407.
[62] R. Kumar and A. Ghosh, ?Estimation of Aerodynamic Derivatives Using Neural
Network Based Method,? IFAC Proceedings Volumes, vol. 47, no. 1, pp. 897?
904, 2014. doi: 10.3182/20140313-3-IN-3024.00057.
[63] R. F. Stengel and D. J. Linse, ?System Identification for Nonlinear Control
Using Neural Networks,? in Conference on Information Sciences and Systems,
Princeton, New Jersey, March 1990.
[64] D. J. Linse and R. F. Stengel, ?A System Identification Model for Adaptive
Nonlinear Control,? in 1991 American Control Conference, pp. 1752?1757,
IEEE, Boston, Massachusetts, June 1991. doi: 10.23919/ACC.1991.4791685.
[65] Y. Chen, ?Modeling of Longitudinal Unsteady Aerodynamics at High Angle-
of-Attack Based on Support Vector Machines,? in 2012 8th International
Conference on Natural Computation, pp. 431?435, IEEE, 2012. doi:
10.1109/ICNC.2012.6234640.
[66] J. Sanwale and D. J. Singh, ?Aerodynamic Parameters Estimation Using Ra-
dial Basis Function Neural Partial Differentiation Method.,? Defence Science
Journal, vol. 68, no. 3, pp. 241?250, 2018. doi: 10.14429/dsj.68.11843.
[67] A. Graps, ?An Introduction to Wavelets,? IEEE Computational Science and
Engineering, vol. 2, no. 2, pp. 50?61, 1995.
[68] M. Sharma, G. Singh, and R. Gupta, ?Application of Wavelet ? An Advanced
Approach of Transformation,? Advanced Research in Electrical and Electronic
Engineering, vol. 1, no. 1, pp. 28?34, 2014.
241
[69] S. Mohammadi, M. Sabzeparvar, and M. KArrari, ?Aircraft Stability and Con-
trol Model Using Wavelet Transforms,? Proceedings of the Institution of Me-
chanical Engineers, Part G: Journal of Aerospace Engineering, vol. 224, no. 10,
pp. 1107?1118, 2010. doi: 10.1243/09544100JAERO717.
[70] X.-s. Gan, J.-s. Duanmu, Y.-b. Meng, and W. Cong, ?Wavelet neural network
aerodynamic modeling from flight data based on pso algorithm with information
sharing and velocity disturbance,? Journal of Central South University, vol. 20,
no. 6, pp. 1592?1601, 2013. doi: 10.1007/s11771-013-1651-3.
[71] O. Nelles, A. Fink, and R. Isermann, ?Local Linear Model Trees (LOLIMOT)
Toolbox for Nonlinear System Identification,? 12th IFAC Symposium on System
Identification, vol. 33, no. 15, pp. 845?850, June 2000. doi: 10.1016/S1474-
6670(17)39858-0.
[72] R. Murray-Smith and T. A. Johansen, ?Local Learning in Local Model Net-
works,? in 4th International Conference on Artificial Neural Networks, pp. 40-
46, 1995.
[73] R. Shorten and R. Murray-Smith, ?Side effects of normalising radial basis func-
tion networks,? International Journal of Neural Systems, vol. 7, no. 2, pp. 167?
179, 1996. doi: 10.1142/S0129065796000130.
[74] G. Gregorc?ic? and G. Lightbody, ?Nonlinear model-based control of highly
nonlinear processes,? Computers and Chemical Engineering, vol. 34, no. 8,
pp. 1268?1281, 2010. doi: 10.1016/j.compchemeng.2010.03.011.
[75] L. Breiman, ?Hinging Hyperplanes for Regression, Classification, and Function
Approximation,? IEEE Transactions on Information Theory, vol. 39, no. 3,
pp. 999?1013, 1993. doi: 10.1109/18.256506.
[76] S. Ernst, ?Hinging Hyperplane Trees for Approximation and Identification,?
in Proceedings of the 37th IEEE Conference on Decision and Control, vol. 2,
pp. 1266?1271, IEEE, 1998. doi: 10.1109/CDC.1998.758452.
[77] O. Nelles, ?Axes-Oblique Partitioning Strategies for Local Model Networks,?
in 2006 IEEE International Symposium on Intelligent Control, pp. 2378?2383,
IEEE, 2006. doi: 10.1109/CACSD-CCA-ISIC.2006.4777012.
[78] L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen, Classification and
Regression Trees. Wadsworth, 1984.
[79] ?Matlab regression tree toolbox.? https://www.mathworks.com/help/stats/
regression-trees.html, 2021. Accessed: 2021-04-08.
[80] A. Karalic, ?Linear Regression in Regression Tree Leaves,? in Proceedings of
the International School for Synthesis of Expert Knowledge (ISSEK), 1992.
242
[81] P. Chaudhuri, M. Huang, W. Loh, and R. Yao, ?Piecewise-Polynomial Regres-
sion Trees,? Statistica Sinica, pp. 143?167, 1994.
[82] W. Loh, ?Regression Trees with Unbiased Variable Selection and Interaction
Detection,? Statistica Sinica, pp. 361?386, 2002.
[83] A. Dobra and J. Gehrke, ?SECRET: A Scalable Linear Regression Tree Al-
gorithm,? in Proceedings of the 8th ACM SIGKDD International Confer-
ence on Knowledge Discovery and Data Mining, pp. 481?487, 2002. doi:
10.1145/775047.775117.
[84] J. H. Friedman, ?Multivariate Adaptive Regression Splines,? The Annals of
Statistics, vol. 19, no. 1, pp. 1?67, 1991. doi: 10.1214/aos/1176347963.
[85] O. Nelles and R. Isermann, ?Basis Function Networks for Interpolation of Local
Linear Models,? Proceedings of 35th IEEE Conference on Decision and Control,
vol. 1, pp. 470?475, 1996. doi: 10.1109/CDC.1996.574356.
[86] O. Ba?nfer, M. Franke, and O. Nelles, ?Adaptive Local Model Networks with
Higher Degree Polynomials,? in International Conference on Control, Au-
tomation and Systems, pp. 168?171, IEEE, October 2010. doi: 10.1109/IC-
CAS.2010.5669893.
[87] O. Ba?nfer and O. Nelles, ?Polynomial Model Tree (POLYMOT)?A New Train-
ing Algorithm for Local Model Networks with Higher Degree Polynomials,? in
2009 IEEE International Conference on Control and Automation, pp. 1571?
1576, IEEE, 2009. doi: 10.1109/ICCA.2009.5410547.
[88] O. Ba?nfer, B. Hartmann, and O. Nelles, ?POLYMOT versus HILOMOT ?
A Comparison of Two Different Training Algorithms for Local Model Net-
works,? IFAC Proceedings Volumes, vol. 45, no. 16, pp. 1569?1574, 2012. doi:
10.3182/20120711-3-BE-2027.00354.
[89] D. Potts and C. Sammut, ?Online Nonlinear System Identification Using Linear
Model Trees,? Proceedings of the 16th IFAC World Congress, pp. 202?207, 2005.
doi: 10.3182/20050703-6-CZ-1902.00034.
[90] D. Potts and C. Sammut, ?Incremental Learning of Linear Model Trees,? Ma-
chine Learning, vol. 61, pp. 5?48, 2005. doi: 10.1007/s10994-005-1121-8.
[91] C. Hametner and S. Jakubek, ?Neuro-Fuzzy Modelling Using a Logistic Dis-
criminant Tree,? in Proceedings of the 2007 American Control Conference,
pp. 864?869, IEEE, 2007. doi: 10.1109/ACC.2007.4283048.
[92] C. Hametner and S. Jakubek, ?Combustion Engine Modelling using an Evolv-
ing Local Model Network,? in 2011 IEEE International Conference on Fuzzy
Systems, pp. 2802?2807, IEEE, 2011. doi: 10.1109/FUZZY.2011.6007357.
243
[93] C. Hametner and S. Jakubek, ?Local model network identification for online
engine modelling,? Information Sciences, vol. 220, pp. 210?225, 2013. doi:
10.1016/j.ins.2011.12.034.
[94] T. Jordan, J. Foster, R. Bailey, and C. Belcastro, ?AirSTAR: A UAV Platform
for Flight Dynamics and Control System Testing,? in 25th AIAA Aerodynamic
Measurement Technology and Ground Testing Conference, AIAA Paper 2006-
3307, San Francisco, California, June 2006. doi: 10.2514/6.2006-3307.
[95] E. Morelli and K. Cunningham, ?Aircraft Dynamic Modeling in Turbulence,? in
25th AIAA Atmospheric Flight Mechanics Conference, AIAA Paper 2012-4650,
Minneapolis, Minnesota, August 2012. doi: 10.2514/6.2012-4650.
244