ABSTRACT

Title of Dissertation: WORKING IN REVERSE: ADVANCING INVERSE OPTIMIZATION IN THE FIELDS OF EQUILIBRIUM AND INFRASTRUCTURE MODELING

Stephanie Ann Allen, Doctor of Philosophy, 2022

Dissertation Directed by: Professor Steven A. Gabriel, Department of Mechanical Engineering, and Professor John P. Dickerson, Department of Computer Science

Transportation and infrastructure modeling allows us to pursue societal aims such as improved disaster management, traffic flow, and water allocation. Equilibrium programming enables us to represent the entities involved in these applications such that we can learn more about their dynamics. These entities include transportation users and market players. However, determining the parameters in these models can be a difficult task because the entities involved in these equilibrium processes may not be able to articulate or to disclose the parameterizations that motivate them. The field of inverse optimization (IO) offers a potential solution to this problem by taking observed equilibria of these systems and using them to parameterize equilibrium models. In this dissertation, we explore the use of inverse optimization to parameterize multiple new or understudied subclasses of equilibrium problems as well as expand inverse optimization's application to new infrastructure domains. In the first project of our dissertation, our contribution to the literature is to propose that IO can be used to parameterize cost functions in multi-stage stochastic programs for disaster management and can be used in disaster support systems. We demonstrate in most of our experiments that using IO to obtain the hidden cost parameters for travel on a road network changes the protection decisions made on that road network when compared to utilizing the mean of the parameter range for the hidden parameters (also referred to as "uniform cost").
The protection decisions made under the IO cost parameterizations versus the true cost parameterizations are similar for most of the experiments, thus lending credibility to the IO parameterizations. In the second project of our dissertation, we extend a well-known framework in the IO community to the case of jointly convex generalized Nash equilibrium problems (GNEPs). We demonstrate the utility of this framework in a multi-player transportation game in which we vary the number of players, the capacity level, and the network topology in the experiments, as well as run experiments assuming the same costs among players and different costs among players. Our promising results provide evidence that our work could be used to regulate traffic flow toward aims such as reducing emissions. In the final project of our dissertation, we explore the general parameterization of the constant vector in linear complementarity problems (LCPs), which are mathematical expressions that can represent optimization, game theory, and market models [75]. Unlike the limited previous work on inverse optimization for LCPs, we characterize theoretical considerations regarding the inverse optimization problem for LCPs, prove that a previously proposed IO solution model can be dramatically simplified, and handle the case of multiple solution data points for the IO LCP problem. Additionally, we use our knowledge regarding LCPs and IO in a water market allocation case study, which is an application not previously explored in the IO literature, and we find that charging an additional tax on the upstream players enables the market to reach a system optimum. In sum, this dissertation contributes to the inverse optimization literature by expanding its reach in the equilibrium problem domain and by reaching new infrastructure applications.
WORKING IN REVERSE: ADVANCING INVERSE OPTIMIZATION IN THE FIELDS OF EQUILIBRIUM AND INFRASTRUCTURE MODELING

by Stephanie Ann Allen

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2022

Advisory Committee:
Professor Steven A. Gabriel, Chair/Co-Advisor
Professor John P. Dickerson, Chair/Co-Advisor
Professor Maria K. Cameron
Professor Kimia Ghobadi
Professor Subramanian Raghavan

© Copyright by Stephanie Ann Allen 2022

Foreword

In accordance with the University of Maryland, College Park Graduate School Doctor of Philosophy Degree Policies, Stephanie Allen states in this foreword that she has made substantial contributions to all of the coauthored work included in this dissertation. The letter at the beginning of this dissertation also speaks to this. The signed copy of this letter has been sent to and approved by the UMD Graduate School.

Acknowledgments

When I started this journey, I knew that it would be arduous, infuriating, and one of the most emotionally trying experiences of my life. And I was right, but it ended up being both much more difficult than I could have imagined and vastly more fulfilling than I could have dreamed. I found confidence in who I am and what I want out of my life during the last five and a half years, and I honestly can say that I am at peace with myself as I finish this journey. I could not have navigated or traversed this long path, though, without many people, and I am grateful for the opportunity to thank them here.

To my family: My mother, in particular, helped me survive the pandemic through many phone calls and virtual art nights. You have always been my biggest cheerleader, and I am truly grateful to have you in my life. My sisters have kept me constantly amused and inspired to continue achieving.
And, to my father, thank you for cooking for me during my visits home and for quietly supporting and believing in me.

To my advisors: Steve, you have been with me from the beginning. I am grateful for the researcher you have inspired me to be, and thank you tremendously for giving me the space and grace I needed during some very difficult moments in the process. John, I know we met later in my Ph.D. years, but thank you for teaching me the importance of communicating ideas clearly and for continuing to inspire me to do research for the social good. And thank you to you both for helping me to cite a bit less compulsively!

To my collaborators: Thank you to Dr. Daria Terekhov, who has pushed me to be a better researcher and who took the time and interest to work with someone completely outside of her university. Nathan Boyd, thank you for being a great friend with whom to get stuck in London and for being a truly wonderful collaborator.

To my professors: Thank you to Dr. Jacob Bedrossian and Dr. Maria Cameron for providing a rigorous first-year series in scientific computation. I'm constantly drawing upon the skills you instilled within me, and I'm grateful for the roles you have both played in my graduate journey. Thank you to Dr. Tom Goldstein and Dr. Howard Elman for informative classes on numerical optimization and numerical linear algebra.

To my friends: Where to begin! I am blessed to have a number of wonderful friends. Luke and Lora, you both provided a refuge when I needed it, and I will never forget your kindness and your grace. Shin, thank you for being an example of perseverance in my life and for introducing me to wonderful food. Sai, you are one of the most generous people I have ever met; thank you for being a kindred spirit with whom I can agree on so many issues. Ishfaaq, Jerry, Noah, and Shashank, thank you for providing so much fun, laughter, and joy during the pandemic over the course of our outdoor get-togethers.
Zack, Ben, Vaughn, Gabriel, Al, and the rest of my board game group, thank you for helping me get through the pandemic and bringing more fun into my life on most Friday nights. Meredith and Paige, thank you for picking up the phone whenever I call to run something by you and for being constants in my life. You are some of my oldest and dearest friends, and I would not be where I am without your support. Anna, thank you for your kindness and your dedication to making the world a better place. Aidan, thank you for visiting me at some key moments in my grad school life and providing a sounding board during those moments. Tony, I know our journey has just begun, but I am grateful we found each other toward the end of my dissertation. It has been a beautiful experience falling for someone as I close this chapter of my life.

To the UMD Mathematics Department: Thank you to both Jessica and Cristina for helping me navigate the bureaucracy of graduate school and for your kindness in doing so.

To Board and Brew: I've spent many hours in your coffee shop working through assignments, working on research, and finding a refuge during difficult moments. Thank you to all the wonderful staff members at this coffee shop for making me feel at home and welcome in this space.

Dedication

And, finally, I dedicate this dissertation to all of those facing barriers in the fight to finish their dissertation journeys and to all of those individuals who are supporting them in those processes. I cannot express enough the magnitude of my gratitude to those who supported me in my struggles as a woman in STEM in the Ph.D. process. I aim to pay this forward.

Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations

Chapter 1: Introduction
1.1 Background: Equilibrium Problems
1.2 Background: Inverse Optimization
1.3 Summary of the Dissertation
1.4 Papers and Presentations

Chapter 2: A Hybrid Inverse Optimization-Stochastic Programming Framework for Network Protection
2.1 Introduction
2.1.1 Contribution
2.2 Literature Review
2.2.1 Inverse Optimization for Transportation Problems
2.2.2 Multi-Stage Disaster Relief Models
2.2.3 Disaster Support Systems
2.3 Hybrid Framework
2.3.1 Data Analysis Component: Inverse Optimization
2.3.2 Normative Model: Two-Stage Stochastic Model
2.4 Experimental Design
2.4.1 Data Generation
2.4.2 Metrics
2.4.3 Procedures for Experiments A and B
2.4.4 Networks and Scenarios
2.4.5 Details of the Experiments
2.5 Experimental Results
2.5.1 Experiments IA-IVA
2.5.2 Experiments IB-IVB
2.6 Conclusions & Future Work
2.7 Chapter Acknowledgements

Chapter 3: Using Inverse Optimization to Learn Cost Functions in Generalized Nash Games
3.1 Introduction
3.1.1 Related Work
3.1.2 Our Contribution
3.2 Preliminaries
3.3 Theoretical Background and Connections
3.3.1 Theoretical Considerations for Generating the Simulation Data
3.3.2 General Residual Model [154]
3.3.3 Generalized Nash Connection to Residual Model
3.4 Experimental Framework
3.5 Experimental Results
3.5.1 Broader Observations about the Results: Part A
3.5.2 Same Randomized Costs: Part A
3.5.3 Different Randomized Costs: Part A
3.5.4 Broad Conclusions: Part B
3.5.5 Same Randomized Costs: Part B
3.5.6 Different Randomized Costs: Part B
3.6 Conclusions & Future Research
3.7 Chapter Acknowledgements

Chapter 4: Inverse Optimization for Parameterization of Linear Complementarity Problems and for Incentive Design in Markets
4.1 Introduction
4.2 Literature Review
4.2.1 Contribution
4.3 Background
4.3.1 Basic Linear Complementarity Problem
4.3.2 Examples for the Different Cases
4.3.3 Basic Inverse Optimization Problem for LCP
4.4 Algorithms for Finding q
4.4.1 The Complete Information Property and Complementary Cone Effects on Possible IO q Solutions
4.4.2 Case 1: $M \in P$
4.4.3 Case 2: Nondegenerate Matrix $M$
4.4.4 Case 3: Positive Semi-Definite $M$
4.4.5 Case 4: Unknown $M$ Matrix Structure
4.5 Case Study: Water Supply Market
4.5.1 Water Supply Background
4.5.2 Water Supply Inverse Optimization
4.6 Conclusions & Future Work
4.7 Chapter Acknowledgements

Chapter 5: Conclusion and Future Work

Appendix A: A Hybrid Inverse Optimization-Stochastic Programming Framework for Network Protection
A.1 Data Analysis Component: Inverse Optimization
A.1.1 Proof of $d_w - Nx_w = 0$
A.1.2 Explanation of Forming the Inverse Model from Bertsimas et al. [21]
A.2 Normative Model: Two-Stage Stochastic Model
A.2.1 Two-Stage Stochastic Model: Parameter Details
A.2.2 Calculating the $M^{k,s}_{i,j}$ Values
A.3 Results
A.3.1 Flow Error under IO $\alpha$
A.3.2 Median/Min-Max Tables and Nguyen & Dupuis Boxplots
A.3.3 Run Time Results
A.4 Code Attribution

Appendix B: Using Inverse Optimization to Learn Cost Functions in Generalized Nash Games
B.1 Variable Number Calculations for Lemma 1
B.2 Uniqueness Considerations for the VI
B.3 Proof of Lemma 3
B.4 Counterexample and MATLAB Code for Lemma 4
B.5 Additional Experimental Set-Up Details
B.6 Positive Definiteness of Different Costs Matrices with MATLAB rng(5)
B.7 Objective Function Values for the Experiment Groups
B.8 Timing for the Experiment Groups
B.9 Part B: Additional Results
B.10 Code Attribution

Appendix C: Inverse Optimization for Linear Complementarity Problems and for Incentive Design in Markets
C.1 Principal Minors Algorithm
C.2 Full q Table from Aggregate z Solution IO Quadratic Program Solve
C.3 Verifying the q for System Optimal IO Solve
C.4 Code Attribution
Bibliography

List of Tables

2.1 Experiment Descriptions
2.2 Experiment Matrix
2.3 Means, Means as Percentage of I = 6 Budget, and 99% Confidence Intervals (CI) for Experiments IA-IVA, $\epsilon = 0.01$
2.4 Means, Means as Percentage of I = 6 Budget, and 99% Confidence Intervals (CI) for Experiments IA-IVA, $\epsilon = 0.001$
3.1 Experimental Details. Here, # refers to the experiment group number, Network refers to the graph upon which the experiment was run, C Bounds indicates the range for randomization and the bounds used in the IO mathematical program for the diagonal of $C$, $\bar{c}$ Bounds indicates the same for the $\bar{c}$ parameters, and $\alpha$ refers to that parameter value. These experiments are repeated for both the part A and part B results.
4.1 q Vectors and Associated z Solutions for Example 1.1
4.2 q Vectors and Associated z Solutions for Example 2.1
4.3 q Vectors and Associated z Solutions for Example 2.2
4.4 q Vectors and Associated z Solutions for Example 3.2
4.5 q Vectors and Associated z Solutions for Example 4.1
4.6 Parameter Values for Three Node Model
4.7 PATH Solver [54, 70] Generated Market Solutions
4.8 q from Observed z Table
4.9 Comparing z Solutions Between Market and Aggregate Models
4.10 Sum of Objective Function Values Market vs Aggregate Model (cost/day)
4.11 Individual Objective Function Values Market vs Aggregate Models (cost/day)
4.12 Selected IO q Entries for GCM and CSM Markets Using Aggregate z Solution
A.1 Flow Errors for BPR Functions on the Two Networks
A.2 Medians, Medians as Percentage of I = 6 Budget, and Ranges for Experiments, $\epsilon = 0.01$
A.3 Medians, Medians as Percentage of I = 6 Budget, and Ranges for Experiments A, $\epsilon = 0.001$
C.1 q Recovered from IO for GCM and CSM Markets Using One Aggregate z Solution

List of Figures

1.1 Commonalities Between the Three Dissertation Projects
2.1 Illustrative Road Networks Used for Experiments
2.2 Scenarios for Experiments I & II
2.3 Scenarios for Experiments III & IV
2.4 Experiment IA Results: 4x4 Grid Network with Linear Cost. The parameter differences refer to the $\phi$ differences.
2.5 Experiment IIA Results: 4x4 Grid with BPR Function. Parameter differences here refer to the $\alpha$ differences.
2.6 Experiment IB All OD Pairs
2.7 Experiment IIB All OD Pairs
2.8 Experiment IIIB All OD Pairs
2.9 Experiment IVB All OD Pairs
2.10 Flow Error for Linear and BPR on 4x4 Directed Grid Network: Note a few of the BPR flow error values fell outside of the 0-0.2 flow error range.
2.11 Flow Error for Linear and BPR on N & D Directed Grid Network: Note some of the BPR flow error values fell outside of the 0-0.2 flow error range.
3.1 Framework for the Transportation Game
3.2 Networks Utilized for Testing the Inverse Optimization Framework
3.3 Flow Error Metrics for Experiment Group 1: The labeling at the bottom of the graphs indicates attributes of the boxplot, specifically the Grid Size/Number of Players/Alpha Value
3.4 Flow Error Metrics for Experiment Group 2: The labeling at the bottom of the graphs indicates attributes of the boxplot, specifically the Number of Players/Alpha Value
3.5 Flow Error Metrics for Experiment Group 3: The labeling at the bottom of the graphs indicates attributes of the boxplot, specifically the Number of Players/Alpha Value. Note: Only 6 trials are included for 5/10/5.0; see note in the text.
3.6 Flow Error Metrics for Experiment Group 4: The labeling at the bottom of the graphs indicates attributes of the boxplot, specifically the Number of Players/Alpha Value. Note: The trials did not finish in time for N = 10 and $\alpha = 5$, hence that boxplot is not included.
3.7 2×2 Grid Train Test Same Costs
3.8 4×4 Grid Train Test Same Costs
3.9 Sioux Falls Train Test Same Costs
3.10 2×2 Grid Train Test Different Costs
3.11 4×4 Grid Train Test Different Costs
3.12 Sioux Falls Train Test Different Costs
4.1 Linear Program Diagram for Example 3.1
4.2 $Q^{IO}_z$ as a Subset of the Corresponding Complementary Cone for $z$
4.3 Overlapping Complementary Cones
4.4 Example 2.2 Complementary Cones
4.5 Three Node Line Graph Flow
A.1 Experiment IIIA Results: Nguyen & Dupuis Network with Linear Cost. The parameter differences refer to the $\phi$ differences.
A.2 Experiment IVA Results: Nguyen & Dupuis Network with BPR. The parameter differences refer to the $\alpha$ differences.
A.3 Experiment A Timing Results (Minutes)
B.1 Minimum Eigenvalues for 10 Trials, N = 2-15, rng(5), and Arc Number = 76, which is the same number of arcs as Sioux Falls
B.2 Objective Function Values for Experiment Group 1: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Grid Size/Number of Players/Alpha Value
B.3 Objective Function Values for Experiment Group 2: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Number of Players/Alpha Value
B.4 Objective Function Values for Experiment Group 3: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Number of Players/Alpha Value. Note: Only 6 trials are included for 5/10/5.0; see note in the text.
B.5 Objective Function Values for Experiment Group 4: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Number of Players/Alpha Value. Note: The trials did not finish in time for N = 10 and $\alpha = 5$, hence that boxplot is not included.
B.6 Timing for Experiment Group 1: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Grid Size/Number of Players/Alpha Value
B.7 Timing for Experiment Group 2: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Number of Players/Alpha Value
B.8 Timing for Experiment Group 3: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Grid Size/Number of Players/Alpha Value. Note: Only 6 trials are included for 5/10/5.0; see note in the text.
B.9 Timing for Experiment Group 4: The labeling at the bottom of the graph indicates attributes of the boxplot, specifically the Number of Players/Alpha Value. Note: The trials did not finish in time for N = 10 and $\alpha = 5$, hence that boxplot is not included.
B.10 3×3 Grid Train Test Same Costs
B.11 5×5 Grid Train Test Same Costs
B.12 3×3 Grid Train Test Different Costs
B.13 5×5 Grid Train Test Different Costs

List of Abbreviations

BPR Bureau of Public Roads
CI Confidence Intervals
CSM Cost-Sharing Market
DSS Disaster Support System
FEMA Federal Emergency Management Agency
GAMS General Algebraic Modeling System
GCM General Commodity Market
GIS Geographic Information System
GNEP Generalized Nash Equilibrium Problem
IO Inverse Optimization
KKT Karush–Kuhn–Tucker
LCP Linear Complementarity Problem
MATLAB Matrix Laboratory
MCP Mixed Complementarity Problem
MIP Mixed-Integer Program
MPEC Mathematical Program with Equilibrium Constraints
N & D Nguyen & Dupuis
OD Origin-Destination
PH Progressive Hedging
QVI Quasi-variational Inequality
RHS Right Hand Side
SNPP Stochastic Network Protection Problem
VI Variational Inequality

The Graduate School, University of Maryland
2123 Lee Building, 7809 Regents Drive
College Park, MD 20742

Dear Dean Fetter,

We provide this letter at the beginning of Stephanie Allen's dissertation to confirm that she has made a substantial contribution to the coauthored work contained in this dissertation. This dissertation is composed of three projects. Project 1 is entitled "A Hybrid Inverse Optimization-Stochastic Programming Framework for Network Protection," and it was authored by Stephanie Allen, Daria Terekhov, and Steven Gabriel. For project 1, Stephanie Allen did the vast majority of the work on this project, from framework conception to code writing to paper writing. She received some advice from Daria Terekhov on the implementation of the train-test part of the experiments, but she had brought the idea for these types of experiments herself. Daria Terekhov and Steven Gabriel provided editorial feedback. Project 2 is entitled "Using Inverse Optimization to Learn Cost Functions in Generalized Nash Games," and it was authored by Stephanie Allen, Steven Gabriel, and John Dickerson.
Stephanie Allen did the vast majority of the work on this project, including extending the inverse optimization framework discussed therein to jointly convex generalized Nash equilibrium problems, forming the transportation game, writing the code, and writing the paper. Steven Gabriel worked out one of the proofs and one of the counterexamples in the Appendix, and he discussed a paper in one of his classes that Allen found very useful in extending the IO framework to jointly convex generalized Nash equilibrium problems. Steven Gabriel and John Dickerson provided editorial comments on the paper. Finally, project 3 is entitled "Inverse Optimization for Parameterization of Linear Complementarity Problems and for Incentive Design in Markets," and it was authored by Stephanie Allen, Steven Gabriel, and Nathan Boyd. Stephanie Allen did the vast majority of the work on this project, including the new theory and proofs, the majority of the coding for the case study, and the writing of the paper. Steven Gabriel provided some suggestions for additional avenues to explore for the theory as well as provided much editorial feedback. Nathan Boyd helped Stephanie Allen with understanding the application behind the case study as well as with interpreting results obtained from solving the linear complementarity problems involved. Nathan Boyd also provided editorial feedback on the case study section. We provide the signatures of the co-advisors of Stephanie Allen as well as the signature of her Graduate Program Director.

Sincerely,

Stephanie Allen
Dr. Steven A. Gabriel, Co-Advisor
Dr. John P. Dickerson, Co-Advisor
Dr. Howard Elman, AMSC Graduate Director

Chapter 1: Introduction

Transportation and infrastructure modeling allows us to pursue societal aims such as improved disaster management, traffic flow, and water allocation. Equilibrium programming enables us to represent the entities involved in these applications such that we can learn more about their dynamics.
These entities include transportation users and market players. However, determining the parameters in these models can be a difficult task because the entities involved in these equilibrium processes may not be able to articulate or to disclose the parameterizations that motivate them. The field of inverse optimization (IO) offers a potential solution to this problem.

1.1 Background: Equilibrium Problems

Before discussing inverse optimization, we provide some background on equilibrium problems in general. First, we describe the concept of a Nash equilibrium, which applies to games in which the players have some kind of "market power" or, in other words, the ability to affect each other and the outcome of the game [75]. Indeed, with entities and companies in our world gaining more market power over time, Nash games are becoming more and more relevant. For convex $K_i$ and $f_i(x_i, x_{-i})$ convex in $x_i$, we can write player $i$'s problem as [75]:

$$\min_{x_i \in K_i} f_i(x_i, x_{-i}) \tag{1.1.1}$$

According to Gabriel et al. [75], a Nash equilibrium $x^*$ "has the property that, for every $i$, $x_i^*$ solves (1.1.1) given $x_{-i} = x_{-i}^*$." Essentially, this means that $x^*$ is a Nash equilibrium if each $x_i^*$ is the optimal solution to player $i$'s optimization problem assuming that the other players in the game do not change their decision variables. We can find a Nash equilibrium for the above problem using a relationship called a variational inequality, in which we aim to find an $x^*$ such that [75]:

$$G(x^*)^T (x - x^*) \geq 0, \quad \forall x \in K = \prod_i K_i \tag{1.1.2a}$$

$$G(x^*) = \begin{pmatrix} \nabla_{x_1} f_1(x^*) \\ \vdots \\ \nabla_{x_I} f_I(x^*) \end{pmatrix} \tag{1.1.2b}$$

Thus, there is an equivalence between a solution to this variational inequality (VI) and a Nash equilibrium point for the multi-player game [75]. The variational inequality can also represent other equilibrium phenomena such as the traffic equilibrium principle of Wardrop [190].
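As a minimal numerical sketch of how (1.1.1) and (1.1.2) interact, consider a two-player game with quadratic costs and $K_i = \mathbb{R}$, so that the VI condition collapses to $G(x^*) = 0$. The cost coefficients below are hypothetical, chosen only for illustration; this is not an example from the dissertation's experiments.

```python
import numpy as np

# Two-player Nash game (illustrative numbers only):
#   player i solves  min_{x_i in R}  0.5*a_i*x_i^2 + b*x_i*x_j + c_i*x_i.
# With K_i = R, the VI condition G(x*)^T (x - x*) >= 0 for all x
# collapses to G(x*) = 0, i.e., the stacked gradients vanish.
a = np.array([2.0, 3.0])    # curvature of each player's cost
b = 0.5                     # interaction between the two players
c = np.array([-4.0, -6.0])  # linear cost coefficients

# Stacked first-order conditions: a_i*x_i + b*x_j + c_i = 0 for i = 1, 2.
A = np.array([[a[0], b],
              [b,    a[1]]])
x_star = np.linalg.solve(A, -c)

# Equilibrium check: holding the rival fixed at its component of x_star,
# each player's unilateral best response recovers its own component.
def best_response(i, x_rival):
    return -(b * x_rival + c[i]) / a[i]

assert all(np.isclose(best_response(i, x_star[1 - i]), x_star[i]) for i in (0, 1))
```

Because each player's cost is strictly convex in that player's own variable, solving the stacked first-order system is equivalent to solving the VI here; with nontrivial constraint sets $K_i$, one would instead work with the complementarity formulation discussed next.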
Smith [167] proves that an adjusted version of this variational inequality problem encodes the properties of the Wardrop equilibrium principle, with the G function representing the cost of flow on either paths or arcs. We can then transform the VI into what is known as a mixed complementarity problem. Given that G and g are convex and continuously differentiable, h is affine, and a constraint qualification holds, both the following VI and MCP-VI produce the same solution sets [75, 93].

VI(K, G): Find x* ∈ K such that:

G(x*)^T (x − x*) ≥ 0,  ∀x ∈ K = {x ∈ R^n | g(x) ≤ 0, h(x) = 0}    (1.1.3)

MCP-VI: Find x*, v* ∈ R^m, and β* ∈ R^p such that:

G(x*) + ∇g(x*)^T v* + ∇h(x*)^T β* = 0,  x is free    (1.1.4a)
0 ≥ g(x*) ⊥ v* ≥ 0    (1.1.4b)
h(x*) = 0,  β* is free    (1.1.4c)

The value of being able to transform the VI into a mixed complementarity problem (MCP) is that it allows us to stack the KKT conditions of Nash problems into one MCP and use the PATH solver [54] in GAMS to solve the problem [75]. Sometimes, we are able to simplify the MCP into what is known as a linear complementarity problem (LCP), which can be defined for nonnegative variables z ∈ R^n_{≥0} and parameters M ∈ R^{n×n} and q ∈ R^n as follows [48]:

0 ≤ z ⊥ Mz + q ≥ 0    (1.1.5)

The ⊥ indicates that the inner product between the two vectors on either side of it equals zero. We further explore the parameterization of this LCP in Chapter 4.

1.2 Background: Inverse Optimization

Inverse optimization allows a user to parameterize particular functions in optimization and equilibrium problems using solutions to these problems [3, 21, 205]. This means we can take observed data regarding phenomena in the world and use this data to find the parameters that define the motivating functions at the heart of the models that represent the phenomena. Mathematically, we represent this as follows.
For optimization problems, we have the classic problem of, given some function f(x) : R^n → R and a set S, finding x that solves the following problem:

min_x f(x)    (1.2.1a)
s.t. x ∈ S    (1.2.1b)

Conversely, in part of the inverse optimization literature, we take a solution x [3, 37] or a set of solutions X [13, 112] and use them to find the parameters of the function f(x). Another part of the inverse optimization literature uses the solutions to find the parameters for the constraints defining S [36, 82, 161, 174]. What's more, additional literature has explicitly explored dealing with noisy solution(s) [8, 132], with the statistical properties of the inverse optimization problem [8], and with robust versions of this problem [81]. For equilibrium problems, we use the variational inequality (VI) formulation from Bertsimas et al. [21] to describe inverse optimization for these types of problems. In a VI problem, we find an x̂ ∈ R^n such that the following relationship is satisfied for a function F(x) : R^n → R^n and a set ℱ ⊆ R^n:

F(x̂)^T (x − x̂) ≥ 0,  ∀x ∈ ℱ    (1.2.2)

We know from Gabriel et al. [75] that the VI generalizes many game theory and market problems, which are themselves equilibrium problems, and generalizes their mathematical structures, including linear complementarity problems (LCPs) and mixed complementarity problems (MCPs). As Bertsimas et al. [21] demonstrate, one part of the inverse optimization literature aims to parameterize the F(x) function using solution(s) to this problem [4, 21, 154]. Other work focuses upon parameterizing both F(x) and ℱ [103]. See [38] for a recent review of the inverse optimization literature. Our work falls into the equilibrium category of inverse optimization.

1.3 Summary of the Dissertation

In this dissertation, we explore the use of inverse optimization to parameterize multiple new or understudied subclasses of equilibrium problems as well as expand inverse optimization's application to new infrastructure domains.
Indeed, we pursue a combination of extending methods in the literature after being inspired by applications and applying methods in the literature to new applications. Using inverse optimization to parameterize new or understudied subclasses of equilibrium problems is important because it allows us to parameterize the entities involved in these systems and thus better understand their motivations. Introducing inverse optimization to new applications extends the field's use and involves more communities in its development. This dissertation is composed of three projects in total, which comprise the next three chapters. In the first project of our dissertation [5], our contribution to the literature is to propose that IO can be used to parameterize cost functions in multi-stage stochastic programs for disaster management and can be used in disaster support systems. Specifically, the multi-stage stochastic program that we parameterize is a challenging problem because it is a stochastic optimization problem with a traffic equilibrium problem embedded within it, forming a mathematical program with equilibrium constraints (MPEC). MPECs are mathematical programming structures that embed equilibrium problems and/or optimization problems within their constraints [75]. We use inverse optimization on a separate traffic equilibrium problem to obtain parameters for the travel cost functions inherent in the objective function and equilibrium constraints of this MPEC. Through computational experimentation, we demonstrate in most of our experiments that using IO to obtain the hidden cost parameters for travel on a road network changes the protection decisions made on that road network when compared to utilizing the mean of the parameter range for the hidden parameters (also referred to as "uniform cost").
The protection decisions made under the IO cost parameterizations versus the true cost parameterizations are similar for most of the experiments, thus lending credibility to the IO parameterizations. In the second project of our dissertation [4], we extend a well-known framework in the IO community to the case of jointly convex generalized Nash equilibrium problems (GNEPs). In particular, we take the inverse optimization techniques of Keshavarz et al. [112] and Ratliff et al. [154] and integrate an additional term to account for the jointly convex constraints, justifying and proving this addition using GNEP and VI theory. We demonstrate the utility of this framework in a multi-player transportation game in which we vary the number of players, the capacity level, and the network topology in the experiments as well as run experiments assuming the same costs among players and different costs among players. Our promising results provide evidence that our work could be used to regulate traffic flow toward aims such as reduction of emissions. In the final project of our dissertation, we explore the general parameterization of the constant vector in linear complementarity problems (LCPs), which are mathematical expressions that can represent optimization, game theory, and market models [75]. Unlike the limited previous work on inverse optimization for LCPs, we characterize theoretical considerations regarding the inverse optimization problem for LCPs, prove that a previously proposed IO solution model can be dramatically simplified, and handle the case of multiple solution data points for the IO LCP problem. What's more, we use our knowledge regarding LCPs and IO in a water market allocation case study, which is an application not previously explored in the IO literature, and we find that charging an additional tax on the upstream players enables the market to reach the system optimal.
In sum, this dissertation contributes to the inverse optimization literature by expanding its reach in the equilibrium problem domain and by addressing new infrastructure applications. These three projects together draw upon similar themes and have some overlapping characteristics, which we demonstrate in the following Venn diagram:

Figure 1.1: Commonalities Between the Three Dissertation Projects

As can be seen and as has been demonstrated throughout this Introduction, all three projects share the common theme of inverse optimization. They also all deal with some kind of infrastructure application, with project 1 dealing with road network protection, project 2 employing a traffic game among multiple players, and project 3 exploring a water market application. Projects 1 and 2 have the commonality of embarking on intensive computational experiments to validate the approaches. Projects 1 and 3 both apply inverse optimization to a new application area, with project 1 demonstrating inverse optimization for disaster relief and project 3 employing inverse optimization in the context of water markets. Projects 2 and 3 extend inverse optimization techniques and/or theory to new or understudied subclasses of equilibrium problems, with project 2 addressing jointly convex generalized Nash equilibrium problems (GNEPs) and project 3 exploring linear complementarity problems (LCPs). Indeed, we can see the progression and maturation process of the research through these three projects as well as coverage of many topics and concepts in optimization. We start at project 1 with a new application idea and a difficult computational task in the form of solving a stochastic MPEC, touching on areas including stochastic programming and handling complementarity constraints embedded in optimization problems.
We advance through project 2 in which we take a theoretical step in our research and advance inverse optimization in a more general sense to a new subclass of equilibrium problems, stretching our understanding of building blocks of equilibrium models including game theory, variational inequalities, and mixed complementarity problems. This project also involved a large computational study. Indeed, in both of the first two projects, we learned strategies and techniques for approaching large computational studies. Finally, in the culmination of our research, we embark upon a much more difficult theoretical task: the parameterization of linear complementarity problems, which encompass optimization, game theory, and market models. In approaching this task, we consider the multiplicity of solutions to these problems and broader inverse optimization considerations such as the implications of moving between the original problem and the inverse optimization problem. What's more, in project 3, we approach a new application and, in that case study, we demonstrate the power of inverse optimization to impact policy directly through using system optimal solutions to parameterize markets and, thus, encourage the addition of new taxes or subsidies. Indeed, we provide a framework through which to parameterize LCPs, touching on such topics as complementarity cones and matrix analysis, and a demonstration of the capabilities of inverse optimization to influence markets and policy in profound ways. Therefore, to conclude, we showcase in this dissertation that the worlds of theory and application must be in conversation with each other in order to advance our understanding of both, thus underscoring the importance of this broad field we call Applied Mathematics.

1.4 Papers and Presentations

Before we progress to the rest of the dissertation, we outline the papers and presentations that have been produced as a result of this dissertation.
• Project 1: [5]
  – Papers: Allen, Stephanie, Daria Terekhov, and Steven A. Gabriel. "A Hybrid Inverse Optimization-Stochastic Programming Framework for Network Protection." arXiv preprint arXiv:2110.00488 (2021). (Under review by a journal)
  – Presentations: INFORMS 2020 & Trans-Atlantic Infraday Conference 2021
• Project 2: [4]
  – Papers: Allen, Stephanie, Steven A. Gabriel, and John P. Dickerson. "Using inverse optimization to learn cost functions in generalized Nash games." Computers & Operations Research (2022): 105721.
  – Presentations: INFORMS 2021, ECOM 2021, & NTNU Winter School 2022
• Project 3:
  – Papers: Allen, Stephanie, Steven A. Gabriel, & Nathan T. Boyd, "Inverse Optimization for Parameterization of Linear Complementarity Problems and for Incentive Design in Markets," Available on arXiv soon.

Chapter 2: A Hybrid Inverse Optimization-Stochastic Programming Framework for Network Protection

Note: The contents of this chapter come mostly from a paper we have published on arXiv, referenced here: [5]. There are certain additions that were made in response to the first round of reviews from an academic journal. We have received the second round of reviews, which will not greatly change the contents of the chapter; thus, we will address these comments post-dissertation. We have presented this work at INFORMS 2020 and at the Trans-Atlantic Infraday Conference in 2021. We thank our co-authors on this chapter: Daria Terekhov and Steven Gabriel. Stephanie Allen did the vast majority of the work on this project, from framework conception to code writing to paper writing. She received some advice from Daria Terekhov on implementation of the train-test part of the experiments, but she had come up with the idea for these types of experiments herself. Daria Terekhov and Steven Gabriel provided editorial feedback.

2.1 Introduction

Given the threat of natural disasters, it is imperative that communities and nations prepare in order to mitigate the consequences.
According to NOAA [139], in the year 2020, there were 22 "weather and climate disasters" that cost 1 billion US dollars or more in the United States, and 262 people died in these disasters. Governments and planning agencies often have little foresight of the type of disaster that might strike, meaning they must be prepared for many different potential events. To be effective, governments must make strategic decisions before and immediately after crises such that financial and human costs are minimized. Indeed, in a report by the Federal Emergency Management Agency (FEMA) entitled "National Response Framework" [68], officials state there are five mission areas to this response framework, which are prevention, protection, mitigation, response, and recovery, thus indicating that the US government actively intervenes at the various decision points in the disaster management cycle. We note that we use the words "disaster" and "crisis" interchangeably in this chapter. As a response to these events, researchers have developed disaster support systems (DSS), which include systems that evacuate people out of tunnels [6], deliver supplies to affected individuals [73], decide where to place supplies in anticipation of a disaster [157], and extinguish wildfires [201]. Wallace and De Balogh [187] define DSSes as systems which contain "a data bank, a data analysis capability, normative models, and technology for display and interactive use of the data and models." We focus on the data analysis and normative model elements of these requirements which, respectively, mathematically examine and present information for decision makers and help make decisions. One type of normative model used in the disaster management community which we employ in this chapter is a multi-stage stochastic program. This type of model is proposed as a way to make decisions regarding protecting networks against disasters or bringing supplies to communities after disasters.
The 2013 US National Infrastructure Protection Plan [52] proposes seven principles for organizations involved in critical infrastructure, one of which states: "Risk should be identified and managed in a coordinated and comprehensive way across the critical infrastructure community to enable the effective allocation of security and resilience resources." This principle supports the work of the aforementioned multi-stage programs for disaster relief. Indeed, in the report, the need for incorporating "security and resilience into the design and operation of assets, systems, and networks" is stated as part of the way to reduce vulnerabilities in infrastructure, which requires accurate representations of these structures. The misspecification of the parameters defining these structures can severely limit the usefulness of the multi-stage programs for disaster relief. In particular, in this chapter, we focus on transportation cost parameters, whose misspecification could lead to incorrect protection decisions, such as allocating too few or too many resources to parts of road networks affected by landslides and flash floods. These incorrect protection decisions could happen if, for instance, cost parameters are assigned such that more traffic is assumed to be using a road than actual data shows, which in turn leads to more protective measures being allocated to this road rather than to the road that actually sees more traffic. We propose using inverse optimization to recover these cost parameters. Inverse optimization (IO) allows a user to parameterize particular functions in optimization and equilibrium problems using solutions to these problems [3, 21, 205]. This means we can take observed data regarding phenomena in the world and use this data to find the parameters that define the motivating functions at the heart of the models that represent the phenomena.
For this chapter, we focus on parameterizing the cost functions of a traffic equilibrium model, which means we take data regarding traffic flow and find the values of the cost function parameters that induced that flow. Inverse optimization has two main advantages over other parameter estimation approaches: (a) a user is able to employ a model of an entire system, such as an optimization or equilibrium problem with constraints [3, 21], when performing parameter estimation, which explicitly allows us to incorporate knowledge of the constrained model that generated the data into the estimation method, and (b) the method is more intuitively understandable than "black box" methods such as deep learning (see [84] for more information on deep learning). There are techniques such as structural estimation that can be compared to inverse optimization but, as [21] outline in Appendix 2 of their paper, there are several advantages of an inverse optimization formulation over structural estimation, which include (a) allowing for approximate equilibria and (b) formulating a less complicated problem (because there are no bilinear terms; see [21] for more details).

2.1.1 Contribution

To our knowledge, neither the DSS literature nor the multi-stage stochastic program for disaster relief literature has explored inverse optimization as a tool for estimating model parameters. We propose inverse optimization as a new approach for data analysis and demonstrate its ability to recover similar protection decisions as the originally parameterized stochastic network protection model. We also demonstrate that accurate knowledge regarding the cost functions matters because, for most of our experiments, it changes protection decisions when compared to the assumption of uniform cost. The fact that our protection decisions differ when we have more informed knowledge of hidden traffic parameters than when we have less information is itself significant.
Indeed, having more accurate information enables us to protect the arcs that have the greatest impact on traffic flow. This chapter's goal is to demonstrate the joint inverse optimization and stochastic programming framework on smaller networks and simplified situations, leaving more realistic networks and situations to future work. The rest of the chapter is organized as follows. Section 2.2 investigates the literature related to our problem. Section 2.3 provides appropriate mathematical background. Section 2.4 explains the experimental structure, and Section 2.5 explains the results. Section 2.6 discusses our conclusions and ideas for future work.

2.2 Literature Review

The literature in this section demonstrates that multi-stage disaster relief models and DSS models have not used inverse optimization previously.

2.2.1 Inverse Optimization for Transportation Problems

Although inverse optimization has not been previously utilized in disaster relief, it has been used to parameterize cost functions in the transportation literature, which is relevant to our framework because we use the techniques from this literature to parameterize the cost functions in the stochastic network protection problem. Within the transportation literature, Thai et al. [177] use a mathematical program with equilibrium constraints to minimize the difference between the simulated solutions and optimal solutions to the traffic equilibrium problem as a way of recovering the specified cost function parameters. Thai et al. [176] use a combination of methods by [21] and [40] to create a multi-objective program that minimizes the duality gap for the variational inequality and the difference between the optimal and observed solutions. Bertsimas et al. [21] use their inverse variational inequality problem along with kernel methods to estimate the cost functions. Zhang et al. [206] and Zhang et al.
[207] follow [21]'s methodology, with [206] involving different categories of vehicles and [207] emphasizing recovering both cost functions and origin-destination matrices from real-world traffic data. Chow et al. [43] use techniques from [3] but augment them to handle the nonlinear nature of their problem. Finally, Allen et al. [4] extend [154]'s parameterization framework for multi-player Nash problems to the case of jointly convex generalized Nash equilibrium problems and demonstrate this framework by parameterizing a transportation game.

2.2.2 Multi-Stage Disaster Relief Models

There is a substantial literature on multi-stage stochastic programs for disaster relief and protection; see [85]. Methods for estimating cost functions in road networks include fuzzy numbers, Euclidean distances, road distance data, the Bureau of Public Roads (BPR) function (see Section 2.3.1.1), and stochastic programming. None of them use inverse optimization to estimate costs, which is what we propose in this chapter. Zheng et al. [208] estimate cost parameters for moving supplies after natural disasters using fuzzy numbers for the time it takes to traverse between the supply and demand nodes. Barbarosoglu and Arda [16] use road information and Euclidean distances between points for the cost parameters in their stochastic model pertaining to distributing supplies after natural disasters. Chu et al. [44] also calculate the travel cost for several routes/paths of origin-destination pairs to capture the idea that one or more routes could fail in a disaster in their stochastic network protection model. Travel cost is measured by a variable that takes on real numbers between 0 and 1 and measures how close to the shortest path the route taken by the demand of an OD pair is allowed to be through the network. Noyan et al. [144] and Doyen et al. [56] use a mixture of road data along with scenario-dependent costs to form their cost functions, while Mohammadi et al.
[133] use exclusively scenario-dependent costs. Fan and Liu [65] employ the BPR function for the arc costs in their stochastic network protection problem, but no stochastic parameters are involved. For the rest of the BPR literature, either the capacity is impacted by the protection decisions and/or some of the parameters in the BPR function are stochastic [7, 66, 67, 124, 125]. For these multi-stage stochastic programs, inverse optimization methods would have captured a set of parameters that led to given flow patterns, which could have been used to augment the existing cost function approaches.

2.2.3 Disaster Support Systems

We focus on reviewing DSSes that have a data analysis step in their processes. First, there are disaster support system papers that determine important quantities and parameters via simulation and/or physical models of the situation [6, 50, 59, 73, 118, 162, 178, 182, 201, 202]. Second, DSS papers can also determine parameters via data processing as in [72, 101, 204], or they can utilize machine learning to determine modeling structures as in [1]. Other DSS papers use geographic information system (GIS) techniques to estimate parameters [45, 157]. Horita et al. [100] combine GIS and sensor information to estimate parameters. In addition, some papers propose data fusion techniques such as ensemble Kalman filters [148] and gradient-based methods [111]. Inverse optimization allows a user to propose a model of the system and parameterize the model using data/simulated solutions¹ and optimality conditions [3, 21, 37, 205], which can augment the information gained from data and, thus, could be useful for DSSes. However, as can be seen from this review, DSSes have not used inverse optimization for data analysis.

¹See [171] for an example using proposed solutions.
2.3 Hybrid Framework

In the next few sections, we will explain our data analysis component (which presents information for decision makers) along with our normative model (which helps make decisions). The data analysis component employs Bertsimas et al.'s [21] work on parameterizing cost functions for the traffic equilibrium problem, which uses traffic data to find the parameters for the cost functions involved in this model. We utilize the techniques from Bertsimas et al. [21] on inverse optimization because Bertsimas et al. [21] handle equilibrium problems, which are an important component of the normative model. The normative model comes from Fan and Liu [65], who suggest a two-stage network protection problem with equilibrium constraints to make protection decisions for road networks. We propose pairing the two components together in the following sequence of steps, with θ representing the collection of parameters to be estimated by the inverse optimization model:

1. Input data x̂_j, j = 1, ..., J into inverse optimization model (2.3.4) and obtain θ.
2. Form the stochastic network protection problem (SNPP) (2.3.8) with θ.
3. Solve the SNPP (2.3.8) and obtain protection decisions u.

2.3.1 Data Analysis Component: Inverse Optimization

Bertsimas et al. [21] utilize variational inequalities (VI) to represent optimization and equilibrium problems. They assume that the following extended variational inequality describes an ϵ-equilibrium of a system, with F : R^n → R^n, a set ℱ ⊂ R^n, x ∈ ℱ, and ϵ ∈ R_+ [21]:

F(x*)^T (x − x*) ≥ −ϵ,  ∀x ∈ ℱ    (2.3.1)

When ϵ = 0, we recover the classical VI that was identified in Chapter 1. In the case of our traffic application, we assume that F is the cost function, representing the time per vehicle along each arc in the set A of arcs [124, 125]. Therefore, expression (2.3.1) states that x* solves the extended VI if the inner product between F at x* and the difference between any point in ℱ and x* is greater than a small, negative number.
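To make condition (2.3.1) concrete, the following sketch (our own toy instance, with invented parameters rather than the chapter's networks) computes the Wardrop equilibrium on two parallel arcs with linear costs in closed form and checks that the VI condition holds with ϵ = 0 over a grid of feasible flow splits:

```python
import numpy as np

# Toy instance: two parallel arcs from one origin to one destination,
# total demand d, and linear arc costs F_a(x) = phi_a*x_a + beta_a.
phi = np.array([2.0, 1.0])
beta = np.array([1.0, 4.0])
d = 5.0

def F(x):
    return phi * x + beta

# Wardrop equilibrium with both arcs used: costs equalize and flows sum
# to d, so phi_1*x_1 + beta_1 = phi_2*(d - x_1) + beta_2.
x1 = (phi[1] * d + beta[1] - beta[0]) / (phi[0] + phi[1])
x_star = np.array([x1, d - x1])

# Check F(x*)^T (x - x*) >= -eps with eps = 0 for feasible points
# x = (t, d - t), t in [0, d]. Here both arcs carry flow with equal
# cost, so the inner product is exactly zero for every feasible x.
gaps = [F(x_star) @ (np.array([t, d - t]) - x_star)
        for t in np.linspace(0.0, d, 51)]
min_gap = min(gaps)
```

With these numbers the equilibrium is x* = (8/3, 7/3) and both arc costs equal 19/3; if one arc were unused, the inner product would instead be strictly positive at points loading that arc.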
To discuss the traffic equilibrium problem, we need to define the F function and the ℱ set, which requires introducing some new notation. Bertsimas et al. [21] describe the Wardrop traffic equilibrium with nodes N and arcs A as having a node-arc incidence matrix N ∈ {−1, 0, 1}^{|N|×|A|} [129], vectors d^w ∈ R^{|N|}, for origin-destination pairs w in the set W, with a negative entry for the origin and a positive entry for the destination [129], and a feasible set ℱ. Bertsimas et al. [21] define the ℱ set as:²

ℱ = { x : ∃ x^w ∈ R^{|A|}_+ s.t. x = Σ_{w∈W} x^w, N x^w = d^w ∀w ∈ W }    (2.3.2)

²We define the ℱ set for the inverse optimization model differently; see Appendix A.1.2.

in which x^w ∈ R^{|A|}_+ represents the flow between origin and destination w and x ∈ R^{|A|}_+ represents the composite flow vector. The corresponding F function for the variational inequality is defined as c(x) such that c_a : R^{|A|}_+ → R_+ for arc a. The multipliers associated with the constraints in the ℱ set should be non-negative because they represent the time it takes to travel from the associated node to the destination w [65]. Therefore, we need to turn the equalities in the ℱ set into inequalities [14, 15]. Respecting the definition of N and d^w, our traffic equilibrium problem in complementarity form is then [14, 15, 21, 65, 75, 165]:³

0 ≤ c(x^w) + N^T y^w ⊥ x^w ≥ 0,  ∀w ∈ W    (2.3.3a)
0 ≤ d^w − N x^w ⊥ y^w ≥ 0,  ∀w ∈ W    (2.3.3b)

We know we can undergo this transformation from the variational inequality to the complementarity form due to the discussion in Chapter 1 about this transformation. We can show that d^w − N x^w = 0 when there is a solution for (2.3.3) and when we assume that the c(x) function is greater than 0 for all x ≥ 0, in a proof adapted from Ban [15]; see Appendix A.1.1. We use (2.3.3) to generate data for the inverse optimization part of the framework, and the data generation process can be found in Section 2.4.1. For the inverse optimization model in Bertsimas et al.
[21], we can then form an optimization model including each data point x̂_j, j = 1, ..., J (with J representing the total number of data points utilized) representing the flow on the network such that:

• There is one OD pair for each instance x̂_j.
• There is the same node-arc incidence matrix N for each x̂_j.

³We keep the row in N that contains the destination, which is different from [15] and [14].

The inverse optimization model for J data points (corresponding to each of the x̂_j flow patterns), parameters θ ∈ Θ with Θ a convex subset of R^Z (Z representing the number of parameters), y^j ∈ R^{|N|}, and ϵ ∈ R^J is:

min_{θ∈Θ, y, ϵ} ||ϵ||²₂    (2.3.4a)
s.t. −N^T y^j ≤ c(x̂_j; θ),  j = 1, ..., J    (2.3.4b)
y^j ≥ 0,  j = 1, ..., J    (2.3.4c)
c(x̂_j; θ)^T x̂_j + (d^j)^T y^j ≤ ϵ_j,  j = 1, ..., J    (2.3.4d)

The derivation of this mathematical program can be found in Appendix A.1.2, where we explain the Bertsimas et al. [21] method in much more detail. We note that the y^j dual variables are important because they correspond, as in the complementarity problem above, to the travel distance to destination j [65]. We solve this mathematical program using the ipopt solver [185], which, for convex problems such as the one above, does solve the problem to global optimality [186]. We set the tol parameter to 1e-12. There can be multiple forms for the vector-valued arc cost function c(x; θ), which the next subsection will cover. Before moving to the next section, it is important to address one of the questions that has been asked of inverse optimization: how is this different from linear regression? Linear regression in its unconstrained form as presented in [84] would not be able to express the constraint relationships in (2.3.4b)-(2.3.4d). In addition, in order to attempt to model our problem with linear regression, we would need target total cost values for origin-destination pairs on the road network, which is something we do not have for our problem.
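To illustrate how a model of the form (2.3.4) behaves, here is a minimal numerical sketch. It is our own construction, not the chapter's implementation (which uses ipopt): a single hypothetical OD pair over two parallel arcs with linear costs c_a(x̂; θ) = θ_a x̂_a + β_a and known free-flow times β. With the destination node's dual fixed to zero, dual feasibility (2.3.4b) reduces to y ≤ θ_a x̂_a + β_a and the duality-gap constraint (2.3.4d) to c(x̂; θ)^T x̂ − d·y ≤ ϵ:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical observed equilibrium flow on two parallel arcs (J = 1).
x_hat = np.array([3.0, 1.0])
beta = np.array([1.0, 5.0])        # known free-flow travel times
d = x_hat.sum()                    # total OD demand

# Decision vector v = [theta_1, theta_2, y, eps], all constrained >= 0.
def objective(v):
    return v[3] ** 2               # ||eps||_2^2, as in (2.3.4a)

def dual_feasibility(v):           # (2.3.4b): theta_a*x_a + beta_a - y >= 0
    return v[:2] * x_hat + beta - v[2]

def duality_gap(v):                # (2.3.4d): eps - (c^T x_hat - d*y) >= 0
    c = v[:2] * x_hat + beta
    return v[3] - (c @ x_hat - d * v[2])

res = minimize(objective, x0=np.ones(4), method="SLSQP",
               bounds=[(0, None)] * 4,
               constraints=[{"type": "ineq", "fun": dual_feasibility},
                            {"type": "ineq", "fun": duality_gap}])
theta = res.x[:2]
costs = theta * x_hat + beta       # at eps ~ 0, both used arcs' costs agree
```

At an exact equilibrium, ϵ can be driven to (near) zero, which forces the costs of the two used arcs to coincide, i.e., the Wardrop condition. Note that several θ values can explain the same observed flows, a multiplicity issue discussed in the chapter.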
We assume that we only have the values of the decision variables. We need the dual problem to establish the right optimization relationships to find our parameter values [21]. Even in linear regression's constrained form (as expressed in these papers: [80, 122, 169]), the constraints are on the parameters themselves, which Θ in (2.3.4a) can already express. Consequently, since even constrained linear regression cannot accommodate the relationships expressed in (2.3.4b)-(2.3.4d), we need a more complicated model to parameterize our cost functions. The equations in (2.3.4b)-(2.3.4d) represent the satisfaction of dual feasibility and strong duality for the linear program produced by the variational inequality created by the traffic equilibrium problem. Indeed, one of the most important aspects of the model (2.3.4) is the set of dual variables y^j, which must satisfy the two sets of inequality relationships (2.3.4b) and (2.3.4d). These dual variables represent the cost of traveling on the network from the various nodes to the destination node. What's more, the inequalities in (2.3.4b)-(2.3.4d) encode the variational inequality conditions seen in (2.3.1) and, thus, the equilibrium relationship needed for the problem.

2.3.1.1 Different Types of Cost Functions

We propose two different formulations for the vector-valued function c(x), which represents the time per vehicle along each arc in the set of arcs A [124, 125]. Note that θ will represent the collection of all parameters for a given function.

• Linear Cost: We assume that c_a(x) = ϕ_a x_a + β_a, with ϕ_a ∈ R_+ and β_a ∈ R_+, such that

c(x) = diag(ϕ_1, ..., ϕ_{|A|}) x + (β_1, ..., β_{|A|})^T    (2.3.5)

This function has been used for representing travel times, such as in [166]. Siri et al. [166] label the β_a as the free-flow travel times, i.e., travel times without any interaction with other travelers [21, 43, 177, 206, 207]. ϕ_a is the factor of additional time from having one more unit of flow on the arc.
In this chapter, we assume that the free-flow travel times are given, and our goal is to estimate ϕa for all a ∈ A (see Section 2.4.5).

• Bureau of Public Roads Function: The Bureau of Public Roads (BPR) function [27, 180] is a common function utilized by transportation researchers when modeling flow along arcs in a network [21, 176, 206, 207]. From [165], the BPR function for arc a is:

ca(xa) = t0a (1 + αa (xa / c′a)^β).     (2.3.6)

Here t0a is the free-flow travel time, c′a is the "practical capacity," which we just take as the normal capacity, and αa and β are parameters which, following [165], are commonly assumed to be 0.15 and 4 respectively, regardless of the arc. In contrast, in this chapter, we will assume that the αa parameter is different for each arc and that it is the quantity we estimate with inverse optimization (see Section 2.4.5). We linearize the BPR function using standard techniques [127, 197].

In our experiments, we compare the protection decisions made under the costs imputed using IO with the protection decisions made when a user has the original parameterization from (2.3.3) and with the protection decisions when a user assumes uniform cost, meaning the average ϕ for the linear cost function and 0.15 for the α in the BPR cost function. Section 2.4.2 will explain this further. We choose these two cost functions because (a) the linear cost function is easy to use, and (b) the BPR function is standard in the literature.

2.3.2 Normative Model: Two-Stage Stochastic Model

For the stochastic network protection model portion of the framework, we implement Fan and Liu's [65] two-stage network protection model with complementarity constraints, with a few changes in the capacity function, the conservation of flow constraints, and the objective function. We adopt much of the notation from [65]; extended definitions for these terms can be found in Appendix A.2.1:

• A: the set of network arcs, with m the number of arcs.
• N: the set of network nodes, with n the number of nodes.
• K: the number of destinations of flow in the network.
• S: the scenario set.
• x^{k,s}_a: the flow on arc a that is destined for the kth destination in scenario s. The vector x^{k,s} ∈ R^m denotes the flow on all arcs. (units = thousands of vehicles)
• f^s_a: the total flow on arc a in scenario s, with f^s the vector containing all of the arcs. (units = thousands of vehicles)
• u_a: the decision variable controlling resources used to protect an arc a against a crisis. (units = proportion of necessary resources needed to fully insure the arc)
• W: the node-link adjacency matrix.
• q^k ∈ R^n: designates the amount of flow originating at each node that is headed to destination k. (units = thousands of vehicles)
• h^s_a(u_a): the capacity of an arc a given first-stage decision u_a under scenario s:

h^s_a(u_a) = cap_a                    if a ∉ Ā
h^s_a(u_a) = cap_a − m^s_a (1 − u_a)  if a ∈ Ā          (2.3.7)

with cap_a representing the capacity of the arc without it being affected by a disaster, m^s_a representing the amount of damage done to arc a in scenario s if not protected, and Ā representing the set of arcs vulnerable to the disaster. Note that m^s_a could be 0 in certain scenarios. (units = thousands of vehicles)
• t_a(f^s): the time per vehicle along arc a [124, 125] as a function of the flows f^s in scenario s. We explore multiple different forms for t_a, described in Section 2.3.1.1.
• λ^{k,s}_i: the minimum travel time from node i to node k in scenario s [65]. (units = travel time)
• d^{k,s}: the vector of extra variables that acts as a buffer for any flow that cannot be properly apportioned. (units = thousands of vehicles)
• p_s: the probability of each scenario s.

Fan and Liu's [65] model, with a modification to the complementarity constraints based on work by [15] and [14], is thus:

min  Σ_{s∈S} p_s Q_s(u, f^s)     (2.3.8a)
s.t.
u ∈ D,     (2.3.8b)

f^s_a = Σ_{k=1}^{K} x^{k,s}_a ≤ h^s_a(u_a),    a ∈ A, s ∈ S,     (2.3.8c)

0 ≤ x^{k,s}_{ij} ⊥ (t_a(f^s) + λ^{k,s}_j − λ^{k,s}_i) ≥ 0,    ∀(i,j) ∈ A, ∀k = 1, ..., K, ∀s ∈ S,     (2.3.8d)

0 ≤ q^k + d^{k,s} − W x^{k,s} ⊥ λ^{k,s} ≥ 0,    ∀k = 1, ..., K, ∀s ∈ S.     (2.3.8e)

The Q_s(u, f^s) function in the objective (2.3.8a) has the following form:

Q_s(u, f^s) = ⟨ψ, u⟩ + γ ⟨f^s, t(f^s)⟩ + 10000 Σ_{k=1}^{K} ||d^{k,s}||₂²

The first term denotes the total cost of protection (with ψ as the dollar amount it costs to protect each arc fully); this term differs from the [65] paper, which instead uses the cost of repair. The second term computes the total travel time for all of the flow on each arc, sums these amounts, and then multiplies the sum by γ, which transforms travel time into financial units [65], keeping this term in the same units as the first. Note that t(f^s) corresponds to the c(x) function from Section 2.3.1. The third term makes it extremely costly for the model to use any of the buffer that the d^{k,s} vectors provide for the conservation of flow constraints (2.3.8e).

Constraint (2.3.8b) represents the budgetary and technological restrictions that [65] define. We specified this further as just a budgetary constraint of the form:

Σ_{a∈A} u_a ≤ I     (2.3.9)

with I representing the number of arcs we can afford to fully protect. However, because the u_a are continuous variables, we can partially protect more than I arcs, since we treat the u_a as proportions. The capacity constraints (2.3.8c) have an s dependence for the h^s_a functions because the m^s_a, ∀a ∈ Ā, are scenario-dependent. The constraints in (2.3.8d) encapsulate the idea that there should be no flow on arc a on its way to destination k in scenario s unless that arc is part of the minimal travel time route to destination k. The complementarity constraints in (2.3.8e) include the conservation of flow constraints that ensure flow begins and ends at the appropriate places in the network.
The d^{k,s} vectors are buffers in case some of this flow does not fulfill the conservation of flow constraints; Fan and Liu [65] define them as variables to ensure a feasible solution. We specify some of the parameters for the model that will not change over the course of the chapter:

• The cap_a value is set to 8 for all arcs a.
• The m^s_a value (amount of damage) is set to 8.
• In the objective function, we set γ = 1 (following [65]) and set the ψ vector to 1 because we do not want cost to be prohibitive.

2.3.2.1 Big M Method for Complementarity Constraints and Progressive Hedging Algorithm for Solving the Stochastic Network Protection Problem

Fan and Liu [65] note in their paper that the stochastic network protection problem is difficult to solve because of (1) the complementarity constraints and (2) the stochastic elements. In order to handle the complementarity constraints, we use the disjunctive constraint/big M method approach [74, 95]. As an example, we take the complementarity condition from (2.3.8d) and produce a series of constraints:

x^{k,s}_{ij} ≥ 0     (2.3.10a)
t_a(f^s) + λ^{k,s}_j − λ^{k,s}_i ≥ 0     (2.3.10b)
x^{k,s}_{ij} ≤ M^{k,s}_{ij} b^{k,s}_{ij}     (2.3.10c)
t_a(f^s) + λ^{k,s}_j − λ^{k,s}_i ≤ M^{k,s}_{ij} (1 − b^{k,s}_{ij})     (2.3.10d)

Here b^{k,s}_{ij} is a binary variable, and M^{k,s}_{ij} is a sufficiently large number; together, (2.3.10c) and (2.3.10d) force at least one of the two complementary terms to be 0. We repeat the same procedure for the complementarity constraints in (2.3.8e). See Appendix A.2.2 for information on calculating the M^{k,s}_{ij} values. We note that this addition of binary variables increases our computational costs because we are now dealing with a mixed-integer program (MIP). To handle the stochasticity of this problem, we follow [65] by employing the progressive hedging (PH) algorithm.
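For a single scalar pair, the disjunction in (2.3.10) can be checked with a short helper. This is an illustrative sketch, where x stands for the flow term and w for the cost term t_a(f^s) + λ^{k,s}_j − λ^{k,s}_i:

```python
def bigM_holds(x, w, M):
    """True iff (x, w) satisfies the big-M system (2.3.10) for some binary b:
    x >= 0, w >= 0, x <= M*b, w <= M*(1 - b).
    Equivalently: both terms nonnegative and at most one of them positive,
    provided M is large enough to bound the positive term."""
    if x < 0 or w < 0:
        return False  # (2.3.10a)/(2.3.10b) violated
    # Try both choices of the binary variable b.
    return any(x <= M * b and w <= M * (1 - b) for b in (0, 1))
```

For example, flow on a minimal-cost arc (x = 3, w = 0) passes, while flow on a non-minimal route (x = 3, w = 2) fails, as does a case where M is chosen too small to bound the flow — which is why the M^{k,s}_{ij} values of Appendix A.2.2 must be valid upper bounds.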
Proposed by Rockafellar and Wets [156], the PH algorithm at its most basic level solves the scenario subproblems created by the random variable(s) involved in the original problem, using a penalty term that encourages the first-stage variables to tend toward the "aggregate" solution [156] computed after each iteration of the algorithm. See the following references for more information about using the algorithm, about setting its parameters, and about the fact that the PH algorithm serves as a heuristic when handling MIP problems: [32, 49, 65, 83, 86, 106, 119, 136, 149, 160, 183, 191]. We use the implementation of the PH algorithm found in the pysp extension [192] of the pyomo package [95, 96] in Python, and we use gurobi [90] for the mixed-integer quadratic programming subproblems arising as part of the PH algorithm.

We note that the binary variables created by the big M method cause the SNPP to become a mixed-integer quadratic program in the case of the linear cost function. In the case of the BPR cost function, we additionally need to integrate a non-convex constraint into the mixed-integer quadratic program. We handle this non-convexity by turning on gurobi's non-convex flag; as a result, we could be obtaining local solutions for the BPR version of the SNPP.

Overall, with regard to the size of our problem, one of the computational bottlenecks is the presence of the complementarity constraints (2.3.8d) and (2.3.8e). They produce binary variables due to the big M method, which means that as the network becomes larger with more arcs, as the number of scenarios increases, and/or as the number of origin-destination pairs K increases (the beginning and ending points of flow on the network), more binary variables are produced and, thus, the computational complexity increases. Thus, the size of the problem depends on the number of arcs, scenarios, and OD pairs.
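The basic PH iteration can be illustrated on a toy problem in which scenario s solves min_x (x − a_s)², so the penalized subproblem has a closed form and the aggregate and multiplier updates are explicit. The targets, probabilities, and ρ below are made-up values, not our SNPP instances:

```python
def progressive_hedging(targets, probs, rho=1.0, iters=60):
    """Toy PH: scenario s solves min_x (x - a_s)^2; PH couples the scenario
    copies x_s to the probability-weighted aggregate xbar via multipliers w."""
    xbar = sum(p * a for p, a in zip(probs, targets))  # initial aggregate
    w = [0.0] * len(targets)                           # PH multipliers
    for _ in range(iters):
        # Subproblem: min_x (x - a_s)^2 + w_s*x + (rho/2)*(x - xbar)^2;
        # its stationarity condition yields the closed form below.
        xs = [(2 * a - wi + rho * xbar) / (2 + rho)
              for a, wi in zip(targets, w)]
        xbar = sum(p * x for p, x in zip(probs, xs))   # aggregate step
        w = [wi + rho * (x - xbar)                     # multiplier step
             for wi, x in zip(w, xs)]
    return xbar, xs

xbar, xs = progressive_hedging([0.0, 10.0], [0.5, 0.5])
```

At convergence the scenario copies agree (consensus), and xbar minimizes Σ_s p_s (x − a_s)², i.e., the probability-weighted mean of the scenario targets; for the convex case this convergence is guaranteed [156], whereas for MIPs such as our SNPP it is only a heuristic.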
For the BPR cost function, there is the added dimension of the number of breakpoints in the linearization of the function, which adds binary variables beyond those from the complementarity constraints.

2.4 Experimental Design

In this section, we define our experimental setup and the metrics by which we will evaluate the experiments. Most of our experimental results demonstrate that inverse optimization enables users to recover protection decisions comparable to the original-cost protection decisions, and that there is a difference between protection decisions made under uniform cost parameters and those made under the original or IO parameterizations.

2.4.1 Data Generation

We generate data (observations of flow x̂j for all j = 1, ..., J), as discussed in Section 2.1, using the forward problem in the form of the complementarity model (2.3.3); we solve (2.3.3) using PATH [54, 70] in GAMS. The set Λ represents the origin-destination pairs utilized for each run of the complementarity model. For this chapter, Λ is the set of all different origin-destination (OD) pairs for each network for the data generation process, one pair for each run of the complementarity model. In more complicated versions, Λ would consist of multiple different OD pairs per run of the complementarity model. Algorithm 1 below illustrates generating the x̂j data for j = 1, ..., J given a set of configurations Λ with |Λ| = J.

Algorithm 1: Generating the Data
Data: The set Λ of configurations
for j = 1:|Λ| do
    Build the traffic equilibrium model (2.3.3) with the jth configuration of OD pair(s)
    Solve the traffic equilibrium model (2.3.3) with PATH in GAMS
    Store the optimal x̂j
end

In part A of the experiments, the generated data x̂j, j = 1, ..., J, is used as input into the inverse optimization model to determine the parameters θ for the cost function.
In part B of the experiments, a subset of the generated data x̂j, j = 1, ..., J, is used as input into the inverse optimization model to determine the parameters θ for the cost function. Namely, for each of the OD pairs, we leave out the data corresponding to that OD pair to obtain parameters θ that correspond to that OD pair. This parallels the training-testing framework used in machine learning [84]. For both experimental setups, we then carry through with the rest of the hybrid framework described at the beginning of Section 2.3 to obtain the protection decisions. See Section 2.4.3 for more details. We note that the data produced by the methods outlined in this section is non-noisy and, thus, represents perfect equilibrium solutions to the traffic equilibrium model (2.3.3). This in part causes the "flow error" defined below to be low for our experiments.

2.4.2 Metrics

In order to evaluate our hybrid framework, we must solve the stochastic network protection problem three times for each set of generated data because we must compare the protection decisions under the original cost parameters, the inverse cost parameters, and the assumption of uniform cost parameters:

• We define the IO information protection decisions, denoted u, as the protection decisions the two-stage model would make based on the cost vector obtained using the inverse optimization algorithm. This cost vector is defined as θ.
• We define the original information protection decisions, denoted û, as the protection decisions that the two-stage model would make if it were directly given the original cost structure that was used in (2.3.3) to generate the data x̂j, j = 1, ..., J. The goal of the framework is for the inverse optimization algorithm to provide a cost estimate that results in the same or comparable protection decisions as under the original cost structure.
We use the Original-IO metric below to evaluate the similarity between the protection decisions: the closer the Original-IO metric is to 0, the better IO is at recovering parameters leading to the original (assumed correct) decisions. The cost vector corresponding to the original cost structure is θ̂.
• We define the uniform information protection decisions, denoted ū, as those decisions that the two-stage model would make if it were given a uniform cost structure for the network. If the original information decisions differ significantly from the uniform information decisions, then this provides evidence that knowing the cost structure of the network is important. The cost vector corresponding to the uniform costs is θ̄.

Our performance metrics capture the difference between the protection decisions made under different costs:

• Original-IO (O-IO): ||û − u||₂
• Uniform-IO (U-IO): ||ū − u||₂
• Uniform-Original (U-O): ||ū − û||₂

One other metric, used in part B of the experimental results, is known as the flow error, in which we take the difference between the flow produced by the original parameters and the flow produced by the inverse optimization parameters and compute the 2-norm of this difference.
Mathematically, this can be expressed as:

Flow Error: ||x̂ − x||₂     (2.4.1)

2.4.3 Procedures for Experiments A and B

In part A of the experimental results, the full framework is as follows:

Algorithm 2: Part A of the Experiments
Data: Network Structure, Type of Cost Function
for i = 1:10 do
    Generate the data x̂j, j = 1, ..., J according to Algorithm 1
    Input the data x̂j, j = 1, ..., J into the inverse optimization model (2.3.4) and obtain θ
    u = SNPP(θ)
    û = SNPP(θ̂)
    ū = SNPP(θ̄)
    Calculate the performance metrics according to Section 2.4.2
end

In part B of the experimental results, the full framework is as follows:

Algorithm 3: Part B of the Experiments
Data: Network Structure, Type of Cost Function, Number of OD pairs
for i = 1:10 do
    Generate the data x̂j, j = 1, ..., J according to Algorithm 1
    for k = 1:number of OD pairs do
        Input the data x̂j, j = 1, ..., J, with the x̂j corresponding to the kth OD pair removed, into the inverse optimization model (2.3.4) and obtain θ
        Calculate the flow error by generating the flow for the kth OD pair under θ and comparing this flow with x̂k
    end
    for k ∈ subset of OD pairs do
        u = SNPP(θ)
        û = SNPP(θ̂)
        ū = SNPP(θ̄)
        Calculate the performance metrics according to Section 2.4.2
    end
end

In part A, there are 10 trials for each network structure and type of cost function; in part B, within each trial, there are separate sub-trials for each OD pair in which (a) the flow error metric is calculated for the kth OD pair and (b) the performance metrics from Section 2.4.2 are calculated for a subset of OD pairs. We take 10% of the OD pairs for computational reasons. The flow error metric corresponds to a type of test error for the IO algorithm because we leave out x̂k when calculating θ for that OD pair and then take the 2-norm between the flow obtained from Algorithm 1 and the flow obtained for the kth OD pair under θ [4, 84].
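The three decision metrics and the flow error (2.4.1) are all 2-norm differences between vectors, e.g. (the three-component decision vectors below are hypothetical, not experimental output):

```python
import numpy as np

def gap2(a, b):
    """2-norm difference used for O-IO, U-IO, U-O, and the flow error (2.4.1)."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

u_hat = np.array([1.0, 0.0, 0.5])   # original-information decisions
u_io  = np.array([1.0, 0.0, 0.5])   # IO-information decisions
u_bar = np.array([0.5, 0.5, 0.5])   # uniform-information decisions

o_io = gap2(u_hat, u_io)   # 0.0 here: IO reproduces the original decisions
u_o  = gap2(u_bar, u_hat)  # positive: uniform cost changes the decisions
```

In the ideal outcome for the framework, O-IO is near 0 while U-IO and U-O are clearly positive, mirroring the two hypotheses evaluated in Section 2.5.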
2.4.4 Networks and Scenarios

We consider two networks on which to test our hybrid framework: a 4x4 grid in Figure 2.1a and the Nguyen & Dupuis network [142] in Figure 2.1b, both of which we make bidirectional.

Figure 2.1: Illustrative Road Networks Used for Experiments. (a) 4x4 Directed Grid Network, 16 Nodes & 48 Arcs; (b) Nguyen & Dupuis Network, 13 Nodes & 38 Arcs.

Experiment I: In the first experiment, the linear cost function along with the 4x4 grid is utilized. The linear cost function is ϕa xa + βa for each arc a. The scenarios are chosen such that every other pair of arcs is vulnerable to complete destruction. This can be seen in Figure 2.2 through the placement of the triangles with lightning bolts, indicating the arc pairs at risk. Each arc pair is given a 1/12 chance of failing.

Figure 2.2: Scenarios for Experiments I & II

Furthermore, for the experiments involving the linear cost function, only the ϕa parameters are estimated. βa, the free-flow travel time for arc a, is assumed to be known in the majority of the papers cited in the literature review on estimating cost functions using inverse optimization [21, 43, 176, 177, 206, 207]. Consequently, for the θ̂, θ̄, and θ stochastic protection models involving linear cost, the βa terms are the same across all of them. For the θ̂ protection model, the ϕa terms are the original cost values generated by the unifrnd MATLAB function, as further discussed in Section 2.4.5. For the θ protection model, the ϕa terms come from the inverse optimization model (2.3.4). Finally, for the θ̄ protection model, ϕa = 6 for all a, as indicated in Table 2.1 in Section 2.4.5.

Experiment II: In Experiment II, the inverse optimization algorithm computes the αa parameters for each arc a for the BPR cost function. Wong and Wong [199] support having different αa parameters across the network because they vary the α based on the structure of the network involved. Lu et al.
[126] create their BPR function such that the α parameter value differs for each type of vehicle in their simulation, again showing that the α parameter can differ from the standard uniform 0.15 noted in Section 2.3.1.1. The IO model is given the randomly chosen t0a parameters and the capacity levels (all set to 8), and it is asked to estimate the αa values. The uniform parameter value chosen for the BPR experiments is 0.15 because that is the traditionally chosen parameter value [165]. The scenario setup is the same as in Experiment I (see Figure 2.2).

Experiment III: Experiment III uses the linear cost function for the Nguyen & Dupuis network. Figure 2.3 illustrates the arcs that have a chance of failing, which are again chosen such that every other pair of arcs is vulnerable to complete destruction. There are 9 pairs of arcs indicated, which means each pair has a 1/9 chance of completely failing.

Figure 2.3: Scenarios for Experiments III & IV

The notes about the linear cost function discussed in Experiment I hold for this experiment; only the network has changed, along with the qk for part A of the SNPP solves. See the beginning of Section 2.4.5 for the note about qk for part A of the experiments.

Experiment IV: Experiment IV employs the BPR cost function along with the Nguyen & Dupuis network. The scenario pattern is the same as in Experiment III (see Figure 2.3), and the description of Experiment II for the BPR function holds for this experiment as well.

Some more of the details for Experiments I-IV can be found in Table 2.1, with the following explanation.

• Network: The network used.
• Cost Function: The transportation cost function used, as defined in Section 2.3.1.1.
• Parameters: Distributions from which the original cost function coefficients are drawn. We choose the ends of the uniform distributions for the ϕa, βa, and t0a parameters to be 2 and 10 so as to produce a fairly varied range of numbers but not too large of a range.
We choose the range for the uniform distribution for αa to be between 0.1 and 0.2 because typically αa is set to 0.15 [165], so we wanted to generate parameter values that were not too large or too different from this value. The capacity c′a values are set to 8 because we are sending 8 units of flow between nodes in the networks.
• # Scenarios: References the number of arc pairs that are vulnerable to destruction. We obtain these numbers because we choose every other pair of arcs as candidates for destruction, and each of these pairs is assigned an equal chance of failing. Therefore, we arrive at the number of scenarios listed for each experiment.
• ρ Used: Indicates a parameter value in the PH Algorithm. We obtain this value after many runs of the PH Algorithm with various values of ρ.
• Number of Cores: Refers to the number of computer cores utilized for the experiments.

Experiment #     I               II               III             IV
Network          4x4 Grid        4x4 Grid         N & D           N & D
Cost Function    Linear          BPR              Linear          BPR
Parameters       ϕa ∼ U[2,10]    αa ∼ U[0.1,0.2]  ϕa ∼ U[2,10]    αa ∼ U[0.1,0.2]
                 βa ∼ U[2,10]    t0a ∼ U[2,10]    βa ∼ U[2,10]    t0a ∼ U[2,10]
                                 c′a = 8                          c′a = 8
# Scenarios      12              12               9               9
ρ Used           5               5                5               5
Number of Cores  8               8                8               8

Table 2.1: Experiment Descriptions

To summarize all of the experiments, we have the following experiment matrix. To distinguish between experiments A and B, we indicate the number of origin-destination pairs in the SNPP solves via k. For part A experiments, there are two origin-destination pairs for the SNPP solves (k = 2) and, for part B experiments, there is one origin-destination pair for each of the SNPP solves (k = 1).
      A                                          B
I     Experiment IA: Linear, 4x4, & k = 2        Experiment IB: Linear, 4x4, & k = 1
II    Experiment IIA: BPR, 4x4, & k = 2          Experiment IIB: BPR, 4x4, & k = 1
III   Experiment IIIA: Linear, N & D, & k = 2    Experiment IIIB: Linear, N & D, & k = 1
IV    Experiment IVA: BPR, N & D, & k = 2        Experiment IVB: BPR, N & D, & k = 1

Table 2.2: Experiment Matrix

2.4.5 Details of the Experiments

In this section, we elaborate upon some of the details of the experiments. For each trial of each experiment, Algorithm 1 is used to generate x̂j for j = 1, ..., J. Next, for each trial of each experiment, the hybrid framework is used to estimate a set of parameters for the current cost function and to find the protection decisions under that parameterization given a set budget. In the experiments, three components are varied: the type of graph (4x4 grid vs. Nguyen & Dupuis (N & D)), the type of cost function (linear vs. BPR functions), and the origin-destination specification (see Table 2.2 as reference). For all of the experiments, the following remain the same:

• The MATLAB built-in function unifrnd is used to create the random original costs for the data generation part of each trial of each experiment.
• There are 10 trials for each experiment, each with a different original (and random) cost parameterization.
• For part A of the experiments, the k for qk is equal to 2. For the 4x4 grid, eight units of flow go from node 0 to node 15 and from node 15 to node 0; for the Nguyen & Dupuis network, eight units of flow go from node 0 to node 2 and from node 2 to node 0. See Figure 2.1 for the node references. For part B of the experiments, the k for each run of the SNPP is equal to 1, and the end points correspond to the chosen kth OD pair in the subset of OD pairs chosen in Algorithm 3. These subsets are chosen randomly.
• The set of scenarios corresponds to each pair of arcs indicated in Figures 2.2 and 2.3. Each pair of arcs has an equal chance of failing.
• The budget for constraint (2.3.9) is set to 6. It could be modified in future work.
• For each run of the stochastic network protection problem (SNPP) in part A, ϵ = 0.01 or ϵ = 0.001 for the g(k) PH error metric [191]; we utilize the default error metric outlined in the pysp documentation in [94]. The maximum number of iterations is 300 for part A. Therefore, the progressive hedging algorithm stops if it reaches the tolerance or the maximum number of iterations, whichever occurs first. Two runs of each experiment are done, one under ϵ = 0.01 and one under ϵ = 0.001. For part B, the adjustments are that the stochastic solves are undertaken with ϵ = 0.001 and an iteration maximum of 150. The parameters were changed between the two parts because we theorized that part B would be more computationally intensive.

2.5 Experimental Results

2.5.1 Experiments IA-IVA

The results of the experiments are evaluated with respect to two hypotheses regarding confidence intervals [188] of the means and with respect to a direct comparison of the means of the O-IO, U-IO, and U-O metrics defined in Section 2.4.2.⁴ See Appendix A.3.2 for information on the medians and minimums/maximums of the data, and see Appendix A.3.3 for the run times of the experiments.

The first hypothesis states that the O-IO confidence intervals for the experiments will include 0 because the inverse optimization framework can recover protection decisions comparable to the θ̂ protection decisions. Looking at Tables 2.3-2.4, all of the confidence intervals include 0. Therefore, there is evidence in favor of the hypothesis. The second hypothesis states that the U-IO and U-O metric confidence intervals will not include 0 because having either cost parameters learned from IO or the original parameters makes a difference in protection decisions when compared to the case of uniform cost parameters.
Tables 2.3-2.4 indicate that Experiments IA-IIIA present evidence in favor of the hypothesis, but Experiment IVA falls short, since the confidence intervals for U-IO and U-O both include 0. Experiments IA-IIIA demonstrate that using θ̄ (uniform cost) leads to different protection decisions than the θ̂ or θ costs. Comparing the means as a percentage of the total budget in Tables 2.3 and 2.4, we see that the O-IO metric as a percentage of the budget is small, while the U-IO and U-O metrics as a percentage of the budget are many times greater. The small values of the O-IO metric as a percentage of the budget indicate that IO can be used to recover parameters for the SNPP model, while the comparatively large values of the U-IO and U-O metrics indicate that the uniform protection decisions are quite different from the IO and original protection decisions, confirming the value of IO in recovering the network parameters. The boxplots in Figures 2.4 and 2.5 (along with the boxplots for Experiment IIIA in Appendix A.3.2) tell the same story.

⁴See Appendix A.3.1 for the flow error metrics for the IO α values because, as can be seen from Figure 2.5, the IO α values are different from the original α values.
Table 2.3: Means, Means as Percentage of the I = 6 Budget, and 99% Confidence Intervals (CI) for Experiments IA-IVA, ϵ = 0.01

        Experiment IA            Experiment IIA           Experiment IIIA          Experiment IVA
O-IO    0.0001 (0.00%)           0.0015 (0.03%)           0.0374 (0.62%)           0.0041 (0.07%)
        CI (-0.0001, 0.0002)     CI (-0.0007, 0.0038)     CI (-0.0144, 0.0892)     CI (-0.0017, 0.0099)
U-IO    0.3449 (5.75%)           0.0564 (0.94%)           0.2617 (4.36%)           0.0261 (0.43%)
        CI (0.2795, 0.4103)      CI (0.0115, 0.1013)      CI (0.2019, 0.3215)      CI (-0.028, 0.0801)
U-O     0.3449 (5.75%)           0.0559 (0.93%)           0.2573 (4.29%)           0.0264 (0.44%)
        CI (0.2795, 0.4103)      CI (0.0108, 0.1009)      CI (0.2011, 0.3135)      CI (-0.0285, 0.0812)

Table 2.4: Means, Means as Percentage of the I = 6 Budget, and 99% Confidence Intervals (CI) for Experiments IA-IVA, ϵ = 0.001

        Experiment IA            Experiment IIA           Experiment IIIA          Experiment IVA
O-IO    0.0001 (0.00%)           0.0013 (0.02%)           0.0251 (0.42%)           0.0117 (0.19%)
        CI (-0.0001, 0.0002)     CI (-0.0002, 0.0027)     CI (-0.017, 0.0673)      CI (-0.0168, 0.0402)
U-IO    0.3304 (5.51%)           0.0638 (1.06%)           0.2521 (4.2%)            0.0197 (0.33%)
        CI (0.2545, 0.4063)      CI (0.01, 0.1176)        CI (0.1873, 0.3168)      CI (-0.016, 0.0554)
U-O     0.3304 (5.51%)           0.0634 (1.06%)           0.2504 (4.17%)           0.0257 (0.43%)
        CI (0.2545, 0.4063)      CI (0.0095, 0.1173)      CI (0.1807, 0.3202)      CI (-0.0158, 0.0673)

Figure 2.4: Experiment IA Results: 4x4 Grid Network with Linear Cost, (a) ϵ = 0.01 and (b) ϵ = 0.001. The parameter differences refer to the ϕ differences.

Figure 2.5: Experiment IIA Results: 4x4 Grid with BPR Function, (a) ϵ = 0.01 and (b) ϵ = 0.001. Parameter differences here refer to the α differences.

2.5.2 Experiments IB-IVB

From Algorithm 3, we obtain results for Experiments IB-IVB. We calculate the metrics described in Section 2.4.2 for 10% of the OD pairs, but we calculate the flow error between the original and inverse optimization parameterizations for all of the origin-destination pairs. There are 240 OD pairs for the grid network and 156 OD pairs for the N & D network.
In Figures 2.6-2.9, we plot histograms of the decision difference metrics between u, û, and ū outlined in the first part of Section 2.4.2 for Experiments IB-IVB. We see that, for the linear function on both networks, there is a clear difference between the IO-Original metric and the IO-Uniform & Original-Uniform metrics, with the IO-Uniform & Original-Uniform histograms showing that there is a large difference between the IO/original and uniform protection decisions. For the BPR grid Experiment IIB, there are still some OD pairs that result in protection decisions that differ between the uniform parameterization and the IO/original parameterizations. For the BPR N & D network Experiment IVB, there are essentially no differences between the three histograms; thus, this experiment does not showcase the value of inverse optimization for making protection decisions. We do note that there is more variability in the hidden parameters for the linear cost function than for the BPR cost function, which certainly contributed to the differing IO-Uniform and Original-Uniform results between the two functions on the two graphs.

Finally, in Figures 2.10-2.11, we have the flow errors for all the OD pairs for each of the four experiments. We see that there is low flow error for all of the experiments except Experiment IVB (BPR on N & D). Therefore, we note that, as in Experiment IVA, Experiment IVB does not perform in the way we were expecting, which may mean that not all cost function and network pairings are suitable for inverse optimization. However, most of the experiments (Experiments IB-IIIB) do support the idea that inverse optimization recovers parameterizations that lead to flow patterns similar to those of the original parameterizations. We note that this low flow error is in part due to the way in which we designed the train-test experiments, leaving only one observation out in each fold.
In these results, we demonstrate that, in most of the experiments, (a) inverse optimization can recover cost parameters that produce the same flow values as the original parameters, (b) the protection decisions are very similar between the original and inverse optimization costs, and (c) for at least a portion of the origin-destination pairs, there were differences between the uniform cost protection decisions and the original & inverse optimization cost protection decisions.

Figure 2.6: Experiment IB, All OD Pairs
Figure 2.7: Experiment IIB, All OD Pairs
Figure 2.8: Experiment IIIB, All OD Pairs
Figure 2.9: Experiment IVB, All OD Pairs
Figure 2.10: Flow Error for Linear and BPR on the 4x4 Directed Grid Network. Note that a few of the BPR flow error values fell outside of the 0-0.2 flow error range.
Figure 2.11: Flow Error for Linear and BPR on the N & D Network. Note that some of the BPR flow error values fell outside of the 0-0.2 flow error range.

2.6 Conclusions & Future Work

In this chapter, we have demonstrated that inverse optimization can be used as a tool to make better protection decisions in multi-stage stochastic programs for disaster relief. Using two different networks and two different cost functions, we demonstrate that IO can be used to recover network parameters that produce protection decisions similar to those of the original parameters that were used to generate the data. For most of the experiments, we also demonstrate that the protection decisions made with either the cost parameters learned from IO or the original cost parameters differ from the protection decisions that would have been made under the assumption of uniform cost. These results suggest that inverse optimization can be used as a data analysis approach in a DSS and as a way to estimate cost parameters in multi-stage stochastic programs for disaster management.
Indeed, inverse optimization is valuable precisely because it can estimate the hidden parameters in the functions that drive optimization-based systems. Being able to propose an optimization or equilibrium model of a system and then to find the parameters for the functions that drive those models are important parameter estimation capabilities. Inverse optimization is a parameter estimation tool with these capabilities and, as opposed to a method such as deep learning, we can actually understand the method in terms of optimality conditions and do not lose importan