ABSTRACT
Title of dissertation: DATA-DRIVEN OPTIMIZATION AND
STATISTICAL MODELING TO IMPROVE
DECISION MAKING IN LOGISTICS
Debdatta Sinha Roy
Doctor of Philosophy, 2019
Dissertation directed by: Professor Bruce Golden
Department of Decision, Operations
& Information Technologies
Robert H. Smith School of Business
In this dissertation, we develop data-driven optimization and statistical mod-
eling techniques to produce practically applicable and implementable solutions to
real-world logistics problems.
First, we address a significant and practical problem encountered by utility
companies. These companies collect usage data from meters on a regular basis.
Each meter has a signal transmitter that is automatically read by a receiver within
a specified distance using radio-frequency identification (RFID) technology. The
RFID signals are discontinuous, and each meter differs with respect to the spec-
ified distance. These factors could lead to missed reads. We use data analytics,
optimization, and Bayesian statistics to address the uncertainty.
Second, we focus on an important problem experienced by delivery and service
companies. These companies send out vehicles to deliver customer products and
provide services. For the capacitated vehicle routing problem, we show that reducing
route-length variability while generating the routes is an important consideration
in minimizing a company's total operating and delivery costs under random
traffic conditions.
Third, we address a real-time decision-making problem experienced in practice.
For example, routing companies participating in competitive bidding might need to
respond to a large number of requests regarding route costs in a very short amount
of time. Also, during post-disaster aerial surveillance planning or when using drones to
deliver emergency medical supplies, a route-length estimate is needed quickly to
assess whether the time required to cover a region of interest would exceed the drone's
battery life. For the close-enough traveling salesman problem, we estimate the
route length using information about the instances.
Fourth, we address a practical problem encountered by local governments.
These organizations carry out road inspections to decide which street segments to
repair by recording videos using a camera mounted on a vehicle. The vehicle taking
the videos needs to proceed straight or take a left turn to cover an intersection fully.
Right turns and U-turns do not capture an intersection fully. We introduce the
intersection inspection rural postman problem, a new variant of the rural postman
problem involving turns. We develop two integer programming formulations and
three heuristics to generate least-cost vehicle routes.
DATA-DRIVEN OPTIMIZATION AND
STATISTICAL MODELING TO IMPROVE
DECISION MAKING IN LOGISTICS
by
Debdatta Sinha Roy
Dissertation submitted to the Faculty of the Graduate School of the
University of Maryland, College Park in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
2019
Advisory Committee:
Professor Bruce Golden, Chair
Professor Edward Wasil
Professor Michael Ball
Professor Tunay Tunca
Professor Paul Schonfeld (Dean's Representative)
© Copyright by
Debdatta Sinha Roy
2019
Dedication
to mom and dad...
Acknowledgments
First and foremost, I would like to express my deepest gratitude to my advisor,
Professor Bruce Golden. To say the least, this journey of five years through many
successes and failures would not have been possible without his constant motivation
and support. I felt a sense of freedom while working under his guidance because
he allowed me to freely explore and find solutions to research problems at my own
pace. I always tried to make the most of this opportunity to increase my knowledge
about the field and to approach a problem from my perspective because I knew that
he would always give me a second chance if I failed. I believe this sense of freedom
to think brings out the best in a student. He always pushed me for that extra bit
of thinking on a research problem that I would not have done on my own. More
generally, I am a fan of his sense of humor and his perspective on life. He is a very
caring advisor and I have seen him care for the well-being of his students even long
after their graduation. It has been an honor to work with him and to become a
part of his great academic lineage. I will surely miss the very late night (extremely
late for "normal" people) discussions with him over the phone and the Saturday
"afternoon" (depends on the perspective) meetings at his home.
I also express my heartfelt gratitude to Professor Edward Wasil for practically
being my co-advisor throughout my Ph.D. journey. He spent countless hours
helping me improve my academic writing and grow as a scholar. I am an
admirer of his straightforward personality and his useful and practical advice on
research problems and life. He was extremely patient and helped me channel my
thoughts. My journey would have been very different and much tougher without his
presence and involvement.
I would like to thank Professor Michael Ball and Professor Tunay Tunca for
giving me feedback on my research progress from time to time and helping me in my
academic journey in various ways. It is also an honor to have them on my dissertation
committee. I would also like to thank Professor Michael Fu and Professor Frank
Alt for always being there for me and helping me with all kinds of resources and
suggestions. Moreover, I am extremely thankful to Professor Paul Schonfeld for
being an integral part of my dissertation committee.
I am grateful to all my professors at IISER Mohali and ISI Delhi for grooming
me well to take up a challenging Ph.D. journey.
Life in 3330 Van Munching Hall would be very different and far more challenging
without Justina Blanco. She is an excellent multi-tasker and keeps the ball rolling in
the Smith School Ph.D. office. She forms a personal bond with each student, and I
believe that she knows the answers to every question a doctoral student might have.
My Ph.D. experience would not have been as enjoyable and fulfilling
without Janet Cavanagh, who has been like a mother to me. She was always there
to help me whenever I needed it. She made me feel at home with her care and by
inviting me every year to her place with her family during Thanksgiving celebrations.
I am very thankful to my co-authors, Dr. Christof Defryn and Adriano Masone,
for working with me on some exciting research problems and for making my Ph.D.
journey significantly more productive.
I would like to thank Dr. Rui Zhang, Dr. Xingyin Wang, Dr. Oliver Lum, and
all the other students who were senior to me when I started my Ph.D. journey for their
invaluable advice on coping with this challenging journey full of high expectations. I am very grateful
to have Dr. Stefan Poikonen and Dr. Cheng Jie as friends for their many hours of
discussions on research problems, homework assignments, and ways to understand
and get used to American life.
I believe that after five years of close friendship with Dr. Aishwarya Deep
Shukla and Dr. Gokul Iyer, I have a broader perspective about life and people. A
special thanks to both of these brilliant people, in their own domains, for passing
on to me their knowledge and experiences. I will always cherish our discussions
on science, technology, business, and academic writing.
All work and no play makes one a dull boy. I am very thankful to Anirudh
Singh Chauhan, Kunal Dey, Dr. Siddharth Sharma, and Mohit Gupta for their
enjoyable company, which helped me cool off during times of stress.
A very special thanks to my best friend, Anjali (well, Dr. Gupta now), for
always supporting me, guiding me to choose the right path, and helping me reach
where I am today. My accomplishments are negligible with her out of the equation.
Last but not least, I am deeply indebted to my parents for all their sacrifices to
help me excel. They have always provided unconditional love, trust, encouragement,
and support. They have inspired me to keep performing better than yesterday and
continuously motivated me to achieve my goal. They have always tried to keep me
protected from all sorts of family responsibilities to help me focus on my career. I
would not be able to stand here without their strong shoulders and the rock-solid
platform they have provided.
Table of Contents
Dedication ii
Acknowledgments iii
Table of Contents vi
List of Tables ix
List of Figures xiii
1 Introduction 1
2 Data-Driven Optimization and Statistical Modeling to Improve Meter Read-
ing for Utility Companies 7
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.1 Background and Literature Review . . . . . . . . . . . . . . . 7
2.1.2 Research Goal and Contributions . . . . . . . . . . . . . . . . 12
2.2 Description of the Data Set . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Initial Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Integer Programming Formulation . . . . . . . . . . . . . . . . . . . . 21
2.4.1 Stage 1 IP Formulation . . . . . . . . . . . . . . . . . . . . . . 22
2.4.2 Stage 2 IP Formulation . . . . . . . . . . . . . . . . . . . . . . 24
2.5 Jensen's Inequality for the Stage 1 IP . . . . . . . . . . . . . . . . . . 26
2.6 Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.7 Regression Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8 Bayesian Updating . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.8.1 Logit Model and Probit Model . . . . . . . . . . . . . . . . . . 41
2.8.2 Metropolis-Hastings Random Walk Algorithm for the Logit
Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.8.3 Gibbs Sampling Algorithm for the Probit Model . . . . . . . . 43
2.8.4 Hierarchical Probit Model . . . . . . . . . . . . . . . . . . . . 43
2.8.5 Gibbs Sampling Algorithm for the Hierarchical Probit Model . 45
2.9 Bayesian Updating Results . . . . . . . . . . . . . . . . . . . . . . . . 46
2.9.1 Logit Model and Probit Model Results . . . . . . . . . . . . . 46
2.9.2 Hierarchical Probit Model Results . . . . . . . . . . . . . . . . 50
2.10 Description of the CNG Data Set . . . . . . . . . . . . . . . . . . . . 52
2.11 Bayesian Updating Results for the CNG Data Set . . . . . . . . . . . 54
2.11.1 Logit Model and Probit Model Results . . . . . . . . . . . . . 54
2.11.2 Hierarchical Probit Model Results . . . . . . . . . . . . . . . . 57
2.12 Discussion of Different Formulations and Computational Experiments
of the Stage 1 IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.12.1 Alternative Formulations . . . . . . . . . . . . . . . . . . . . . 59
2.12.1.1 Coefficient Round-Down Inequalities . . . . . . . . . 60
2.12.1.2 Increasing Coefficient Extended Cover Inequalities . 60
2.12.1.3 Decreasing Coefficient Extended Cover Inequalities . 61
2.12.1.4 Middle Coefficient Extended Cover Inequalities . . . 61
2.12.1.5 Extreme Coefficient Extended Cover Inequalities . . 62
2.12.1.6 All Inequalities . . . . . . . . . . . . . . . . . . . . . 62
2.12.2 Computational Results . . . . . . . . . . . . . . . . . . . . . . 63
2.12.3 Observations from the Computational Experiments . . . . . . 74
2.12.4 Other Insights from the Computational Experiments . . . . . 75
2.13 Heuristics for the Stage 2 IP . . . . . . . . . . . . . . . . . . . . . . . 80
2.13.1 Route Generator . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.13.2 Route Trimmer . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.14 Simulation Experiments . . . . . . . . . . . . . . . . . . . . . . . . . 86
2.14.1 Actual Reading Probabilities . . . . . . . . . . . . . . . . . . . 87
2.14.2 Simulation Model Overview . . . . . . . . . . . . . . . . . . . 88
2.14.3 Generating the Network . . . . . . . . . . . . . . . . . . . . . 90
2.14.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . 91
2.15 Conclusions and Future Directions . . . . . . . . . . . . . . . . . . . 99
3 Data-Driven Analysis of the Variability of Routes in the Capacitated Vehicle
Routing Problem 101
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.2 Capacitated Vehicle Routing Problem Instances . . . . . . . . . . . . 103
3.3 Importance of Standard Deviation of Routes . . . . . . . . . . . . . . 105
3.4 Effect of Reducing Standard Deviation on Cost . . . . . . . . . . . . 111
3.5 Contribution of Standard Deviation to Cost . . . . . . . . . . . . . . 120
3.6 Conclusions and Future Directions . . . . . . . . . . . . . . . . . . . 122
4 Data-Driven Estimation of the Route Length for the Close-Enough Traveling
Salesman Problem 124
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.2 The Regression Model . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.3 Regression Data and Model Fit Measures . . . . . . . . . . . . . . . . 129
4.4 Regression Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.4.1 Results on all 842 Instances . . . . . . . . . . . . . . . . . . . 131
4.4.2 Results on the Second Group of 62 Instances . . . . . . . . . . 135
4.4.3 Results on the First Group of 780 Instances . . . . . . . . . . 141
4.4.4 Cross-validation for the First Group of 780 Instances . . . . . 143
4.4.5 Model Selection for the First Group of 780 Instances . . . . . 152
4.5 Conclusions and Future Directions . . . . . . . . . . . . . . . . . . . 162
5 Intersection Inspection Rural Postman Problem on a Mixed Graph 164
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.2 Problem Formulations on a Mixed Graph . . . . . . . . . . . . . . . . 170
5.2.1 IIRPP Formulation using Node Transformations . . . . . . . . 172
5.2.2 IIRPP Formulation using Path Transformations . . . . . . . . 176
5.2.3 Heuristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
5.3 Computational Experiments . . . . . . . . . . . . . . . . . . . . . . . 182
5.3.1 Test Instances . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
5.3.2 Computational Results . . . . . . . . . . . . . . . . . . . . . . 184
5.3.3 Summary of Results . . . . . . . . . . . . . . . . . . . . . . . 192
5.4 Conclusions and Future Directions . . . . . . . . . . . . . . . . . . . 195
6 Concluding Remarks 214
Bibliography 218
List of Tables
2.1 Results based on the analysis from Step 1 to Step 5. . . . . . . . . . 18
2.2 Results based on the analysis from Step 9. . . . . . . . . . . . . . . . 21
2.3 Summary statistics for the dependent and the independent variables. 32
2.4 Correlation between the dependent and the independent variables. . 33
2.5 Logistic regression results. . . . . . . . . . . . . . . . . . . . . . . . . 34
2.6 Probit regression results. . . . . . . . . . . . . . . . . . . . . . . . . 36
2.7 Logit model results. . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8 Probit model results. . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.9 Hierarchical probit model results for the higher level parameter ma-
trix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.10 Summary of the CNG data set. . . . . . . . . . . . . . . . . . . . . . 52
2.11 Logit model results for the CNG data set. . . . . . . . . . . . . . . . 54
2.12 Probit model results for the CNG data set. . . . . . . . . . . . . . . 56
2.13 Hierarchical probit model results for the higher level parameter ma-
trix for the CNG data set. . . . . . . . . . . . . . . . . . . . . . . . 58
2.14 Results for 20 meters and 10 street segments. . . . . . . . . . . . . . 65
2.15 Results for 10 meters and 20 street segments. . . . . . . . . . . . . . 66
2.16 Results for 100 meters and 20 street segments. . . . . . . . . . . . . 67
2.17 Results for 20 meters and 100 street segments. . . . . . . . . . . . . 68
2.18 Results for 200 meters and 100 street segments. . . . . . . . . . . . . 69
2.19 Results for 100 meters and 200 street segments. . . . . . . . . . . . . 70
2.20 Results for 1000 meters and 200 street segments. . . . . . . . . . . . 71
2.21 Results for 200 meters and 1000 street segments. . . . . . . . . . . . 72
2.22 Results for 2000 meters and 1000 street segments. . . . . . . . . . . 73
2.23 Results for 1000 meters and 2000 street segments. . . . . . . . . . . 74
2.24 Comparison of the linear Stage 1 IP performance for the three differ-
ent cost structures. . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
2.25 Comparison of the linear Stage 1 IP objective value for specified like-
lihood values of 0.95 and 0.75 for all meters. . . . . . . . . . . . . . 78
2.26 Local search operators in the variable neighborhood descent meta-
heuristic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
2.27 Algorithm for the remove and repair procedure of the route trimmer. 83
2.28 Average comparison of the total time to read all the meters. . . . . . 96
3.1 Summary statistics of route times (in hours) for routes generated by
two third-party software programs on an actual street network to
serve customers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
3.2 Eleven CVRP instances (Christofides and Eilon 1969a). . . . . . . . 103
3.3 Breakdown of 1000 CVRP solutions for each of the 11 instances in
terms of the number of vehicles required to serve the customers. . . 106
3.4 Parameter values that are used to calculate the total operating and
delivery costs for a company to serve its customers (Levy 2018). . . 107
3.5 Linear regression results for three models. . . . . . . . . . . . . . . . 109
3.6 Average total cost under random traffic conditions for Scenario X. . 117
3.7 Average total cost under random traffic conditions for Scenario Y. . 118
3.8 Average total cost under random traffic conditions for Scenario Z. . . 118
3.9 Best average total cost under random traffic conditions across all
three scenarios and percent savings compared to Group A. . . . . . . 119
3.10 Number of solutions from the respective buckets for the 20 best total
cost solutions under random traffic conditions. . . . . . . . . . . . . 121
4.1 Definitions of the independent variables for the linear regression model.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.2 Regression results for the 842 instances with and without outliers. . 132
4.3 Regression results for the second group of 62 instances with and with-
out outliers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.4 Regression results for the first group of 780 instances with outliers. . 140
4.5 Regression results on each training set of 520 instances from the first
group of 780 instances. . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.6 Best subset models based on R² for the first group of 780 instances. 152
4.7 Regression results on the best subset models based on R² for the first
group of 780 instances. . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.1 Comparison of the transformed graphs G1 and G2 with the original
graph G on 25 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . 197
5.2 Comparison of the transformed graphs G1 and G2 with the original
graph G on 36 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . 198
5.3 Comparison of the transformed graphs G1 and G2 with the original
graph G on 49 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . 199
5.4 Comparison of the transformed graphs G1 and G2 with the original
graph G on 64 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . 200
5.5 Comparison of the percentage optimality gap between the best feasi-
ble solution and the best lower bound for the IIRPP formulations on
25 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
5.6 Comparison of the percentage optimality gap between the best feasi-
ble solution and the best lower bound for the IIRPP formulations on
36 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
5.7 Comparison of the percentage optimality gap between the best feasi-
ble solution and the best lower bound for the IIRPP formulations on
49 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
5.8 Comparison of the percentage optimality gap between the best feasi-
ble solution and the best lower bound for the IIRPP formulations on
64 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
5.9 Comparison of the running time for the RPP and IIRPP formulations
on 25 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
5.10 Comparison of the running time for the RPP and IIRPP formulations
on 36 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
5.11 Comparison of the running time for the RPP and IIRPP formulations
on 49 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
5.12 Comparison of the running time for the RPP and IIRPP formulations
on 64 nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
5.13 Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 25
nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
5.14 Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 36
nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
5.15 Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 49
nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
5.16 Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 64
nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
5.17 Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 25 nodes. . . . . . . . . . . . . . . . . . . . . 207
5.18 Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 36 nodes. . . . . . . . . . . . . . . . . . . . . 208
5.19 Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 49 nodes. . . . . . . . . . . . . . . . . . . . . 209
5.20 Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 64 nodes. . . . . . . . . . . . . . . . . . . . . 210
5.21 Summary of the comparison of the transformed graphs G1 and G2
with the original graph G. . . . . . . . . . . . . . . . . . . . . . . . . 211
5.22 Summary of the comparison of the percentage optimality gap between
the best feasible solution and the best lower bound for the IIRPP
formulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
5.23 Summary of the comparison of the running time for the RPP and
IIRPP formulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
5.24 Summary of the comparison of the percentage gap between the RPP
optimal solution and the best feasible solutions from the IIRPP for-
mulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
5.25 Summary of the comparison of the RPP based and IIRPP based
heuristics with the IIRPP formulations. . . . . . . . . . . . . . . . . 213
List of Figures
2.1 (Color online) A view of the street layer, the service location layer,
and the reading events layer. The red lines represent the street seg-
ments. The green dots represent the route traversed by the meter
reading vehicle. The blue dots and the yellow dots represent meters
(customers) in the service location layer that are read and that are
missed, respectively, after the vehicle has traversed the route. If mul-
tiple meters have the same geographic location, then all those meters
are represented by a single dot, although they have distinct account
identifiers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 (Color online) A magnified view of the reading events layer. . . . . . 15
2.3 Minimum time interval between reads. . . . . . . . . . . . . . . . . . 18
2.4 Maximum read distance. . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5 Histograms of fitted values from the logistic regressions. . . . . . . . 35
2.6 Histograms of fitted values from the probit regressions. . . . . . . . . 37
2.7 Density plots from the MH random walk algorithm for the logit
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.8 Trace plots from the MH random walk algorithm for the logit model. 49
2.9 Autocorrelation plots from the MH random walk algorithm for the
logit model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.10 Histograms of the means of the lower level parameters from the Gibbs
sampling algorithm for the hierarchical probit model. . . . . . . . . . 52
2.11 Density plots from the MH random walk algorithm for the logit model
for the CNG data set. . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.12 Trace plots from the MH random walk algorithm for the logit model
for the CNG data set. . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.13 Autocorrelation plots from the MH random walk algorithm for the
logit model for the CNG data set. . . . . . . . . . . . . . . . . . . . 56
2.14 Histograms of the means of the lower level parameters from the Gibbs
sampling algorithm for the hierarchical probit model for the CNG
data set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.15 (Color online) The route generator and the route trimmer applied to
a small example. Red lines denote the required street segments. Blue
lines denote the deadhead segments. The yellow line denotes the required
street segment that is removed. The green line denotes the new street
segment added to the route as a replacement for the yellow line. . . 86
2.16 (Color online) A view of a portion of the actual street network with
meter locations in the UTM format serviced by Connecticut Natural
Gas in our data set from Hartford, Connecticut. The red dots repre-
sent the meters. This network is used for our simulation experiments.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
2.17 (Color online) Simulation results for the route length. . . . . . . . . 93
2.18 (Color online) Simulation results for the number of missed meters. . 94
3.1 Example to illustrate an iteration of the RTR travel algorithm ex-
plaining the three scenarios. . . . . . . . . . . . . . . . . . . . . . . . 113
3.2 A flowchart showing the relation between the four groups and three
scenarios. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.1 An example of a CETSP with 12 customers. . . . . . . . . . . . . . 125
4.2 Node locations of instance d493 from the second group of 62 instances. 129
4.3 Studentized residual plot of 842 instances. The lines indicate the
Studentized residual values of 2 and −2. . . . . . . . . . . . . . . . . 133
4.4 Histogram of Studentized residuals of 842 instances. . . . . . . . . . 133
4.5 Normal probability plot of Studentized residuals of 842 instances. . . 134
4.6 Studentized residual plot for the second group of 62 instances. The
lines indicate the Studentized residual values of 2 and −2. . . . . . . 137
4.7 Histogram of Studentized residuals for the second group of 62 in-
stances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.8 Normal probability plot of Studentized residuals for the second group
of 62 instances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.9 Studentized residual plot for the first group of 780 instances. The
lines indicate the Studentized residual values of 2 and −2. . . . . . . 139
4.10 Histogram of Studentized residuals for the first group of 780 instances. 140
4.11 Normal probability plot of Studentized residuals for the first group
of 780 instances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
4.12 Studentized residual plot for the first set of 520 training instances.
The lines indicate the Studentized residual values of 2 and −2. . . . 143
4.13 Histogram of Studentized residuals for the first set of 520 training
instances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
4.14 Normal probability plot of Studentized residuals for the first set of
520 training instances. . . . . . . . . . . . . . . . . . . . . . . . . . . 145
4.15 Studentized residual plot for the second set of 520 training instances.
The lines indicate the Studentized residual values of 2 and −2. . . . 147
4.16 Histogram of Studentized residuals for the second set of 520 training
instances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
4.17 Normal probability plot of Studentized residuals for the second set of
520 training instances. . . . . . . . . . . . . . . . . . . . . . . . . . . 148
4.18 Studentized residual plot for the third set of 520 training instances.
The lines indicate the Studentized residual values of 2 and −2. . . . 149
4.19 Histogram of Studentized residuals for the third set of 520 training
instances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
4.20 Normal probability plot of Studentized residuals for the third set of
520 training instances. . . . . . . . . . . . . . . . . . . . . . . . . . . 150
4.21 Plot showing the best subset models for the first group of 780
instances arranged according to adjusted R². . . . . . . . . . . . . . . 153
4.22 Plot showing the best subset models for the first group of 780
instances arranged according to Mallows's Cp. . . . . . . . . . . . . . . 153
4.23 Plot showing the best subset models for the first group of 780 in-
stances arranged according to BIC. . . . . . . . . . . . . . . . . . . . 154
4.24 Studentized residual plot of 780 instances for the best adjusted R²
model. The lines indicate the Studentized residual values of 2 and −2. 154
4.25 Histogram of Studentized residuals of 780 instances for the best
adjusted R² model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
4.26 Normal probability plot of Studentized residuals of 780 instances for
the best adjusted R² model. . . . . . . . . . . . . . . . . . . . . . . . 156
4.27 Studentized residual plot of 780 instances for the best Mallows's Cp
model. The lines indicate the Studentized residual values of 2 and −2. 157
4.28 Histogram of Studentized residuals of 780 instances for the best
Mallows's Cp model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
4.29 Normal probability plot of Studentized residuals of 780 instances for
the best Mallows's Cp model. . . . . . . . . . . . . . . . . . . . . . . 158
4.30 Studentized residual plot of 780 instances for the best BIC model.
The lines indicate the Studentized residual values of 2 and −2. . . . 160
4.31 Histogram of Studentized residuals of 780 instances for the best BIC
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
4.32 Normal probability plot of Studentized residuals of 780 instances for
the best BIC model. . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.1 (Color online) An intersection with two left turns. . . . . . . . . . . 165
5.2 (Color online) An intersection with two right turns. . . . . . . . . . 166
5.3 (Color online) Map of Dupont Circle in Washington, DC. . . . . . . 167
5.4 (Color online) The RPP and the IIRPP solutions on a small grid-like
street network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
5.5 A mixed graph G (street network) with four nodes. Costs are shown
adjacent to the two arcs and one edge. . . . . . . . . . . . . . . . . . 170
5.6 Transformed graph G1 from the original graph G shown in Figure 5.5. 173
5.7 Transformed graph G2 from the original graph G shown in Figure 5.5. 177
5.8 (Color online) RPP-H route on a small grid-like street network. . . . 180
5.9 (Color online) IIRPP-F1-H and IIRPP-F2-H produce the same route
on a small grid-like street network. . . . . . . . . . . . . . . . . . . . 181
xv
5.10 Example of an 8? 8 instance. . . . . . . . . . . . . . . . . . . . . . 183
xvi
Chapter 1: Introduction
Business and industry decision making has entered the era of big data. The
growing importance of data in decision making has led to better data storage and
easier access to data for decision makers. However, decision makers often lack the
resources and analytical tools required to understand precisely what type of data they need, how
to use the available data, and how to incorporate the vast amount of data from
multiple sources including departments inside the organization and clients outside
the organization. In this dissertation, we study real-world decision-making
problems in logistics. Throughout this dissertation research, we worked in close
collaboration with companies to identify the crucial challenges that could be
addressed using data analytics, statistics, and optimization. We obtained real-world
data, held discussions with companies and their clients, and addressed company
requirements. These steps helped us to formulate our research questions and develop
models and solution methods. Since the problems considered in this dissertation
are real-world in nature, it is important to model them as data-driven, decision-
making problems. We develop data-driven optimization and statistical modeling
techniques that can synthesize and analyze different types of data to produce prac-
tically applicable and implementable solutions to real-world logistics problems and
to improve decision making for both clients and companies. Even though the collab-
oration process with different companies for the problems studied in this dissertation
was a fulfilling experience, obtaining data and practical insights can be tedious and
time-consuming. However, the importance of narrowing the gap between academic
research and practical applicability of the solutions is enough of a motivation to
strive harder for industry-driven collaborative research.
In Chapter 2, we address a significant and practical problem encountered by
utility companies. These companies collect usage data from meters on a regular ba-
sis. The usage data are collected automatically using radio-frequency identification
(RFID) technology. Each meter transmits signals from an RFID tag that are read
by a vehicle-mounted reading device within a specified distance. Currently, utility
companies generate meter reading routes by solving the close enough vehicle rout-
ing problem (CEVRP) such that all meters are within a specified distance (range
of the signal as specified by the manufacturers of the RFID devices) from at least
one location on the route. In reality, there is uncertainty while reading meters. The
signal transmitted by an RFID tag is discontinuous. The range within which each meter
can be read differs across meters and is stochastic due to weather conditions, surrounding obstacles,
interference, and decreasing battery life of the RFID tags. These factors could lead
to meters not being read.
Utility companies typically read more than 1.5 million RFID meters on a
monthly basis in a mid-sized city in the United States. Around 5-10% of those
meters are missed from the planned routes of the meter reading vehicles. Generally,
a vehicle is sent at a later time to read the missed meters, resulting in increased costs
for utility companies due to additional operational costs and overtime payments to
drivers.
We address the uncertainty issues of the RFID technology using data from a
public utility company. We use data analytics, optimization, and Bayesian statistical
models to generate routes that are both cost-effective and robust (the number of
missed reads is minimized). The stochastic meter reading problem is formulated as
a deterministic two-stage integer program (IP). An iterative algorithmic framework
addresses the stochasticity. The algorithm starts by learning from the incoming
data every time the meter reading vehicle collects readings. A two-stage IP is
solved with the updated probability of a meter being read successfully to generate
routes that are more robust for addressing the uncertainty. We motivate the use
of Bayesian statistical models to update the probabilities and also show that a
hierarchical Bayesian statistical model is justified in our problem setup. Simulation
experiments using an actual meter reading data set with five million observations
are carried out on an actual street network. We show that the hierarchical Bayesian
statistical model provides better results than other types of Bayesian statistical
models. Utility companies may be able to integrate results from the hierarchical
Bayesian statistical model into their route generating software as a decision-support
tool to produce routes that are more cost-effective and robust than the routes that
they generate currently.
In Chapter 3, we focus on an important problem experienced by delivery and
service companies. These companies need to send out multiple vehicles to deliver
customer products and provide services to customers in a city every day. Each
vehicle has a capacity constraint and each customer has a demand. A company needs
to fulfill all customer demands in a city using a fleet of vehicles. The companies
generate vehicle routes using routing software and algorithms provided by third-
party vendors. It is essential to maintain a well-defined workload balance among
the drivers of a company, i.e., the drivers should have similar route times (inclusive
of the service times). However, algorithms found in routing software tend to focus
on minimizing the total route time or average route time for the fleet of vehicles
serving a city. Large differences in workloads (large route-length variability) are
not only unfair to the drivers, but they might also lead to increased operating and
delivery costs for these companies.
For the capacitated vehicle routing problem (CVRP), we use regression models
to show that reducing route-length variability while generating the routes is an
important consideration to minimize the total operating and delivery costs for a
company when met with random traffic. We implement fast and easy modifications
to some well-known routing algorithms to reduce the standard deviation of the
routes, thereby reducing the total operating and delivery costs.
In Chapter 4, we address a real-time decision-making problem experienced in
practice that involves the close enough traveling salesman problem (CETSP). In
the CETSP, every customer has a service region and is considered visited when the
salesman visits any point in the customer's service region. The objective of the
CETSP is to visit all customers while minimizing the total distance traveled. In one application
of real-time decision-making, routing companies participating in competitive bidding
might need to respond to a large number of requests regarding route costs in a very
short amount of time. In such cases, it may be sufficient to estimate the route lengths
using information about the actual instances. In another application, during post-
disaster aerial surveillance planning or using drones to deliver emergency medical
supplies, route-length estimation would be needed to assess quickly whether the duration
to cover a region of interest would exceed the drone battery life. The traveling
salesman problem (TSP) is a special case of the CETSP, so the CETSP is at least
as difficult to solve as the TSP.
For the CETSP, we estimate the route length using regression models. Route-
length estimation approximates the route length generated by a specific algorithm
and does not necessarily approximate the optimal solution. For practical purposes,
routing companies need to know the actual costs that would be incurred using a
specific algorithm and not the optimal costs even if the optimal costs are lower than
the actual costs. The estimation model would be unique to the algorithm that was
applied even though the general framework would apply to any routing algorithm.
In Chapter 5, we address a practical problem encountered by local govern-
ments. These organizations carry out road inspections to decide which street seg-
ments to repair by recording videos using a camera mounted on a vehicle. This
process is similar to Google generating street view images. The vehicle taking the
videos needs to proceed straight or take a left turn to cover an intersection fully. A
right turn does not always capture an intersection fully and a U-turn does not cross
an intersection.
We introduce the intersection inspection rural postman problem (IIRPP), a
new variant of the rural postman problem (RPP) involving turns. The RPP is an
important arc routing problem. In an RPP, we need to find the shortest way of
connecting a given set of required street segments to form a full route. The IIRPP
is a hybrid of arc routing and node routing problems. When solving the IIRPP, we
have to make sure there is at least one left turn or straight traversal at an intersection
that is required to be inspected. The RPP is a special case of the IIRPP when there
are no intersections to be inspected, so the IIRPP is at least as difficult to solve as
the RPP. We develop two integer programming formulations of the IIRPP based on
two different graph transformations to generate least-cost vehicle routes. We also
develop a heuristic based on the RPP, and two heuristics based on the two IIRPP
formulations. We perform computational experiments to compare the performance
of the formulations and the heuristics.
In Chapter 6, we present our concluding remarks and briefly summarize our
contributions.
Chapter 2: Data-Driven Optimization and Statistical Modeling to
Improve Meter Reading for Utility Companies
2.1 Introduction
2.1.1 Background and Literature Review
Utility companies read the electric, gas, and water meters of their residential
and commercial customers on a regular basis. Typically, for a residential customer,
a meter reader visits the customer and manually reads the meter at the site. Utility
companies are interested in generating short and balanced routes where all streets
with meters that have to be read are traversed.
In the late 1970s, one of the earliest efforts to solve the meter reading problem
was by Stern and Dror (1979). Their algorithm was based on a route-first, cluster-
second approach. An Euler cycle covered all required edges (streets) in the network.
The cycle was then partitioned into balanced routes.
In the early 1990s, geographic data became available in the form of GBF/DIME
(geographic base file/dual independent map encoding) files. This led to the develop-
ment of optimization algorithms, graphics, and interactive features in meter-reader
software systems. However, the incompatibility of street data and geographic data
led to modeling challenges. Bodin and Levy (1991) used an arc-partitioning algo-
rithm to cluster street segments into balanced routes. Their computerized routing
system produced much better routes than the routes generated by utility companies.
Utility companies have to read tens of thousands of meters every month. They
need to generate highly efficient routes for meter readers. The task of generating
efficient routes is complicated due to several factors including balancing workload
among routes, meter-reading modes (walking, driving, combination of walking and
driving), density of meters in a geographic region, amount of read time, and natural
boundaries such as highways, rivers, and lakes. In the early 2000s, geographic
information systems (GIS) were combined with powerful (near-optimal) routing algorithms
to form highly visual computerized systems. Levy et al. (2002) described how a
GIS can address the complicating factors. For example, several layers of data such
as water, major highways, and railroad tracks can be displayed in a service area by
a GIS. The displayed layers can then be used to select a subset of meters to read
within the service area for route planning purposes.
In the late 2000s, the use of radio-frequency identification (RFID) technology
increased. RFID technology is used extensively in many industries for tracking
resources since it holds down cost while increasing accuracy compared to traditional
labor-intensive reading methods. During the decade, the accuracy of transmitters
and receivers improved and the cost decreased gradually with the advancement of
technology, making the RFID technology even more viable and useful. Automatic
meter reading (AMR) using RFID technology was first tested in the early 1960s,
when trials were conducted by AT&T in cooperation with a group of utilities and
Westinghouse, but it was not adopted for commercial use by utility companies until
the late 2000s. Eglese et al. (2014) gave a brief summary of the meter reading
problem from the late 1970s until the late 2000s.
An AMR system has two parts: an RFID tag and a vehicle-mounted reading
device. An RFID tag is connected to a physical meter. The tag encodes the iden-
tification number of the meter and its current reading into a digital signal. The
vehicle-mounted reading device collects the data automatically when it approaches
an RFID tag within a specified distance. Utility companies would like to design the
routes of the vehicles to cover all customers (meters) in the service area and mini-
mize the total length of the routes or the total cost of the routes. The use of RFID
technology in meter reading changes the routing problem from a standard vehicle
routing problem (VRP) to a close-enough VRP (CEVRP). Substantial savings over
traditional solutions are possible by developing routes that exploit this close-enough
feature, i.e., the meter readers have to be within a specified distance from the meters
to read them and not manually visit each one.
Most of the research on the close-enough problem is limited to Euclidean
distances and uses a node routing formulation. Dumitrescu and Mitchell (2003)
studied approximation algorithms for the close-enough traveling salesman problem
(CETSP). Gulczynski et al. (2006) and Dong et al. (2007) presented clustering
and convex hull heuristics for the CETSP in the context of meter reading. Mennell
(2009) and Behdani and Smith (2014) formulated mixed integer programs for the
CETSP. Coutinho et al. (2016) proposed an exact algorithm for the CETSP based
on a branch-and-bound procedure and second-order cone programming. Groër et al.
(2009) addressed the balanced billing cycle vehicle routing problem (BBCVRP)
which occurs when, over time, routes become inefficient and fractured with imbal-
anced workloads for the meter readers. Their three-stage algorithm for solving the
BBCVRP used partitioning heuristics and integer programming to reduce the length
of the routes and to balance the workload.
Shuttleworth et al. (2008) were the first to model the CETSP with an arc
routing formulation. They developed a two-stage process to solve the CETSP over
a street network for a single meter-reader route. In the first stage, two heuristics
(weighted bang for buck, distance weighted bang for buck) and two integer programs
specify a subset of street segments that have to be traversed by a meter reader. All
meters are within distance r from at least one location on at least one of the specified
street segments. In the second stage, a travel path (cycle) is generated that traverses
the specified street segments. Hà et al. (2014) proposed mathematical formulations
and heuristics for the close-enough arc routing problem (CEARP). In the CEARP,
traversed street segments only have to be within a specified distance from the points
of interest. Ávila et al. (2016) proposed a new mathematical formulation for the
CEARP and described its polyhedra. Renaud et al. (2017) considered a version of
the CEARP in the context of meter reading in which the probability of reading a
meter from a street segment decays exponentially as the distance from the meter to
the street segment increases. They proposed an integer programming formulation
and presented several heuristics.
There are issues with RFID technology that are not considered in the literature
that we need to take into account. An RFID tag transmits its signal at regular
time intervals rather than continuously. This is done to extend the battery
life of the RFID tags. This leads to the possibility of a missed capture of a signal
if the vehicle with the receiver is within the range of the meter only for a short
time. Also, the signal range of a meter can vary from the distance specified by the
manufacturers of the RFID devices due to weather conditions, surrounding obstacles,
signal interference from other meters in the vicinity, and decreasing battery life of
the RFID tags.
On average, utility companies read more than 1.5 million RFID meters on
a monthly basis. It is observed that around 5-10% of those meters are missed
from the planned routes of the meter reading vehicles. Currently, utility companies
generate meter reading routes by solving the CEVRP such that all meters are within
a specified distance (range of the signal as specified by the manufacturers of the
RFID devices) from at least one location on the route. Utility companies always
make a special attempt to read the missed meters for commercial and industrial
customers because these tend to generate higher revenues. For residential customers,
they want to use estimated billing for the missed meters. However, the public
utility commission in many areas will not allow estimated billing. For example, in
Illinois, utility companies have to perform actual meter reading at least every second
billing cycle (Illinois Administrative Code 2018). Similar examples can be found in
Colorado (Colorado Department of Regulatory Agencies 2018), Michigan (Michigan
Department of Labor and Economic Growth 2018), and Irving, Texas (Irving, Texas
- Code of Ordinances 2018). A vehicle has to be sent at a later time to read the
missed meters, and this leads to increased costs due to additional operational costs
and overtime payments to drivers.
2.1.2 Research Goal and Contributions
In the meter reading context, we will address the above-mentioned issues of the
RFID technology by generating routes for the CEVRP that are both cost-effective
and robust (in the sense that we seek to minimize the number of missed reads). This
is done by bringing together data analytics, statistical modeling, and optimization
techniques. The idea is to significantly reduce the number of missed meters even
though the routes that are generated may be somewhat longer than those currently
used by a utility company. For the utility companies, it is much easier and more
cost-effective if they know ex ante that they have to traverse a somewhat longer route
that leads to fewer missed meters. This substantially reduces the need to dispatch a
vehicle to read the missed meters that may be spread throughout the street network.
While past research has focused on mathematical formulations and computational
experiments on artificially generated networks, we use real networks and actual
meter reading data from utility companies to solve a more realistic version of this
problem. The real networks that we use are a few orders of magnitude larger than the
artificial networks used in the literature. More importantly, the way in which
street segments and meters are distributed on real networks is very different
from that on artificial networks. Hà (2012) gives a detailed description of
how artificial meter reading networks are systematically generated. Therefore, the
computational performance of the heuristics reported in the literature has
limited practical relevance.
We summarize the main contributions of this chapter as follows.
1. We formulate the stochastic meter reading problem as a two-stage integer
program (IP), where the Stage 1 IP is a linear IP that guarantees a pre-
specified likelihood of reading the meters. The Stage 2 IP adds deadhead
segments to the solution of the Stage 1 IP to generate the full route. The
two-stage IP formulation is deterministic even though the use of the RFID
technology makes the meter reading problem inherently stochastic.
2. We develop three Bayesian updating learning models, namely, a logit model,
a probit model, and a hierarchical probit model to capture the uncertainty in
the data and also to avoid the shortcomings of regression. We show that the
hierarchical probit model gives a more accurate estimate of the probability
that a meter is read successfully compared to logit and probit models. We
perform simulation experiments using an actual street network with meter
locations to show that the hierarchical probit model generates robust routes,
i.e., the number of missed meters is significantly lower than with the other
two Bayesian models, even though the routes may be slightly longer.
3. We present an iterative algorithmic framework. We start by learning from
the incoming data every time the meter reading vehicle collects readings. We
then re-solve the two-stage IP with the updated probability of a meter being
read successfully to generate routes that are more robust for addressing the
uncertainty. Utility companies can integrate this algorithm into their route
generating software as a decision-support tool.
Figure 2.1: (Color online) A view of the street layer, the service location layer, and
the reading events layer. The red lines represent the street segments. The green
dots represent the route traversed by the meter reading vehicle. The blue dots and
the yellow dots represent meters (customers) in the service location layer that are
read and that are missed, respectively, after the vehicle has traversed the route.
If multiple meters have the same geographic location, then all those meters are
represented by a single dot, although they have distinct account identifiers.
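The learning step of the iterative framework in contribution 3 can be illustrated
with a minimal sketch. The sketch below is not the logit, probit, or hierarchical
probit model of this chapter. As a deliberately simpler stand-in, it uses a
conjugate Beta-Bernoulli update with an assumed uniform prior to show how the
estimated probability of reading a meter could change after each trip. The names
ReadBelief and learn_from_trips are hypothetical, and the re-solving of the
two-stage IP is only indicated by a comment.

```python
from dataclasses import dataclass


@dataclass
class ReadBelief:
    """Beta(alpha, beta) belief about the probability that a meter is read."""
    alpha: float = 1.0  # prior pseudo-count of successful reads (assumed uniform prior)
    beta: float = 1.0   # prior pseudo-count of missed reads

    def update(self, was_read: bool) -> None:
        # Conjugate Beta-Bernoulli update: each observation adds one pseudo-count.
        if was_read:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of the read probability.
        return self.alpha / (self.alpha + self.beta)


def learn_from_trips(trips):
    """trips: list of dicts mapping a meter identifier to True (read) or False (missed)."""
    beliefs = {}
    for outcomes in trips:
        for meter_id, was_read in outcomes.items():
            beliefs.setdefault(meter_id, ReadBelief()).update(was_read)
        # In the framework described above, the two-stage IP would be re-solved
        # here with the updated probabilities; that step is omitted in this sketch.
    return {meter_id: belief.mean for meter_id, belief in beliefs.items()}


if __name__ == "__main__":
    example_trips = [{"m1": True, "m2": False}, {"m1": True, "m2": True}, {"m1": False}]
    print(learn_from_trips(example_trips))
```

A per-meter Beta belief ignores distance, street segment, and any sharing of
information across meters, which is what the Bayesian models of this chapter are
designed to capture.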
2.2 Description of the Data Set
The data set was gathered during the second half of 2015 by ITRON (a tech-
nology and services company) and provided by RouteSmart Technologies. ITRON
manufactures radio frequency transmitters and receivers that are used by clients for
meter reading. The data are in GIS format and have three layers.
1. Street Level Data. Information about the shape, length, and type of street
segments.
2. Service Location Data. Geographic locations of all meters that are to be read,
each indexed by a unique account identifier.
3. Reading Events Data. Records of all read events by the meter reading vehicle
in the form of the time of read (with a resolution of one second), the account
identifier of the meter that is read, and the geographic location of the vehicle
during the read.
Figure 2.2: (Color online) A magnified view of the reading events layer.
The data are represented using ArcGIS (Steiniger and Hunter 2013). In Figure
2.1, we show how the data appear in GIS format with views of the street layer, the
service location layer, and the reading events layer. After the vehicle has traversed
a portion of the route marked by the green dots, Figure 2.1 shows the meters in the
service location layer that are read (blue dots) and those that are missed (yellow
dots). Even though ITRON specifies the range of the RFID signals to be around
500 feet, some missed meters are well within that range, while some meters that are
read are well outside of it. The generated routes should account for this variability.
Figure 2.2 gives a magnified view of the reading events layer. The green dots
are farther away from each other on some street segments compared to other street
segments. Since the green dots are the vehicle locations every second, we see that
the vehicle has traveled at different speeds on different street segments.
From the data, we make the following observations. There are many account
identifiers in the reading events file that have no corresponding entry in the service
location file, i.e., the RFID readers are picking up signals from nearby RFID trans-
mitters that do not require reading by the utility companies. The read events data
have a many-to-one relationship to a service location account identifier, i.e., some
of the meters are read more than once by the meter reading device. The vehicle
location is tracked every second. However, when the read events for a single meter
are recorded, they do not occur every second along a street segment that seems to
be within range. Rather, there is generally a regular time gap between occurrences
of read events for the same meter. This confirms that an RFID tag transmits its
signal at regular time intervals rather than continuously. Some
meters that are very close to the vehicle route have been missed, probably due to a
discontinuous signal. Missed reads can also be due to the variability of the range of
a meter to transmit a signal because of weather conditions, surrounding obstacles,
or decreasing battery life of the signal transmitters.
2.3 Initial Data Analysis
Five steps are carried out sequentially to analyze the data.
Step 1. The read events of unwanted account identifiers are separated from
the read events of those account identifiers that are in the service location
layer.
Step 2. The account identifiers in the service location layer that are read at
least once are separated from the account identifiers in the service location
layer that are missed.
Step 3. For each meter (account identifier) in the service location layer that
was read at least once, the number of reads is calculated.
Step 4. For each meter in the service location layer that was read more than
once, the minimum time interval between any two consecutive reads is calculated
using the time stamps of the read events.
Step 5. For each meter in the service location layer that was read at least
once, the maximum read distance is calculated using the location of the meter
and the locations of the vehicle during the read events.
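Steps 1 to 5 can be sketched in a few lines of Python. This is an illustration
under stated assumptions and not the code used for the analysis: we assume that
locations are already projected to planar coordinates in feet, that each read
event is a tuple (time in seconds, account identifier, vehicle x, vehicle y), and
that the function name analyze_reads and the record layout are hypothetical.

```python
from collections import defaultdict
from math import hypot


def analyze_reads(service_locations, read_events):
    """service_locations: {account_id: (x_ft, y_ft)} (assumed planar coordinates).
    read_events: iterable of (time_sec, account_id, vehicle_x_ft, vehicle_y_ft)."""
    # Step 1: keep only read events whose account identifier is in the service location layer.
    wanted = [e for e in read_events if e[1] in service_locations]

    events_by_meter = defaultdict(list)
    for time_sec, account_id, vx, vy in wanted:
        events_by_meter[account_id].append((time_sec, vx, vy))

    # Step 2: separate meters read at least once from meters that are missed.
    missed = set(service_locations) - set(events_by_meter)

    summary = {}
    for account_id, events in events_by_meter.items():
        events.sort()  # order the reads by time stamp
        times = [t for t, _, _ in events]
        mx, my = service_locations[account_id]
        # Step 3: number of reads of this meter.
        reads = len(events)
        # Step 4: minimum gap between consecutive reads (undefined for a single read).
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        min_gap = min(gaps) if gaps else None
        # Step 5: maximum distance between the meter and the vehicle over all reads.
        max_dist = max(hypot(vx - mx, vy - my) for _, vx, vy in events)
        summary[account_id] = {"reads": reads, "min_gap": min_gap, "max_dist": max_dist}
    return wanted, missed, summary
```

With real GIS data, the Euclidean distance would be replaced by a distance that
is appropriate for the coordinate system of the layers.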
In Table 2.1, we provide the results based on the analysis from Steps 1 to 5.
Total number of meters in the service location layer                      474
Number of meters in the service location layer that are read              209
Total number of read events                                            28,745
Number of read events from meters in the service location layer           827
Number of street segments traversed in the route                            7
Time gap between consecutive signal transmissions (sec)                    13
Maximum read distance among all meters in the service location
layer (feet)                                                            3,510
Table 2.1: Results based on the analysis from Step 1 to Step 5.
In Figure 2.3, we use box plots to show the relationship between the minimum
time interval between reads and the number of reads. Consider the value of three
on the x-axis, i.e., the number of reads is three. We are considering meters in the
service location layer that are read exactly three times from the route traversed by
the meter reading vehicle (there are 29 such meters). For these meters, we have a
time interval between their first read and second read, and a time interval between
their second read and third read. We take the minimum of these two time intervals
for each of the 29 meters. The 29 minimum time interval values, which range from
13 seconds to 59 seconds, are shown using a box plot at x = 3. We observe that
most of the large values of the minimum time interval occur for meters with a
small number of reads. When the number of reads is small for a meter, the vehicle
is probably farther from the meter most of the time on the route and probably came
close to the meter for only small portions of the route. When the number of reads
is larger for a meter, the vehicle is probably closer to the meter for a longer
duration. If the signal were transmitted continuously, the vehicle would have read
that meter every second. However, this is not the case.
Instead, for this data set, the minimum time intervals settle at a constant value of 13
seconds. This value is the time gap between consecutive signals sent by the RFID
transmitters.
Figure 2.3: Minimum time interval between reads.
Figure 2.4: Maximum read distance.
In Figure 2.4, we use box plots to show the relationship between the maximum
read distance and the number of reads. Again, consider the value of three on the
x-axis. For the 29 meters, we have the read distances for each of their three reads.
We take the maximum of these three read distances for each of the meters. These
29 maximum read distance values, which range from 808 feet to 3,052 feet, are
shown using a box plot at x = 3. Larger values of the maximum read distance
appear to be more likely for meters with a smaller number of reads. This is
consistent with our observations from Figure 2.3. When the number of reads is
small for a meter, the vehicle is probably farther from the meter most
of the time on a route; when the number of reads is larger for a meter, the vehicle
is probably closer to the meter for a longer duration.
We perform four additional steps of analysis on the data with respect to a
meter being read or not being read.
Step 6. The route traversed by a meter reading vehicle is discretized (like
the green dots in Figure 2.2) using the distinct geographic coordinates of the
vehicle position during the read events.
Step 7. For all the meters in the service location layer, the shortest distance
from the route traversed by a vehicle is calculated using the distinct locations
of the vehicle. The shortest distance from the route will be used as a proxy
for the distance from the meters to the route.
Step 8. Around each of the distinct points in the discretized route, a circular disc
(with a radius from 100 feet to 1,000 feet in steps of 100 feet) is considered.
Step 9. For each radius, we count the number of meters within at least one of
the circular discs and the number of those meters that are read (regardless of
where they are read from). We then calculate the fraction of meters read for each radius.
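Steps 6 to 9 can be sketched as follows. As before, this is an illustration under
assumptions: coordinates are taken to be planar and in feet, and the function name
fraction_read_by_radius and its inputs are hypothetical. The sketch uses the fact
that a meter lies within at least one disc of radius r exactly when its shortest
distance to the discretized route is at most r.

```python
from math import hypot


def fraction_read_by_radius(vehicle_positions, meters, read_ids, radii):
    """vehicle_positions: iterable of (x_ft, y_ft) recorded during read events.
    meters: {account_id: (x_ft, y_ft)}; read_ids: set of account ids read at least once.
    Returns rows (radius, cum_read, cum_total, band_read, band_total)."""
    # Step 6: discretize the route using the distinct vehicle positions.
    route_points = sorted(set(vehicle_positions))

    # Step 7: shortest distance from each meter to the discretized route.
    shortest = {
        account_id: min(hypot(x - rx, y - ry) for rx, ry in route_points)
        for account_id, (x, y) in meters.items()
    }

    rows = []
    previous = float("-inf")
    for radius in sorted(radii):
        # Steps 8 and 9: meters covered by at least one disc of this radius (cumulative),
        # and meters between the previous radius and this radius (non-cumulative).
        cumulative = [a for a, d in shortest.items() if d <= radius]
        band = [a for a, d in shortest.items() if previous < d <= radius]
        rows.append((
            radius,
            sum(a in read_ids for a in cumulative), len(cumulative),
            sum(a in read_ids for a in band), len(band),
        ))
        previous = radius
    return rows
```

Each returned row provides the counts a and b for both the cumulative and the
non-cumulative case, so the fractions a/b follow directly whenever b is positive.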
In Table 2.2, we provide the results based on the analysis from Step 9. The
fraction of meters read is calculated both cumulatively and non-cumulatively for
each of the 10 different radii, ranging from 100 feet to 1,000 feet. The entries in the
two columns have the form a/b. In the cumulative case, b denotes the number of
meters within that radius. In the non-cumulative case, b denotes the number of
meters between that radius and the previous (smaller) radius. In both cases, a
denotes the number of meters read out of those b meters. The fractions in the cumulative case
show a gradual decrease in success with an increase in the distance of meters from
the route. We do not see a specific trend for the non-cumulative case. We note that
the smallest value of the fraction occurs for meters that are at a distance of 500 feet
to 600 feet from the route. This observation indicates that the shortest distance
of a meter from the route is not the only key factor for reading a meter successfully.
Otherwise, the non-cumulative case would have followed the same trend as the
cumulative case.
Radius (feet)   Cumulative Success   Non-Cumulative Success
100             14/14 = 1.00         14/14 = 1.00
200             35/35 = 1.00         21/21 = 1.00
300             53/54 = 0.98         18/19 = 0.95
400             64/67 = 0.96         11/13 = 0.85
500             74/78 = 0.95         10/11 = 0.91
600             85/94 = 0.90         11/16 = 0.69
700             97/108 = 0.90        12/14 = 0.86
800             117/131 = 0.89       20/23 = 0.87
900             129/147 = 0.88       12/16 = 0.75
1000            149/171 = 0.87       20/24 = 0.83
Table 2.2: Results based on the analysis from Step 9.
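The relationship between the two columns of Table 2.2 can be checked with a short
script. The counts are copied from the table, and the non-cumulative counts are
recovered as differences of consecutive cumulative counts. The variable and
function names are illustrative only.

```python
radii = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
# Cumulative counts a (read) and b (within radius) from Table 2.2.
cum_read = [14, 35, 53, 64, 74, 85, 97, 117, 129, 149]
cum_total = [14, 35, 54, 67, 78, 94, 108, 131, 147, 171]


def band_counts(cumulative):
    """Differences of consecutive cumulative counts give the per-band counts."""
    return [current - previous
            for previous, current in zip([0] + cumulative[:-1], cumulative)]


band_read = band_counts(cum_read)
band_total = band_counts(cum_total)
band_fraction = [a / b for a, b in zip(band_read, band_total)]

# Index of the distance band with the smallest fraction of meters read.
worst = min(range(len(radii)), key=lambda k: band_fraction[k])

if __name__ == "__main__":
    for k, radius in enumerate(radii):
        print(radius, f"{band_read[k]}/{band_total[k]} = {band_fraction[k]:.2f}")
    print("lowest success in the band ending at", radii[worst], "feet")
```

The differencing reproduces the non-cumulative column of Table 2.2, and the band
with the lowest success is the one between 500 feet and 600 feet.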
2.4 Integer Programming Formulation
We formulate the meter reading problem with RFID technology as a two-stage
IP. The Stage 1 IP finds the street segments that are to be traversed for reading each
meter with a pre-specified chance of being read. The solution of the Stage 1 IP gives
street segments spread across the street network, which do not necessarily form
a full route. A mixed rural postman problem finds the shortest way of connecting
a given set of required street segments to form a full route on a mixed graph with
edges (two-way street segments) and arcs (one-way street segments). The Stage 2
IP solves a mixed rural postman problem that adds deadhead segments (extra street
segments not required for reading meters) to the solution of the Stage 1 IP to obtain
the full route and it ensures that the depot (denoted by a node on the graph) is a
part of the route.
2.4.1 Stage 1 IP Formulation
Consider a street network as a mixed graph denoted by $G = (V, E \cup A)$, where
$E$ denotes the set of edges, $A$ denotes the set of arcs, and $V$ denotes the set
of nodes. Let $c_j \ge 0$ be the cost (length) of street segment $j$. Let $I$ be the set of
meters. Let $p_{ij}$ be the probability that meter $i$ is read at least once from street
segment $j$. Let $L_i \in [0, 1]$ be the specified likelihood of reading meter $i$ from the
full route. We define $x_j$ to be the binary decision variable denoting whether or not
street segment $j$ should be traversed. The Stage 1 IP formulation is given below.
The objective function (2.1) minimizes the total cost (length). Constraints
(2.2) select the values of the binary decision variables (xj) so that the probability
of reading meter i is at least Li. Constraints (2.3) define the decision variables.
In general, the solution of the Stage 1 IP, i.e., the graph induced by the required
edges and arcs $G_R = (V, E_R \cup A_R)$, where $E_R \subseteq E$ and $A_R \subseteq A$ denote the sets
of required edges and arcs, respectively, is not connected. The objective value of
the Stage 1 IP will be greater for larger values of $L_i$. The greater the need to read
meter $i$ during the next meter reading trip, the larger should be the value of $L_i$
set by the utility company. In cases where the utility company can manage using
estimated billing for meter $i$ during the next billing cycle, the value of $L_i$ should
be set close to 0. We note that constraints (2.2) can be linearized in the decision
variables by taking logarithms of both sides, which gives
$\sum_{j \in E \cup A} x_j \log(1 - p_{ij}) \le \log(1 - L_i)$ for all meters $i$, yielding a linear
Stage 1 IP.
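\begin{align*}
\text{(Stage 1 IP)} \quad \min \quad & \sum_{j \in E \cup A} c_j x_j \tag{2.1} \\
\text{s.t.} \quad & \prod_{j \in E \cup A} (1 - p_{ij})^{x_j} \le (1 - L_i) \quad \forall i \in I \tag{2.2} \\
& x_j \in \{0, 1\} \quad \forall j \in E \cup A \tag{2.3}
\end{align*}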
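To make the linearization concrete, the following sketch checks on a tiny hypothetical instance that the product form of constraint (2.2) and its logarithmic form accept exactly the same binary vectors, and then finds the cheapest feasible selection by enumeration. The costs, probabilities, and likelihoods are invented for illustration, and brute-force enumeration stands in for an IP solver; it is not the solution method used in this work. The log form assumes every probability and likelihood is strictly below 1.

```python
import math
from itertools import product

def product_ok(x, p_i, L_i):
    """Constraint (2.2): prod_j (1 - p_ij)^x_j <= 1 - L_i."""
    miss = math.prod((1.0 - p) ** xj for p, xj in zip(p_i, x))
    return miss <= 1.0 - L_i + 1e-12

def linear_ok(x, p_i, L_i):
    """Log form: sum_j x_j * log(1 - p_ij) <= log(1 - L_i)."""
    lhs = sum(xj * math.log(1.0 - p) for p, xj in zip(p_i, x))
    return lhs <= math.log(1.0 - L_i) + 1e-12

def solve_stage1(cost, p, L):
    """Enumerate all binary x and return (cost, x) of the cheapest feasible one."""
    best = None
    for x in product((0, 1), repeat=len(cost)):
        if all(linear_ok(x, p_i, L_i) for p_i, L_i in zip(p, L)):
            total = sum(c * xj for c, xj in zip(cost, x))
            if best is None or total < best[0]:
                best = (total, x)
    return best

# Hypothetical data: 3 street segments, 2 meters; p[i][j] is the read probability.
cost = [4.0, 3.0, 5.0]
p = [[0.8, 0.5, 0.1],
     [0.2, 0.7, 0.8]]
L = [0.75, 0.75]

# Both forms of the constraint agree on every binary vector.
agree = all(product_ok(x, p_i, L_i) == linear_ok(x, p_i, L_i)
            for x in product((0, 1), repeat=3) for p_i, L_i in zip(p, L))
best = solve_stage1(cost, p, L)
print("forms agree:", agree, "best:", best)
```

Enumeration is only viable for a handful of segments; on a real street network the linear form would be handed to an integer programming solver.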
For values of Li close to 1, constraints (2.2) can be infeasible for some meter
i even when the meter reading vehicle traverses all street segments in the network
(xj = 1 for all street segments j), i.e., meter i cannot be read automatically with
probability of at least Li. In that case, the driver of the meter reading vehicle will
need to park the vehicle on the closest street segment and read meter i manually.
This means that meter i is read with probability 1 from the closest street segment,
i.e., the Stage 1 IP is solved with pij = 1, where j is the closest street segment to
meter $i$. This will enforce $x_j = 1$ in the Stage 1 IP solution, and, therefore, street
segment $j$ will be in the set of required street segments. Let $M_R \subseteq E_R \cup A_R$ denote
the subset of the required street segments that are needed to manually read some
of the meters, i.e., $p_{ij} = 1$ for all $j \in M_R$. We consider a constant stoppage time to
manually read meter i from street segment j. Accordingly, we add a penalty to the
Stage 2 IP objective value as a proxy for the distance that could have been traversed
during the stoppage time.
2.4.2 Stage 2 IP Formulation
For $S_1, S_2 \subseteq V$, $(S_1 : S_2)$ denotes the set of edges and arcs with one endpoint
in $S_1$ and the other endpoint in $S_2$. $A(S_1 : S_2) = \{(i, j) \in A : i \in S_1, j \in S_2\}$ denotes
the set of arcs with one endpoint in $S_1$ and the other endpoint in $S_2$. $E(S_1 : S_2) =
\{(i, j) \in E : i \in S_1, j \in S_2\}$ denotes the set of edges with one endpoint in $S_1$ and
the other endpoint in $S_2$. $E(S)$ and $A(S)$ denote the sets of edges and arcs,
respectively, with both endpoints in $S$. For $S \subset V$, $\delta^{+}(S) = A(S : V \setminus S)$,
$\delta^{-}(S) = A(V \setminus S : S)$, and $\delta(S) = E(S : V \setminus S)$. We also write
$\delta^{*}(S) = \delta(S) \cup \delta^{+}(S) \cup \delta^{-}(S) = (S : V \setminus S)$.
If $S = \{v_i\}$, we simply write $\delta(i)$, $\delta^{+}(i)$, $\delta^{-}(i)$, or $\delta^{*}(i)$. The vertex sets of the
connected components of $G_R$ are denoted by $V_1, \ldots, V_p$. The depot is denoted by
the node $v_0 \in V$. We consider a single meter reading vehicle. We define $y_j$ to be the
non-negative integer decision variable denoting the number of times street segment
$j$ is traversed in the full route. For $F \subseteq E \cup A$, $Y(F) = \sum_{j \in F} y_j$. The Stage 2 IP
formulation is given by the following.
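\begin{align*}
\text{(Stage 2 IP)} \quad \min \quad & \sum_{j \in E \cup A} c_j y_j \tag{2.4} \\
\text{s.t.} \quad & Y(\delta^{*}(0)) \ge 1 \tag{2.5} \\
& Y(\delta^{*}(i)) \equiv 0 \bmod 2 \quad \forall i \in V \tag{2.6} \\
& Y(\delta^{+}(S)) \ge 1 \quad \forall S = \cup_{k \in Q} V_k,\ Q \subset \{1, \ldots, p\} \tag{2.7} \\
& Y(\delta^{+}(S)) - Y(\delta^{-}(S)) \le Y(\delta(S)) \quad \forall S \subset V \tag{2.8} \\
& y_j \ge 1 \text{ and integer} \quad \forall j \in E_R \cup A_R \tag{2.9} \\
& y_j \ge 0 \text{ and integer} \quad \forall j \in (E \cup A) \setminus (E_R \cup A_R) \tag{2.10}
\end{align*}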
The objective function (2.4) minimizes the total cost (length) of the route.
Constraint (2.5) ensures that the depot is a part of the route. Constraints (2.6) are
the flow conservation constraints, i.e., every node has an even degree in the route.
Constraints (2.7) are the disjoint subtour elimination constraints, i.e., the required
street segments obtained in the Stage 1 IP are connected in the route. Constraints
(2.8) are the balanced-set inequalities, i.e., the difference between the number of arcs
in the route entering S and the number of arcs in the route leaving S cannot be more
than the number of edges in the route between S and V \S. Constraints (2.9) define
the decision variables for those street segments j which are required to be traversed
by the Stage 1 IP, i.e., xj = 1. Constraints (2.10) define the decision variables for
those street segments j which are not required to be traversed by the Stage 1 IP,
i.e., xj = 0. The Stage 1 IP solution already meets the specified likelihood Li of
reading each meter i. The deadhead segments added in the Stage 2 IP increase the
likelihood of reading the meters because the meter reading vehicle is also receiving
signals while traversing the deadhead segments. The Stage 2 IP formulation without
constraint (2.5) is the formulation for the mixed rural postman problem (Corberán
et al. 2014).
2.5 Jensen's Inequality for the Stage 1 IP
Jensen's inequality generalizes the statement that the secant line of a convex
function lies above the graph of the function. For just two points $x_1$ and $x_2$ and a
weight $t \in [0, 1]$, the secant line consists of weighted means of the convex function
evaluated at the points, $t f(x_1) + (1 - t) f(x_2)$, while the graph of the function is the
convex function evaluated at the weighted means of the points, $f(t x_1 + (1 - t) x_2)$.
Thus, Jensen's inequality is
\[ f(t x_1 + (1 - t) x_2) \le t f(x_1) + (1 - t) f(x_2). \]
In the context of probability theory, it is generally stated in the following form:
if $X$ is a random variable and $\varphi$ is a convex function, then
\[ \varphi(E(X)) \le E(\varphi(X)). \]
In the context of our meter reading problem, let $\tilde{p}_{ij}$ be the random variable
denoting the probability that meter $i$ is read from street segment $j$ at least once
and $p_{ij} = E(\tilde{p}_{ij})$, where $E(\cdot)$ denotes the expected value. From Jensen's inequality it
follows that
\[ (1 - p_{ij})^{x_j} \le E\big((1 - \tilde{p}_{ij})^{x_j}\big) \tag{2.11} \]
and the equality holds for $x_j$ values of 0 and 1. In constraint (2.2), $(1 - p_{ij})^{x_j}$
denotes the probability that meter i is missed from street segment j when street
segment j is traversed xj times. If we allow the decision variables xj to attain
integer values greater than 1, i.e., the street segments can be repeated, then the
meter reading vehicle may traverse street segment j several times (xj > 1 and
integer) to read meters from street segment j with higher probability. In that
case, from (2.11) we can observe that the probability that meter i is missed from
street segment j when street segment j is traversed xj times is underestimated and
therefore, the probability that meter i is read from street segment j at least once
when street segment $j$ is traversed $x_j$ times is overestimated. The equality in (2.11)
also holds when it is assumed that the probability that meter $i$ is missed from street
segment $j$ is independent across multiple traversals of street segment $j$, since in
that case, $E((1 - \tilde{p}_{ij})^{x_j}) = (E(1 - \tilde{p}_{ij}))^{x_j} = (1 - p_{ij})^{x_j}$. But, if street segment $j$ is
traversed multiple times on the same day or even within a span of a few days, then
the probability that meter $i$ is missed from street segment $j$ should be correlated
across those traversals.
Let us consider an example with $x_j = 2$. The worst-case scenario is that
$\tilde{p}_{ij}$ takes the values 0 and 1 with equal probability. So, $p_{ij} = (0 + 1)/2 = 0.5$ and
therefore, $(1 - p_{ij})^{x_j} = (1 - 0.5)^2 = 0.25$. On the other hand, we have
$E((1 - \tilde{p}_{ij})^{x_j}) = ((1 - 1)^2 + (1 - 0)^2)/2 = 0.5$. So, in this example the probability
that meter $i$ is missed from street segment $j$ when street segment $j$ is traversed
twice is underestimated by $0.5 - 0.25 = 0.25$,
and therefore, the probability that meter $i$ is read from street segment $j$ at least
once when street segment $j$ is traversed twice is overestimated by the same amount.
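The arithmetic of this example can be reproduced directly. The short sketch below is an illustration only, with a helper name of our own choosing. It compares the plug-in miss probability, computed from the mean read probability, with the expected miss probability for a discrete distribution of the read probability.

```python
def miss_prob_gap(values, weights, x):
    """Return (plug-in miss prob, expected miss prob, gap) for x traversals.

    values/weights describe a discrete distribution of the read probability.
    """
    mean_p = sum(w * v for v, w in zip(values, weights))
    plug_in = (1.0 - mean_p) ** x                       # (1 - E[p])^x
    expected = sum(w * (1.0 - v) ** x for v, w in zip(values, weights))  # E[(1 - p)^x]
    return plug_in, expected, expected - plug_in

# Read probability equal to 0 or 1 with equal chance, segment traversed twice.
plug_in, expected, gap = miss_prob_gap([0.0, 1.0], [0.5, 0.5], x=2)
print(plug_in, expected, gap)

# With a single traversal the two quantities coincide, as stated for (2.11).
_, _, gap_single = miss_prob_gap([0.0, 1.0], [0.5, 0.5], x=1)
print(gap_single)
```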
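The estimation bias is
\begin{align*}
b(p_{ij}) &= E\big((1 - \tilde{p}_{ij})^{x_j}\big) - (1 - p_{ij})^{x_j} \\
&\approx E\Big(\Big(\frac{d}{dp_{ij}} (1 - p_{ij})^{x_j}\Big)(\tilde{p}_{ij} - p_{ij})\Big)
 + \frac{1}{2} E\Big(\Big(\frac{d^2}{dp_{ij}^2} (1 - p_{ij})^{x_j}\Big)(\tilde{p}_{ij} - p_{ij})^2\Big) \\
&= 0 + \frac{1}{2} \Big(\frac{d^2}{dp_{ij}^2} (1 - p_{ij})^{x_j}\Big) Var(\tilde{p}_{ij}).
\end{align*}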
Constraint (2.2) can be modified as $\prod_{j \in E \cup A} \big((1 - p_{ij})^{x_j} + b(p_{ij})\big) \le (1 - L_i)$
to correct for the overestimation in the probability that meter $i$ is read from street
segment $j$ at least once when street segment $j$ is traversed $x_j$ times. With this
modified constraint, the Stage 1 IP cannot be linearized in the decision variables, because
$\sum_{j \in E \cup A} \log\big((1 - p_{ij})^{x_j} + b(p_{ij})\big) \le \log(1 - L_i)$ is not linear in $x_j$.
To compute $b(p_{ij})$ we need to compute $Var(\tilde{p}_{ij})$, which is not easy to estimate.
So, the most reasonable way around
is to solve the linear Stage 1 IP.
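For reference, the second-order term of the bias can be written out explicitly. This is a short worked step added for illustration, writing $\tilde{p}_{ij}$ for the random read probability and $p_{ij}$ for its mean:

```latex
\frac{d^2}{dp_{ij}^2}(1 - p_{ij})^{x_j} = x_j (x_j - 1)(1 - p_{ij})^{x_j - 2}
\quad\Longrightarrow\quad
b(p_{ij}) \approx \tfrac{1}{2}\, x_j (x_j - 1)(1 - p_{ij})^{x_j - 2}\, Var(\tilde{p}_{ij}).
```

The approximation vanishes for $x_j = 0$ and $x_j = 1$, in agreement with the equality cases of (2.11). For $x_j = 2$ it reduces to $Var(\tilde{p}_{ij})$. In the two-point example with $x_j = 2$, $Var(\tilde{p}_{ij}) = 0.5 - 0.25 = 0.25$, which equals the gap of 0.25 computed there, because the expansion is exact when the function is quadratic.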
2.6 Regression
In order to solve the Stage 1 IP, we need to estimate the values of the pij's.
We use regression models. The dependent variable in these models is denoted by
Read OR Notij (whether or not meter i was read from street segment j). The
data elements have the form of 1 and 0, where 1 indicates that meter i is read
from street segment j and 0 indicates that meter i is not read from street segment
j. The predicted values of the dependent variable in the regression model have to
be between 0 and 1 which will denote the probability pij. Based on the type of
the data we have and our requirements on the predicted values of the dependent
variable, logit and probit models are considered. The independent variables in
these models are: Shortest Distanceij (shortest distance between meter i and street
segment j), No of Pulsesj (number of pulses the meter reading vehicle can receive
from the meter while traveling on street segment j), and No of Customersi (number
of meters within 500 feet from meter i; 500 feet is the range of the RFID signals as
specified by ITRON, so the signals are strong enough to interfere with each other
within 500 feet). Shortest Distanceij should have a negative coefficient because the
larger the shortest distance between meter i and street segment j is, the smaller the
value of pij. No of Pulsesj is obtained from the amount of time the meter reading
vehicle spent on street segment j divided by the time interval between the RFID
signal transmissions. If the meter reading vehicle travels at a higher speed through
street segment j, then the time spent by the vehicle on street segment j is smaller
and, therefore, the No of Pulsesj is lower. No of Pulsesj should have a positive
coefficient because the greater the number of pulses the meter reading vehicle can
receive from the meters while traveling on street segment j, the larger the value of
pij. No of Customersi is a measure of density of meters in a region. It is important
because, with a large number of meters in a region, the interference of the RFID
signals is greater, so the signals die out quickly. No of Customersi should have a
negative coefficient because the greater the number of meters surrounding meter i,
the smaller the value of pij.
We constructed six logistic regression models and six probit regression models.
For the logistic and probit regressions, Model 1 uses three independent variables:
Shortest Distanceij, No of Pulsesj, and No of Customersi. Model 2 adds indicator
variables for the traversed street segments to Model 1. Model 3 adds indicator vari-
ables for the meters to Model 1. Model 4 adds indicator variables for the traversed
street segments to Model 3. Model 5 uses Shortest Distanceij, No of Pulsesj, and
indicator variables for the meters. Model 6 adds indicator variables for the traversed
street segments to Model 5.
In logistic regression and probit regression, the parameters are estimated using
the maximum likelihood estimation (MLE) method. We could use McFadden's R2
(1 - Residual Deviance/Null Deviance) to assess our models. Louviere et al. (2000)
mention that values of McFadden's R2 between 0.2 and 0.4 are considered to be
indicative of extremely good model fits. Simulations by Domencich and McFadden
(1975) showed that this range is equivalent to 0.7 to 0.9 for an R2 value from ordinary
least squares (OLS). However, McFadden's R2 is similar to R2 from OLS in that
its value never decreases as new predictors are added to the model. Instead, we
use McFadden's Adjusted R2 (1 - Akaike Information Criterion/Null Deviance) to
assess our models. It is similar to the Adjusted R2 from OLS in that it penalizes
for using additional predictors.
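The two measures can be computed from fitted probabilities as in the sketch below. It is an illustration under stated assumptions: the responses are ungrouped 0/1 values, so the deviance equals minus twice the log-likelihood and the AIC equals the residual deviance plus twice the number of parameters. The data and the function names are hypothetical.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 responses y under probabilities p."""
    return sum(math.log(pk) if yk == 1 else math.log(1.0 - pk) for yk, pk in zip(y, p))

def mcfadden(y, p_fitted, n_params):
    """Return (McFadden's R2, McFadden's Adjusted R2) as defined in the text."""
    p_null = sum(y) / len(y)                       # intercept-only model
    null_dev = -2.0 * log_likelihood(y, [p_null] * len(y))
    resid_dev = -2.0 * log_likelihood(y, p_fitted)
    aic = resid_dev + 2.0 * n_params               # AIC for ungrouped binary data
    return 1.0 - resid_dev / null_dev, 1.0 - aic / null_dev

# Hypothetical responses and fitted probabilities from a 2-parameter model.
y = [1, 0, 0, 1, 0, 0]
p_fitted = [0.8, 0.1, 0.2, 0.7, 0.3, 0.1]
r2, adj_r2 = mcfadden(y, p_fitted, n_params=2)
print(round(r2, 3), round(adj_r2, 3))
```

The adjusted value is always the smaller of the two, and the difference grows with the number of parameters, which is what penalizes models with many dummies.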
2.7 Regression Results
We use only the traversed street segments to estimate the coefficients of the
regression models. We can only determine that a meter was read or not read on
street segments traversed by the meter reading vehicle. Thus, the data used for
estimating the coefficients has 474 meters and 7 street segments. Therefore, the size
of the data set is 3,318 (474 × 7).
In Table 2.3, we provide summary statistics of the dependent and the indepen-
dent variables. In Table 2.4, we give the correlation between each pair of variables.
From Table 2.3, we see that only 15.2% of the 3,318 values for Read OR Not have
a value of 1 and the rest have a value of 0. This is due to the fact that many
meters are far away from the route, so that many of them are never read from
the route. From Table 2.4, we see that the correlation between Read OR Not and
Shortest Distance is negative and the largest in magnitude. Read OR Not has a slight
positive correlation with both No of Pulses and No of Customers.
Shortest Distance and No of Customers have a high negative correlation that can
make the coefficient of No of Customers negative in the regression models.
Statistic            Read OR Not   Shortest Distance   No of Pulses   No of Customers
Mean                 0.152         2,528.714           5.500          33.181
Standard Deviation   0.359         1,491.172           6.165          15.218
Minimum              0.000         62.804              1.500          0.000
25th percentile      0.000         1,231.286           1.500          21.000
Median               0.000         2,348.150           2.500          31.000
75th percentile      0.000         3,695.959           8.500          43.000
Maximum              1.000         5,759.010           19.500         72.000
Table 2.3: Summary statistics for the dependent and the independent variables.

                     Read OR Not   Shortest Distance   No of Pulses   No of Customers
Read OR Not          1.000         -0.508              0.106          0.143
Shortest Distance                  1.000               -0.057         -0.484
No of Pulses                                           1.000          0.000
No of Customers                                                       1.000
Table 2.4: Correlation between the dependent and the independent variables.

In Tables 2.5 and 2.6, we present the logistic regression results and the probit
regression results, respectively. We give the means of the coefficients of the
independent variables and their standard deviations in parentheses. The results in
Tables 2.5 and 2.6 for the six models are similar for both logistic regressions and
probit regressions. For each of the six models in both regressions, the coefficient of
Shortest Distance is always significant and negative; the coefficient of No of Pulses
whenever significant is positive; the coefficient of No of Customers whenever
significant is negative. In Figures 2.5 and 2.6, we give the histograms for the fitted
values of the dependent variable from the logistic regression models and the probit
regression models, respectively. The histograms for each of the six logistic regression
models are similar to the histograms for each of the six respective probit regression
models.

Dependent variable: Read OR Not
                         (1)         (2)         (3)         (4)         (5)         (6)
Shortest Distance        -0.003***   -0.003***   -0.007***   -0.007***   -0.007***   -0.007***
                         (0.0002)    (0.0002)    (0.0005)    (0.0010)    (0.0005)    (0.0010)
No of Pulses             0.060***    -0.004      0.091***    0.165**     0.091***    0.165**
                         (0.010)     (0.053)     (0.016)     (0.080)     (0.016)     (0.080)
No of Customers          -0.017***   -0.021***   0.060       0.057
                         (0.005)     (0.005)     (0.204)     (0.202)
Constant                 2.852***    3.680***    8.137       7.983       9.577***    9.357***
                         (0.258)     (0.382)     (6.035)     (5.951)     (1.669)     (1.641)
Street Dummies           No          Yes         No          Yes         No          Yes
Customer Dummies         No          No          Yes         Yes         Yes         Yes
McFadden's Adjusted R2   0.514       0.525       0.377       0.381       0.377       0.381
*p<0.1; **p<0.05; ***p<0.01
Table 2.5: Logistic regression results.

Dependent variable: Read OR Not
                         (1)         (2)         (3)         (4)         (5)         (6)
Shortest Distance        -0.002***   -0.002***   -0.004***   -0.004***   -0.004***   -0.004***
                         (0.0001)    (0.0001)    (0.0002)    (0.0003)    (0.0002)    (0.0003)
No of Pulses             0.032***    -0.009      0.052***    0.084*      0.052***    0.084*
                         (0.005)     (0.030)     (0.009)     (0.044)     (0.009)     (0.044)
No of Customers          -0.013***   -0.015***   0.032       0.032
                         (0.003)     (0.003)     (0.111)     (0.110)
Constant                 1.721***    2.180***    4.479       4.378       5.254***    5.138***
                         (0.147)     (0.213)     (3.411)     (3.377)     (0.986)     (0.977)
Street Dummies           No          Yes         No          Yes         No          Yes
Customer Dummies         No          No          Yes         Yes         Yes         Yes
McFadden's Adjusted R2   0.512       0.523       0.377       0.381       0.377       0.381
*p<0.1; **p<0.05; ***p<0.01
Table 2.6: Probit regression results.

Figure 2.5: Histograms of fitted values from the logistic regressions. [Six panels,
Models 1 to 6; horizontal axis: fitted values from 0.0 to 1.0; vertical axis: frequency.]

Figure 2.6: Histograms of fitted values from the probit regressions. [Six panels,
Models 1 to 6; horizontal axis: fitted values from 0.0 to 1.0; vertical axis: frequency.]

In Tables 2.5 and 2.6 for both regressions, all three independent variables are
significant at the 1% level in Model 1. In Model 2, for both regressions, No of Pulses
is not significant, and the other two independent variables are significant at
the 1% level in the presence of street dummies. In Model 3, for both regressions,
No of Customers is not significant, and the other two independent variables are
significant at the 1% level in the presence of customer dummies. In Model 4, for
both regressions, No of Customers is not significant, Shortest Distance is significant
at the 1% level, and No of Pulses is significant at the 5% level and the 10% level for
logistic regression and probit regression, respectively, in the presence of both street
dummies and customer dummies. In two of the first four models, No of Customers
is not significant, so we construct two models with customer dummies that leave out
No of Customers. In Model 5, for both regressions, the two independent variables
are significant at the 1% level in the presence of customer dummies. In Model 6,
for both regressions, Shortest Distance is significant at the 1% level, and No of Pulses
is significant at the 5% level and the 10% level for logistic regression and
probit regression, respectively, in the presence of both street dummies and customer
dummies. Based on the results in Tables 2.5 and 2.6, Model 1 and Model 2 perform
the best for both logistic regression and probit regression, with Model 2 performing
slightly better than Model 1. The simplicity of Model 1 without any dummies,
however, makes it preferable to Model 2. Model 1 has McFadden's Adjusted R2
values of 0.514 and 0.512 for logistic regression and probit regression, respectively.
These values indicate very good model fits. We also tried models with higher powers
of independent variables and with interactions between independent variables, but
none of them performed better than Model 1 for either logistic regression or probit
regression.
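As a rough illustration of how Model 1 turns the three independent variables into a read probability, the sketch below plugs the Model 1 coefficients reported in Tables 2.5 and 2.6 into the logit and probit links. The coefficients are the rounded values from the tables, so the output is only indicative, and the input values (distance, pulses, neighboring meters) are hypothetical.

```python
import math
from statistics import NormalDist

# Model 1 coefficients as printed in Tables 2.5 and 2.6 (rounded):
# (constant, Shortest Distance, No of Pulses, No of Customers).
LOGIT = (2.852, -0.003, 0.060, -0.017)
PROBIT = (1.721, -0.002, 0.032, -0.013)

def linear_predictor(coef, distance_ft, pulses, customers):
    b0, b1, b2, b3 = coef
    return b0 + b1 * distance_ft + b2 * pulses + b3 * customers

def p_logit(distance_ft, pulses, customers):
    """Read probability under the logit link: inverse of ln(p / (1 - p))."""
    eta = linear_predictor(LOGIT, distance_ft, pulses, customers)
    return 1.0 / (1.0 + math.exp(-eta))

def p_probit(distance_ft, pulses, customers):
    """Read probability under the probit link: standard normal CDF."""
    eta = linear_predictor(PROBIT, distance_ft, pulses, customers)
    return NormalDist().cdf(eta)

# Hypothetical meter-segment pairs at increasing distance from the street segment.
for d in (200, 500, 1000, 2000):
    print(d, round(p_logit(d, 5.5, 33), 3), round(p_probit(d, 5.5, 33), 3))
```

Both links give probabilities that fall as the shortest distance grows, in line with the negative coefficient of Shortest Distance.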
2.8 Bayesian Updating
Every time the meter reading vehicle collects readings, it adds more data to
the previous readings. With more data, we expect that the estimates of the pij's
will be more accurate. Therefore, the routes generated by the two-stage IP will
be of higher quality. They will be better at capturing the uncertain signals, thereby
further reducing the number of missed reads.
There are two drawbacks to re-estimating the pij's by regression in every
time period as new data arrive. Suppose in time period 1 we observe the first set of
i.i.d. data denoted by y1. We run the regression on y1. In time period 2, we observe
a second set of i.i.d. data denoted by y2, independent of y1. We run the regression
on y1 and y2 together as a single data set, and so on. We are regressing on the older
data sets repeatedly which makes this process of estimation inefficient. Data sets
from different time periods are given equal weights in the regression. In practice,
utility companies need to use different weights for the data based on seasonality
and other factors. For example, during the summer season, meter reading data
from the previous summer is more important and accurate compared to the meter
reading data from the previous winter. Also, new obstacles may appear between
a meter and a street segment, new meters may appear in the vicinity of a meter
causing more interference, and a decrease in the battery level of an RFID tag will
reduce the signal range of the meter. All these factors make the most recent meter
reading data more accurate. Therefore, utility companies should be able to apply
different weights to parts of the data accordingly. If we estimate the pij's at every
time period with new data, i.e., every time the meter reading vehicle collects data
from traversing a route, using concepts from Bayesian statistics, then we can avoid
the two drawbacks faced while updating using regression.
Our Bayesian statistical approach is based on updating information using
Bayes' Law. Suppose we fully observe the data $Y$. This is now a fixed and given
quantity in the inferential process. Our Bayesian model is a (parametric) statistical
model for the observed data, $(Y, f(y|\theta))$, where $f(y|\theta)$ is the probability density
function given the parameter $\theta$, and a prior distribution on the parameters, $(\Theta, p(\theta))$,
where $\Theta$ is the finite dimensional parameter space. Under the assumption that the
data are independent and identically distributed (i.i.d.), the likelihood function is
$L(\theta|y_1, \ldots, y_n) = \prod_{i=1}^{n} f(y_i|\theta)$, which contains all the information for inference on
the unknown parameter $\theta$. Therefore, from Bayes' Law, we have
\[ \pi(\theta|y_1, \ldots, y_n) = \frac{p(\theta) L(\theta|y_1, \ldots, y_n)}{\int_{\Theta} p(\theta) L(\theta|y_1, \ldots, y_n)\, d\theta} \tag{2.12} \]
where $\int_{\Theta} p(\theta) L(\theta|y_1, \ldots, y_n)\, d\theta$ is the marginal likelihood, which is independent of
$\theta$, and $\pi(\theta|y_1, \ldots, y_n)$ is the posterior distribution of $\theta$ conditioning on the observed
data. Since the denominator in (2.12) does not contribute towards the inference
on $\theta$, we can omit it. Therefore, (2.12) can be reduced to
$\pi(\theta|y_1, \ldots, y_n) \propto p(\theta) L(\theta|y_1, \ldots, y_n)$, that is, the posterior distribution of the parameter of interest is
proportional to the prior distribution times the likelihood function.
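Bayesian updating works as follows:
\begin{align*}
\pi(\theta|y_1) &\propto p(\theta) L(\theta|y_1) \\
\pi(\theta|y_1, y_2) &\propto \pi(\theta|y_1) L(\theta|y_2) \\
&\;\;\vdots \\
\pi(\theta|y_1, \ldots, y_T) &\propto \pi(\theta|y_1, \ldots, y_{T-1}) L(\theta|y_T) \propto p(\theta) L(\theta|y_1, \ldots, y_T).
\end{align*}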
In time period 1, $p(\theta)$ is the prior information on $\theta$, and the posterior distribution
is $\pi(\theta|y_1)$. In time period 2, the prior is $\pi(\theta|y_1)$, which is the posterior from time
period 1, and so on. This process can be repeated and the model will continue
to update the posterior distributions as we collect new data. In time period $T$,
the posterior is $\pi(\theta|y_1, \ldots, y_T)$, which does not depend on the sequence in which
data arrive. This is exactly the same result that would have been obtained if all the
i.i.d. data $(y_1, \ldots, y_T)$ had been gathered at the same time. Bayesian updating is
much faster than regression since analysis is done only on the new incoming data
at each time period. Data from different time periods can be weighted differently
in Bayesian updating depending on the requirements of the utility companies. The
idea is to re-solve the two-stage IP at the end of each time period with the new
posterior distribution of the probabilities (pij) that is obtained and, thereby, have
an iterative algorithm to generate more robust routes at the end of each time period.
The unknown parameters in the Bayesian models are estimated using Markov Chain
Monte Carlo (MCMC) simulations.
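The order-independence of Bayesian updating is easiest to see in a conjugate setting. The sketch below deliberately swaps in a Beta prior on a single read probability with Bernoulli observations, which is much simpler than the logit and probit models used in this chapter, and shows that updating period by period gives the same posterior as processing all data at once. The data and names are hypothetical.

```python
def update(prior, reads):
    """Beta(a, b) prior plus Bernoulli reads (1 = read, 0 = missed) gives a Beta posterior."""
    a, b = prior
    hits = sum(reads)
    return (a + hits, b + len(reads) - hits)

prior = (1, 1)        # uniform prior on the read probability
y1 = [1, 0, 1, 1]     # hypothetical reads from time period 1
y2 = [0, 1, 1]        # hypothetical reads from time period 2

sequential = update(update(prior, y1), y2)   # posterior of period 1 is the prior of period 2
reordered = update(update(prior, y2), y1)    # same data, opposite arrival order
batch = update(prior, y1 + y2)               # all data gathered at the same time

a, b = sequential
print(sequential, reordered, batch, "posterior mean:", a / (a + b))
```

In each period only the new observations are processed, which is the efficiency argument made above; the earlier data enter only through the current posterior.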
2.8.1 Logit Model and Probit Model
Model 1 (three independent variables, no customer dummies, no street dum-
mies) for both logistic regression and probit regression can be represented by
\[ g(p_{ij}) = \bar{\beta}_1 + \bar{\beta}_2 \times \text{Shortest Distance}_{ij} + \bar{\beta}_3 \times \text{No of Pulses}_j + \bar{\beta}_4 \times \text{No of Customers}_i \]
where $g$ is the link function, $p_{ij} = E(\text{Read OR Not}_{ij})$, $\bar{\beta}_k = E(\beta_k)$, and $E(\cdot)$ denotes
the expected value. For the logit model, $g(p_{ij}) = \ln(p_{ij}/(1 - p_{ij}))$, and for the probit
model, $g(p_{ij}) = \Phi^{-1}(p_{ij})$, where $\Phi$ is the cumulative Normal (0, 1) distribution function.
We have to estimate the unknown parameter vector $\beta = (\beta_1, \beta_2, \beta_3, \beta_4)^{\top}$, which
is a 4-dimensional vector of coefficients. The data matrix $X$ is $(N \times 4)$-dimensional,
where $N$ is the size of the data set, and the entries in the first column of $X$ are
1's. $X_k$ denotes row $k$ of $X$. In time period 1, we do not have any prior information
about $\beta$. We rely on the information obtained from the data. Therefore, the
prior is set to a vague prior, i.e., the prior will have minimal effect on the posterior
distribution of time period 1.
Both models have their pros and cons. Error terms in logit models have a lo-
gistic distribution, whereas error terms in probit models have a normal distribution.
The logistic distribution has heavier tails compared to the normal distribution, so
logit models are more robust than probit models. Logit models have a better fit
to data that are more spread out in the tails. The normal distribution is the con-
jugate prior for the likelihood function in probit models. Therefore, the unknown
parameters in the probit model can be estimated using an exact algorithm. How-
ever, neither the normal distribution nor any other distribution from the exponential
family is the conjugate prior for the likelihood function in logit models. Therefore,
the unknown parameters in the logit model have to be estimated using a non-exact
algorithm.
2.8.2 Metropolis-Hastings Random Walk Algorithm for the Logit
Model
The Metropolis-Hastings (MH) Random Walk algorithm (Metropolis et al.
1953, Hastings 1970) is used to estimate the parameters of the logit model and has
four steps.
Step 1. Choose a starting value for $\beta$.
Step 2. The random walk chain $\beta^{new} = \beta^{old} + \epsilon$, where $\epsilon \sim$ Multivariate Normal
$(0, s^2 H^{-1})$, generates candidate realizations for $\beta$, $s = 2.3/\sqrt{\text{dimension}(\beta)}$
(Marin and Robert 2014, Press 2003), and $H$ is the Hessian of the log likelihood
function for the logit model.
Step 3. Accept $\beta^{new}$ with probability $\alpha = \min\{1, \pi(\beta^{new}|y, X)/\pi(\beta^{old}|y, X)\}$.
Step 4. Repeat Steps 2 and 3.
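The four steps can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used for the results: it uses a two-parameter logit model on synthetic data, a vague normal prior with variance 10,000, and, as a simplifying assumption, independent normal proposal steps with a hand-picked step size instead of the Hessian-scaled proposal of Step 2. All names and data are hypothetical.

```python
import math
import random

def log1pexp(eta):
    """Numerically stable log(1 + exp(eta))."""
    return eta + math.log1p(math.exp(-eta)) if eta > 0 else math.log1p(math.exp(eta))

def log_posterior(beta, X, y, prior_var=10_000.0):
    """Log of (vague normal prior) x (logit likelihood), up to an additive constant."""
    lp = -sum(b * b for b in beta) / (2.0 * prior_var)
    for row, yk in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, row))
        lp += yk * eta - log1pexp(eta)
    return lp

def mh_random_walk(X, y, start, step, n_iter, rng):
    beta, current = list(start), log_posterior(start, X, y)  # Step 1
    draws, accepted = [], 0
    for _ in range(n_iter):                                   # Step 4: repeat
        # Step 2: candidate = old value + normal noise (diagonal proposal, an assumption).
        cand = [b + rng.gauss(0.0, s) for b, s in zip(beta, step)]
        cand_lp = log_posterior(cand, X, y)
        # Step 3: accept with probability min(1, posterior ratio), on the log scale.
        if math.log(1.0 - rng.random()) < cand_lp - current:
            beta, current, accepted = cand, cand_lp, accepted + 1
        draws.append(list(beta))
    return draws, accepted / n_iter

# Synthetic data: intercept column plus one covariate, hypothetical true coefficients.
rng = random.Random(0)
true_beta = (-0.5, 1.5)
X = [(1.0, rng.uniform(-2.0, 2.0)) for _ in range(200)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(true_beta[0] + true_beta[1] * x1))) else 0
     for _, x1 in X]

draws, rate = mh_random_walk(X, y, start=(0.0, 0.0), step=(0.3, 0.3), n_iter=3000, rng=rng)
kept = draws[1000:]  # discard a burn-in period
post_mean = [sum(d[k] for d in kept) / len(kept) for k in range(2)]
print("acceptance rate:", round(rate, 2), "posterior means:", [round(v, 2) for v in post_mean])
```

Because the proposal is symmetric, the acceptance probability involves only the ratio of posterior densities, exactly as in Step 3.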
2.8.3 Gibbs Sampling Algorithm for the Probit Model
The Gibbs sampling algorithm (Albert and Chib 1993) is used to estimate the
parameters of the probit model. The setup for this algorithm is as follows. The prior
distribution is $p(\beta)$ = Multivariate Normal $(B_0, V_0)$. We have $y^*_k \sim$ Normal $(X_k \beta, 1)$
and $y_k$ = Indicator $(y^*_k > 0)$, where $y$ is the dependent variable and $y^*$ is the latent
variable. The distribution of $y^*$ is given by $p(y^*_k | y_k = 0, \beta)$ = Normal $(X_k \beta, 1) \times$
Indicator $(y^*_k \le 0)$ and $p(y^*_k | y_k = 1, \beta)$ = Normal $(X_k \beta, 1) \times$ Indicator $(y^*_k > 0)$.
Therefore, the posterior distribution is
\[ \pi(\beta | y^*) = \text{Multivariate Normal}\big((X^{\top} X + V_0^{-1})^{-1}(X^{\top} y^* + V_0^{-1} B_0),\ (X^{\top} X + V_0^{-1})^{-1}\big). \]
The Gibbs sampling algorithm for the probit model has four steps.
Step 1. Choose a starting value for $\beta$.
Step 2. Draw $[y^* | y, \beta]$.
Step 3. Draw $[\beta | y^*]$.
Step 4. Repeat Steps 2 and 3.
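The steps above can be sketched for the simplest possible case, a probit model with a single coefficient and no intercept, so that the multivariate normal posterior reduces to a univariate normal. The sketch is an illustration under that assumption: the synthetic data, the prior values, and the function names are hypothetical, and the truncated normal draws use inverse-CDF sampling with `statistics.NormalDist` from the Python standard library.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal

def draw_latent(mu, yk, rng):
    """Step 2: draw y* ~ Normal(mu, 1) truncated to (0, inf) if yk = 1, else (-inf, 0]."""
    cut = STD.cdf(-mu)                      # P(y* <= 0) when y* ~ Normal(mu, 1)
    lo, hi = (cut, 1.0) if yk == 1 else (0.0, cut)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)     # keep u inside the domain of inv_cdf
    return mu + STD.inv_cdf(u)

def gibbs_probit(x, y, b0, v0, n_iter, rng, start=0.0):
    beta = start                            # Step 1
    post_var = 1.0 / (sum(v * v for v in x) + 1.0 / v0)   # (x'x + 1/v0)^(-1)
    draws = []
    for _ in range(n_iter):                 # Step 4: repeat
        latent = [draw_latent(beta * xv, yk, rng) for xv, yk in zip(x, y)]
        # Step 3: beta | y* is normal with the scalar version of the posterior mean.
        post_mean = post_var * (sum(xv * ys for xv, ys in zip(x, latent)) + b0 / v0)
        beta = rng.gauss(post_mean, post_var ** 0.5)
        draws.append(beta)
    return draws

# Synthetic data generated from a probit model with a hypothetical coefficient of 1.0.
rng = random.Random(1)
x = [rng.uniform(-2.0, 2.0) for _ in range(200)]
y = [1 if 1.0 * xv + rng.gauss(0.0, 1.0) > 0 else 0 for xv in x]

draws = gibbs_probit(x, y, b0=0.0, v0=10_000.0, n_iter=1500, rng=rng)
kept = draws[500:]  # discard a burn-in period
post_mean = sum(kept) / len(kept)
print("posterior mean of beta:", round(post_mean, 2))
```

Every draw is accepted, unlike in the MH random walk, because each step samples from an exact conditional distribution.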
2.8.4 Hierarchical Probit Model
For our meter reading problem, hierarchical models consider the signal trans-
mission behavior of individual meters and their interactions with the signals from
other meters. Bayesian updating for the hierarchical probit model is a more complex
method for estimating the pij's, but the estimates are more accurate compared
to Bayesian updating for the logit model and the probit model. The hierarchical
probit model accounts for the uncertain behavior of each meter separately while also
accounting for the similarity between meters.
The rationale behind using a hierarchical model for updating the probability
estimates is that each meter is inherently different from every other meter. Some
meters are surrounded by physical obstacles, some are on elevated ground, and some
are old. New meters have better technology. The meters have different stages of
battery life. As the battery level of a meter drops below a certain threshold, the
signal transmission range decreases. All of these factors affect the signal transmission
behavior of the meters.
Let n be the number of meters and m be the number of street segments
(counting repetitions) traversed by the meter reading vehicle. Group the observa-
tions Read OR Notij into n buckets, where bucket i contains observations on meter
i. Each bucket contains m observations with one from each traversed street segment.
The probit model for each group i is called the lower level model for meter i and is
given by
\[ \Phi^{-1}(p_{ij}) = \bar{\beta}_{i,1} + \bar{\beta}_{i,2} \times \text{Shortest Distance}_{ij} + \bar{\beta}_{i,3} \times \text{No of Pulses}_j, \]
where $\bar{\beta}_{i,k} = E(\beta_{i,k})$ and $E(\cdot)$ denotes the expected value. The lower level unknown
parameter vector $\beta_i = (\beta_{i,1}, \beta_{i,2}, \beta_{i,3})^{\top}$ for each group $i$ is a 3-dimensional vector of
coefficients. These vectors are used as dependent variables in a multivariate linear model called
the higher level model. The data matrices $X_i$ for each group $i$ are $(m \times 3)$-dimensional
and the entries in the first column of each matrix are 1's. The multivariate linear
model is $B = Z\Delta + \epsilon$, where $\epsilon$ is the error term. $B = (\beta_1^{\top}, \ldots, \beta_n^{\top})^{\top}$ is an
$(n \times 3)$-dimensional matrix of the dependent variables. $Z = (z_1^{\top}, \ldots, z_n^{\top})^{\top}$ is an
$(n \times 2)$-dimensional matrix of the independent variables, where
$z_i = (1, \text{No of Customers}_i)^{\top}$ for each $i$ is a 2-dimensional vector. The higher level
unknown parameter matrix
\[ \Delta = \begin{pmatrix} \delta_{1,1} & \delta_{2,1} & \delta_{3,1} \\ \delta_{1,2} & \delta_{2,2} & \delta_{3,2} \end{pmatrix} \]
is a $(2 \times 3)$-dimensional matrix of coefficients from the multivariate
linear model. For each $i$, the multivariate linear model can be written as
$\beta_i = \Delta^{\top} z_i + \epsilon_i$, where $\epsilon_i \sim$ Multivariate Normal $(0, \Sigma)$ is the error term for the group
$i$. We have to estimate the lower level parameters $\beta_i$ for each meter $i$ and the
higher level parameters $\Delta$ and $\Sigma$. In time period 1, we do not have any prior
information about the parameters, so we rely on the information obtained from the
data. Therefore, the priors are set to vague priors.
2.8.5 Gibbs Sampling Algorithm for the Hierarchical Probit Model
The Gibbs sampling algorithm is used to estimate the parameters of the hierarchical
probit model. The setup for this algorithm is as follows. The prior distributions for
the lower level parameters are $p(\beta_i) =$ Multivariate Normal $(\Delta^\top z_i, \Sigma)$
for each $i$. The prior distributions for the higher level parameters are
$p(\mathrm{vec}[\Delta^\top]) =$ Multivariate Normal $(M_0, V_0)$ and $p(\Sigma) =$
Inverse-Wishart $(c_0, D_0)$, where vec denotes the operator that transforms a matrix into
a vector by concatenating columns. We have $y_i^* \sim$ Multivariate Normal $(X_i\beta_i, I)$,
where $I$ is the identity matrix, and $y_{ij} =$ Indicator $(y_{ij}^* > 0)$, where
$y_i = (y_{i1}, \ldots, y_{im})^\top$ is the dependent variable vector and
$y_i^* = (y_{i1}^*, \ldots, y_{im}^*)^\top$ is the latent variable vector for group $i$.
Therefore, the posterior distributions for the lower level parameters are
$$\pi(\beta_i \mid y_i^*, \Delta, \Sigma) = \text{Multivariate Normal}\left((X_i^\top X_i + \Sigma^{-1})^{-1}(X_i^\top y_i^* + \Sigma^{-1}\Delta^\top z_i),\ (X_i^\top X_i + \Sigma^{-1})^{-1}\right)$$
for each $i$, and the posterior distributions for the higher level parameters are
$$\pi(\mathrm{vec}[\Delta^\top] \mid \{y_i^*\}, \Sigma, \{\beta_i\}) = \text{Multivariate Normal}\left((Z^\top Z \otimes \Sigma^{-1} + V_0^{-1})^{-1}\left((Z^\top \otimes \Sigma^{-1})\,\mathrm{vec}[B^\top] + V_0^{-1} M_0\right),\ (Z^\top Z \otimes \Sigma^{-1} + V_0^{-1})^{-1}\right)$$
and
$$\pi(\Sigma \mid \{y_i^*\}, \Delta, \{\beta_i\}) = \text{Inverse-Wishart}\left(c_0 + n,\ D_0 + (B - Z\Delta)^\top (B - Z\Delta)\right).$$
The Gibbs sampling algorithm for the hierarchical probit model has six steps.
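Step 1. Choose starting values for the $\beta_i$'s, $\Delta$, and $\Sigma$.
Step 2. Draw $[y_i^* \mid y_i, \beta_i]$ for each group $i$.
Step 3. Draw $[\beta_i \mid y_i^*, \Delta, \Sigma]$ for each group $i$.
Step 4. Draw $[\Delta \mid \{y_i^*\}, \Sigma, \{\beta_i\}]$.
Step 5. Draw $[\Sigma \mid \{y_i^*\}, \Delta, \{\beta_i\}]$.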
Step 6. Repeat Steps 2 to 5.
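
Step 2 is the only step that does not draw from a standard multivariate distribution, so a small illustration may help. The sketch below draws one latent value from a unit-variance normal distribution truncated to the positive half-line when the meter is read (y = 1) and to the non-positive half-line when it is missed (y = 0), using the inverse-CDF method. The function name `draw_latent` and the clamping constant are assumptions made for this illustration and are not taken from the dissertation.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def draw_latent(y, mean, rng):
    """Draw y* ~ Normal(mean, 1), truncated to (0, inf) if y == 1 and to (-inf, 0] if y == 0."""
    # Probability that Normal(mean, 1) falls at or below the truncation point 0.
    p0 = STD_NORMAL.cdf(-mean)
    u = rng.random()
    if y == 1:
        prob = p0 + u * (1.0 - p0)  # uniform on the upper part of the CDF range
    else:
        prob = u * p0               # uniform on the lower part of the CDF range
    # inv_cdf requires a probability strictly between 0 and 1 (assumed guard;
    # for extreme means this clamp slightly distorts the truncation).
    prob = min(max(prob, 1e-12), 1.0 - 1e-12)
    return mean + STD_NORMAL.inv_cdf(prob)


rng = random.Random(0)
# Hypothetical linear predictor x_ij' beta_i = 0.4 for a read (y = 1) and a missed read (y = 0).
latent_read = draw_latent(1, 0.4, rng)
latent_miss = draw_latent(0, 0.4, rng)
```

In a full sampler, this draw would be repeated for every observation of every group before Steps 3 to 5 are carried out.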
2.9 Bayesian Updating Results
2.9.1 Logit Model and Probit Model Results
The size of the data set used for estimating the parameters of the logit model
and the probit model is $N = 474 \times 7$ (= 3,318). To verify that our choice of
the prior distribution on the unknown parameters for both the logit model and
the probit model does not have much effect on the posterior distribution, we also
estimate the parameters of both models using regression. If the parameter values
for the regression and the Bayesian estimation match, this indicates that our choice
of the prior for the Bayesian estimation fulfills our requirement of a vague prior.
In subsequent time periods, when new data are gathered, the parameters can be
updated using the Bayesian updating algorithms.
In the MH random walk algorithm for the logit model, the prior for $\beta$ is
set to a multivariate normal distribution with a zero mean vector, variances of
10,000, and zero covariances. The first 5,000 samples are treated as the burn-in
period, and the next 10,000 samples are collected for analysis. The starting value
for $\beta$ is set to the maximum likelihood estimator for the likelihood function of
the logit model. The acceptance rate for the new values of $\beta$ generated from the
Markov chain is around 33% (the acceptance rate should be between 30% and 35% for an
optimal combination of exploration and exploitation steps).
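
The following is a minimal sketch of a random-walk MH sampler for a logit model with independent Normal(0, 10,000) priors, as described above. It is not the implementation used in this chapter: the synthetic data, the proposal step sizes, the shortened burn-in and sample counts, and the zero starting value (instead of the maximum likelihood estimator) are all assumptions chosen to keep the example small.

```python
import math
import random


def log_posterior(beta, X, y, prior_var=10_000.0):
    """Log posterior (up to a constant): logit log-likelihood plus Normal(0, prior_var) priors."""
    lp = -sum(b * b for b in beta) / (2.0 * prior_var)
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        # y*eta - log(1 + exp(eta)), evaluated in a numerically stable way.
        lp += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return lp


def mh_random_walk(X, y, start, step, n_burn, n_keep, rng):
    """Return the retained samples and the overall acceptance rate."""
    beta = list(start)
    current = log_posterior(beta, X, y)
    samples, accepted = [], 0
    for it in range(n_burn + n_keep):
        # Symmetric Gaussian random-walk proposal around the current value.
        proposal = [b + rng.gauss(0.0, s) for b, s in zip(beta, step)]
        candidate = log_posterior(proposal, X, y)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < candidate - current:
            beta, current = proposal, candidate
            accepted += 1
        if it >= n_burn:
            samples.append(beta)
    return samples, accepted / (n_burn + n_keep)


# Synthetic data from a hypothetical logit model with intercept 0.5 and slope -1.0.
rng = random.Random(1)
X = [[1.0, rng.uniform(-2.0, 2.0)] for _ in range(300)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(0.5 - 1.0 * x[1]))) else 0 for x in X]

samples, rate = mh_random_walk(X, y, start=[0.0, 0.0], step=[0.2, 0.2],
                               n_burn=500, n_keep=2000, rng=rng)
post_mean = [sum(s[k] for s in samples) / len(samples) for k in range(2)]
```

In practice, the step sizes would be tuned until the acceptance rate falls in the desired range.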
In the Gibbs sampling algorithm for the probit model, in order to set the prior
for $\beta$, $B_0$ is set to the zero vector and $V_0$ is set to the diagonal matrix with
diagonal entries of 10,000. We collected 10,000 samples for analysis. The starting
value for $\beta$ is set to the zero vector.
In Tables 2.7 and 2.8, we give the logit model results and the probit model
results, respectively. The mean and standard deviation of the $\beta_i$'s from the MH
random walk algorithm and logistic regression, and the Gibbs sampling algorithm
and probit regression are presented. Since, for each $i$, the $\beta_i$ values match for
both the logistic regression and the MH random walk algorithm, and for both the probit
regression and the Gibbs sampling algorithm, our choice of the prior in both algorithms
serves the purpose of a vague prior. This indicates that we can perform Bayesian
updating for the logit model and the probit model after receiving new data points
instead of using logistic regression or probit regression, respectively.
Coefficient                      Logistic Regression    MH Random Walk Algorithm
Intercept ($\beta_1$)             2.852                  2.860
                                 (0.258)                (0.267)
Shortest Distance ($\beta_2$)    −0.003                 −0.003
                                 (0.0002)               (0.0001)
No of Pulses ($\beta_3$)          0.060                  0.060
                                 (0.010)                (0.009)
No of Customers ($\beta_4$)      −0.017                 −0.018
                                 (0.005)                (0.005)
Table 2.7: Logit model results.
Coefficient                      Probit Regression      Gibbs Sampling Algorithm
Intercept ($\beta_1$)             1.721                  1.733
                                 (0.147)                (0.166)
Shortest Distance ($\beta_2$)    −0.002                 −0.002
                                 (0.0001)               (0.0001)
No of Pulses ($\beta_3$)          0.032                  0.032
                                 (0.005)                (0.005)
No of Customers ($\beta_4$)      −0.013                 −0.013
                                 (0.003)                (0.003)
Table 2.8: Probit model results.
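
To show how the estimates in Tables 2.7 and 2.8 translate into read probabilities, the sketch below evaluates the fitted logit and probit equations for one meter and street segment pair. The coefficient values are the regression estimates from the two tables. The input values (distance, number of pulses, number of customers) are hypothetical and are expressed in whatever units the data set uses, and the function names are chosen only for this illustration.

```python
import math
from statistics import NormalDist

# Regression estimates copied from Tables 2.7 and 2.8, in the order
# (intercept, shortest distance, no. of pulses, no. of customers).
LOGIT_BETA = (2.852, -0.003, 0.060, -0.017)
PROBIT_BETA = (1.721, -0.002, 0.032, -0.013)


def linear_predictor(beta, distance, pulses, customers):
    """Linear combination of the covariates with the fitted coefficients."""
    return beta[0] + beta[1] * distance + beta[2] * pulses + beta[3] * customers


def logit_probability(distance, pulses, customers):
    """Read probability under the logit model: inverse logit of the linear predictor."""
    eta = linear_predictor(LOGIT_BETA, distance, pulses, customers)
    return 1.0 / (1.0 + math.exp(-eta))


def probit_probability(distance, pulses, customers):
    """Read probability under the probit model: standard normal CDF of the linear predictor."""
    eta = linear_predictor(PROBIT_BETA, distance, pulses, customers)
    return NormalDist().cdf(eta)


# Hypothetical inputs for a single meter and street segment pair.
p_logit = logit_probability(distance=500.0, pulses=10, customers=20)
p_probit = probit_probability(distance=500.0, pulses=10, customers=20)
```

Because the distance coefficients are negative in both models, both functions return smaller probabilities as the shortest distance grows.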
In Figures 2.7, 2.8, and 2.9, we give the density plots, trace plots, and
autocorrelation plots, respectively, for the MH random walk algorithm for the logit
model. These plots indicate that the Markov chain for this non-exact algorithm
has converged without any significant autocorrelation and the resulting posterior
distributions for the $\beta_i$'s are normal distributions.
Figure 2.7: Density plots from the MH random walk algorithm for the logit model.
Figure 2.8: Trace plots from the MH random walk algorithm for the logit model.
Figure 2.9: Autocorrelation plots from the MH random walk algorithm for the logit
model.
2.9.2 Hierarchical Probit Model Results
The size of the data sets used for estimating the lower level parameters in the
probit models for each meter i is m = 7. The size of the data set used for estimating
the higher level parameters in the multivariate linear model is n = 474.
In the Gibbs sampling algorithm for the hierarchical probit model, in order
to set the priors for the $\beta_i$'s, $\Delta$, and $\Sigma$, $M_0$ is set to the zero vector
and $V_0$ is set to the diagonal matrix with diagonal entries of 1,000. We set $c_0$ to
seven and $D_0$ to the diagonal matrix with diagonal entries of three (Rossi et al. 2005,
Press 2003). We collected 10,000 samples for analysis. The starting values for the
$\beta_i$'s and the starting value for $\Delta$ are set to the zero vector. The starting value
for $\Sigma$ is sampled from Inverse-Wishart $(c_0 + n, D_0)$.
Coefficient         Gibbs Sampling Algorithm
$\delta_{1,1}$       11.685
                    (4.008)
$\delta_{2,1}$      −0.010
                    (0.009)
$\delta_{3,1}$      −0.101
                    (0.090)
$\delta_{1,2}$       0.013
                    (0.061)
$\delta_{2,2}$      −0.0002
                    (0.0002)
$\delta_{3,2}$       0.005
                    (0.002)
Table 2.9: Hierarchical probit model results for the higher level parameter matrix.
In Table 2.9, we give the hierarchical probit model results for $\Delta$. The mean and
standard deviation of the $\delta_{i,j}$'s from the Gibbs sampling algorithm are presented.
Using these values, the $\beta_i$'s are calculated for each meter $i$.
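
As a rough illustration of the last step, the sketch below evaluates the mean of the higher level model for one meter, that is, the higher level coefficient matrix (named `DELTA` in the code) transposed and multiplied by the vector (1, No of Customers). The matrix entries are the posterior means from Table 2.9. The customer count in the usage line is hypothetical, and the calculation deliberately ignores the group-level error term, so it is a simplification of how the lower level parameters are actually obtained from the sampler.

```python
# Posterior means of the higher level coefficients from Table 2.9.
# The first row multiplies the constant 1 and the second row multiplies No of Customers;
# the columns correspond to the Intercept, Shortest Distance, and No of Pulses coefficients.
DELTA = (
    (11.685, -0.010, -0.101),
    (0.013, -0.0002, 0.005),
)


def expected_beta(no_of_customers):
    """Mean of the lower level coefficient vector implied by the higher level model."""
    z = (1.0, float(no_of_customers))
    return tuple(sum(DELTA[k][j] * z[k] for k in range(2)) for j in range(3))


# Hypothetical meter whose group has 10 customers.
beta_mean = expected_beta(10)
```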
In Figure 2.10, we show the histograms of the means of the $\beta_i$'s from the Gibbs
sampling algorithm for the hierarchical probit model. The three histograms belong
to the coefficients of the Intercept ($\bar{\beta}_{i,1}$), the Shortest Distance
($\bar{\beta}_{i,2}$), and the No of Pulses ($\bar{\beta}_{i,3}$), respectively, for the
lower level probit model in the hierarchical probit model. The histograms show the
variation in the coefficients across meters that is not captured in the logit model or
the probit model. All of the meters have the same equation for estimating the $p_{ij}$'s
for the logit model and the probit model. The hierarchical probit model gives
individualized probability predictions $p_{ij}$ for each meter $i$. Thus, for the
hierarchical probit model, each meter has its unique equation for estimating the $p_{ij}$'s.
Figure 2.10: Histograms of the means of the lower level parameters from the Gibbs
sampling algorithm for the hierarchical probit model.
Total number of meters in the service location layer                      6,067
Number of meters in the service location layer that are read              5,720
Total number of read events                                               337,870
Total number of street segments in the network                            1,575
Number of street segments traversed in the route                          578
Number of street segments traversed in the route counting repetitions     829
Total number of nodes in the network                                      1,072
Duration of the route (hours)                                             6
Time gap between consecutive signal transmissions (sec)                   3
Table 2.10: Summary of the CNG data set.
2.10 Description of the CNG Data Set
A larger data set was gathered during the first half of 2016 by ITRON and
provided by RouteSmart Technologies. This data set gives meter locations and
reading data serviced by Connecticut Natural Gas (CNG) in Hartford, Connecticut.
A summary of the data set is given in Table 2.10.
From the data, we make the following observations. There are many account
identifiers in the reading events file that have no corresponding entry in the service
location file, i.e., the RFID readers are picking up signals from nearby RFID tags
that do not require reading by the utility company. There are a total of 337,870
read events from a route that took six hours and traversed 829 street segments
(counting repetitions). Many of those read events are from unwanted RFID tags.
The read events data have a many-to-one relationship to a service location account
identifier, i.e., some of the meters are read more than once by the meter reading
device. Out of the 6,067 meters in our data set, 347 meters are missed. The vehicle
location is tracked every second. However, when the read events for a single meter
are recorded, they do not occur every second along a street segment that seems to
be within range. Rather, there is generally a regular time gap between occurrences
of read events for the same meter. This confirms the fact that the signal transmitted
by an RFID tag is at regular time intervals and is not occurring continuously. Some
meters that are very close to the vehicle route have been missed, probably due to a
discontinuous signal. There is a time gap of three seconds between successive signal
transmissions from the RFID tags in our data set. Missed reads can also be due to
the variability of the range of a meter to transmit a signal.
Coefficient                      Logistic Regression    MH Random Walk Algorithm
Intercept ($\beta_1$)            −1.242***              −1.242
                                 (0.010)                (0.010)
Shortest Distance ($\beta_2$)    −0.003***              −0.003
                                 (0.000009)             (0.000009)
No of Pulses ($\beta_3$)          0.019***               0.019
                                 (0.00008)              (0.00008)
No of Customers ($\beta_4$)      −0.003***              −0.003
                                 (0.00005)              (0.00005)
McFadden's Adjusted $R^2$         0.223
***p<0.01
Table 2.11: Logit model results for the CNG data set.
2.11 Bayesian Updating Results for the CNG Data Set
2.11.1 Logit Model and Probit Model Results
The size of the data set used for estimating the parameters of the logit model
and the probit model is $N = 6{,}067 \times 829$ ($\approx$ 5 million).
In Table 2.11, we give the logit model results. The mean and standard deviation
of the $\beta_i$'s from the MH random walk algorithm and logistic regression are
presented. Since, for each $i$, the $\beta_i$ values match for both the logistic regression
and the MH random walk algorithm, our choice of the prior in the MH random walk
algorithm serves the purpose of a vague prior. The McFadden's Adjusted $R^2$ value of
0.223 for the logistic regression indicates very good model fit. Also, all coefficients of
the regression model are significant at the 1% level, and the signs of the coefficients
of the three independent variables are what we expected. This indicates that we
can perform Bayesian updating for the logit model after receiving new data points
instead of using logistic regression.
In Figures 2.11, 2.12, and 2.13, we give the density plots, trace plots, and
autocorrelation plots, respectively, for the MH random walk algorithm for the logit
model. These plots indicate that the Markov chain for this non-exact algorithm
has converged without any significant autocorrelation and the resulting posterior
distributions for the $\beta_i$'s are normal distributions.
Figure 2.11: Density plots from the MH random walk algorithm for the logit model
for the CNG data set.
Figure 2.12: Trace plots from the MH random walk algorithm for the logit model
for the CNG data set.
Figure 2.13: Autocorrelation plots from the MH random walk algorithm for the logit
model for the CNG data set.
In Table 2.12, we give the probit model results. The mean and standard
deviation of the $\beta_i$'s from the Gibbs sampling algorithm and probit regression are
presented. Since, for each $i$, the $\beta_i$ values match for both the probit regression and
the Gibbs sampling algorithm, our choice of the prior in the Gibbs sampling algorithm
serves the purpose of a vague prior. The McFadden's Adjusted $R^2$ value of 0.200 for
the probit regression indicates very good model fit. Also, all of the coefficients of
the regression model are significant at the 1% level, and the signs of the coefficients
of the three independent variables are what we expected. This indicates that we
can perform Bayesian updating for the probit model after receiving new data points
instead of using probit regression.
Coefficient                      Probit Regression      Gibbs Sampling Algorithm
Intercept ($\beta_1$)            −1.000***              −1.024
                                 (0.004)                (0.091)
Shortest Distance ($\beta_2$)    −0.001***              −0.001
                                 (0.000004)             (0.000105)
No of Pulses ($\beta_3$)          0.007***               0.007
                                 (0.00003)              (0.00069)
No of Customers ($\beta_4$)      −0.002***              −0.002
                                 (0.00002)              (0.00011)
McFadden's Adjusted $R^2$         0.200
***p<0.01
Table 2.12: Probit model results for the CNG data set.
Figure 2.14: Histograms of the means of the lower level parameters from the Gibbs
sampling algorithm for the hierarchical probit model for the CNG data set.
2.11.2 Hierarchical Probit Model Results
The size of the data sets used for estimating the lower level parameters in
the probit models for each meter $i$ is $m = 829$. The size of the data set used for
estimating the higher level parameters in the multivariate linear model is $n = 6{,}067$.
In Table 2.13, we give the hierarchical probit model results for $\Delta$. The mean
and standard deviation of the $\delta_{i,j}$'s from the Gibbs sampling algorithm are presented.
In Figure 2.14, we show the histograms of the means of the $\beta_i$'s from the Gibbs
sampling algorithm for the hierarchical probit model. The three histograms belong
to the coefficients of the Intercept ($\bar{\beta}_{i,1}$), the Shortest Distance
($\bar{\beta}_{i,2}$), and the No of Pulses ($\bar{\beta}_{i,3}$), respectively, for the
lower level probit model in the hierarchical probit model. The histograms show the
variation in the coefficients across meters that is not captured in the logit model or
the probit model. The hierarchical probit model gives individualized probability
predictions $p_{ij}$ for each meter $i$. Thus, for the hierarchical probit model, each
meter has its unique equation for estimating the $p_{ij}$'s.
Coefficient         Gibbs Sampling Algorithm
$\delta_{1,1}$      −0.890
                    (0.241)
$\delta_{2,1}$      −0.002
                    (0.0009)
$\delta_{3,1}$       0.004
                    (0.002)
$\delta_{1,2}$      −0.0002
                    (0.0001)
$\delta_{2,2}$      −0.000003
                    (0.000004)
$\delta_{3,2}$       0.0000006
                    (0.000005)
Table 2.13: Hierarchical probit model results for the higher level parameter matrix
for the CNG data set.
2.12 Discussion of Different Formulations and Computational Experiments of the Stage 1 IP
We discuss different formulations of the Stage 1 IP. The goal is to develop a
formulation that is able to obtain the optimal value within a reasonable amount
of time for a large data set (several thousand meters and a few thousand street
segments).
2.12.1 Alternative Formulations
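We have already discussed that the Stage 1 IP can be linearized in the decision
variables. The formulation of the linear Stage 1 IP (denoted by L) is given by the
following.
$$
\begin{aligned}
\text{(L)} \quad \min \ & \sum_{j \in E \cup A} c_j x_j && (2.13) \\
\text{s.t.} \ & \sum_{j \in E \cup A} a_{ij} x_j \ge b_i \quad \forall i \in I && (2.14) \\
& x_j \in \{0, 1\} \quad \forall j \in E \cup A && (2.15)
\end{aligned}
$$
where $a_{ij} = -\log(1 - p_{ij}) \ge 0$ for all $i, j$ and $b_i = -\log(1 - L_i) \ge 0$ for all $i$.
Under an affine transformation of the decision variables in L, the linear Stage
1 IP can be expressed using 0-1 knapsack constraints. Let P denote the transformed
linear Stage 1 IP and the formulation is given by the following.
$$
\begin{aligned}
\text{(P)} \quad \min \ & \sum_{j \in E \cup A} -c_j \bar{x}_j + \sum_{j \in E \cup A} c_j && (2.16) \\
\text{s.t.} \ & \sum_{j \in E \cup A} a_{ij} \bar{x}_j \le \bar{b}_i \quad \forall i \in I && (2.17) \\
& \bar{x}_j \in \{0, 1\} \quad \forall j \in E \cup A && (2.18)
\end{aligned}
$$
where $\bar{x}_j = 1 - x_j$ for all $j$ and $\bar{b}_i = \sum_{j \in E \cup A} a_{ij} - b_i$ for all $i$.
We assume that the $a_{ij}$'s and $\bar{b}_i$'s are positive.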
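
The relationship between the covering constraint in L and the knapsack constraint in P can be checked numerically on a tiny instance. The sketch below enumerates all 0-1 vectors for a single meter and compares three conditions: the probability of at least one read being at least the specified likelihood, the linear covering constraint, and the transformed knapsack constraint. The nonlinear condition assumes that reads on different street segments are independent, which is our reading of the linearization rather than a statement taken from this section, and the probabilities in `P_ROW` are hypothetical.

```python
import math
from itertools import product


def linear_coefficients(p_row, likelihood):
    """Return (a_i, b_i) with a_ij = -log(1 - p_ij) and b_i = -log(1 - L_i)."""
    a = [-math.log(1.0 - p) for p in p_row]
    b = -math.log(1.0 - likelihood)
    return a, b


def read_probability(p_row, x):
    """Probability of at least one read on the selected segments (independence assumed)."""
    miss = 1.0
    for p, xj in zip(p_row, x):
        if xj:
            miss *= 1.0 - p
    return 1.0 - miss


def compare_formulations(p_row, likelihood):
    """For every 0-1 vector x, record whether the three forms of the constraint agree."""
    a, b = linear_coefficients(p_row, likelihood)
    b_bar = sum(a) - b
    agree = []
    for x in product((0, 1), repeat=len(p_row)):
        nonlinear = read_probability(p_row, x) >= likelihood
        cover = sum(aj * xj for aj, xj in zip(a, x)) >= b                # form used in L
        knapsack = sum(aj * (1 - xj) for aj, xj in zip(a, x)) <= b_bar   # form used in P
        agree.append(nonlinear == cover == knapsack)
    return agree


# Hypothetical read probabilities for one meter on four street segments.
P_ROW = [0.60, 0.75, 0.55, 0.90]
AGREE = compare_formulations(P_ROW, 0.95)
```

The probabilities were chosen so that no selection sits exactly on the 0.95 boundary, which keeps the floating-point comparisons unambiguous.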
Different types of valid inequalities are generated from the 0-1 knapsack con-
straint in P and can be added to the formulation P to create stronger formulations.
We discuss these in the next five subsections.
2.12.1.1 Coefficient Round-Down Inequalities
The $a_{ij}$'s and $\bar{b}_i$'s are rounded down to generate valid inequalities. For each
constraint $\sum_{j \in E \cup A} a_{ij}\bar{x}_j \le \bar{b}_i$, a new constraint
$\sum_{j \in E \cup A} \lfloor a_{ij} \rfloor \bar{x}_j \le \lfloor \bar{b}_i \rfloor$ is generated.
We add these coefficient round-down constraints to P and denote the formulation
by PR.
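
A short argument for why rounding down preserves validity may be useful here. It relies only on the coefficients being non-negative and the variables being binary: the rounded-down left-hand side is bounded by the original one, and because it is an integer, it cannot exceed the integer part of the right-hand side.

```latex
\sum_{j \in E \cup A} \lfloor a_{ij} \rfloor \, \bar{x}_j
  \;\le\; \sum_{j \in E \cup A} a_{ij} \, \bar{x}_j
  \;\le\; \bar{b}_i
\quad\Longrightarrow\quad
\sum_{j \in E \cup A} \lfloor a_{ij} \rfloor \, \bar{x}_j \;\le\; \lfloor \bar{b}_i \rfloor ,
\qquad \text{since the left-hand side is an integer.}
```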
2.12.1.2 Increasing Coefficient Extended Cover Inequalities
Without loss of generality, we assume that, for each $i \in I$, $a_{i1} \le \ldots \le a_{i|E \cup A|}$.
For each constraint $\sum_{j \in E \cup A} a_{ij}\bar{x}_j \le \bar{b}_i$, a new constraint
$\sum_{j=1}^{k} \bar{x}_j \le k - 1$ is generated, where $k$ is such that
$\sum_{j=1}^{k} a_{ij} > \bar{b}_i$ but $\sum_{j=1}^{k-1} a_{ij} \le \bar{b}_i$. These are then
extended to form extended cover inequalities. We add these increasing coefficient extended
cover inequalities to P and denote the formulation by PDI.
To illustrate, consider the 0-1 knapsack constraint given by
$1x_1 + 2x_2 + 3x_3 + 4x_4 + 5x_5 + 6x_6 + 7x_7 + 8x_8 + 9x_9 + 10x_{10} \le 20$. The generated
valid inequality is $x_1 + x_2 + x_3 + x_4 + x_5 + x_6 \le 5$, which is then extended to
$x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 + x_{10} \le 5$.
2.12.1.3 Decreasing Coefficient Extended Cover Inequalities
Without loss of generality, we assume that, for each $i \in I$, $a_{i|E \cup A|} \le \ldots \le a_{i1}$.
For each constraint $\sum_{j \in E \cup A} a_{ij}\bar{x}_j \le \bar{b}_i$, a new constraint
$\sum_{j=1}^{k} \bar{x}_j \le k - 1$ is generated, where $k$ is such that
$\sum_{j=1}^{k} a_{ij} > \bar{b}_i$ but $\sum_{j=1}^{k-1} a_{ij} \le \bar{b}_i$. These are extended
cover inequalities. We add these decreasing coefficient extended cover inequalities to P
and denote the formulation by PCI.
To illustrate, consider the 0-1 knapsack constraint given by
$10x_1 + 9x_2 + 8x_3 + 7x_4 + 6x_5 + 5x_6 + 4x_7 + 3x_8 + 2x_9 + 1x_{10} \le 20$. The generated
valid inequality is $x_1 + x_2 + x_3 \le 2$.
2.12.1.4 Middle Coefficient Extended Cover Inequalities
Without loss of generality, we assume that, for each $i \in I$,
$a_{i(|E \cup A|-1)} \le a_{i(|E \cup A|-3)} \le \ldots \le a_{i1} \le a_{i2} \le a_{i4} \le \ldots \le a_{i|E \cup A|}$,
when $|E \cup A|$ is even, and
$a_{i(|E \cup A|-1)} \le a_{i(|E \cup A|-3)} \le \ldots \le a_{i2} \le a_{i1} \le a_{i3} \le \ldots \le a_{i|E \cup A|}$,
when $|E \cup A|$ is odd. For each constraint $\sum_{j \in E \cup A} a_{ij}\bar{x}_j \le \bar{b}_i$,
a new constraint $\sum_{j=1}^{k} \bar{x}_j \le k - 1$ is generated, where $k$ is such that
$\sum_{j=1}^{k} a_{ij} > \bar{b}_i$ but $\sum_{j=1}^{k-1} a_{ij} \le \bar{b}_i$. These are then extended
to form extended cover inequalities. We add these middle coefficient extended cover
inequalities to P and denote the formulation by PMI.
To illustrate, consider the 0-1 knapsack constraint given by
$5x_1 + 6x_2 + 4x_3 + 7x_4 + 3x_5 + 8x_6 + 2x_7 + 9x_8 + 1x_9 + 10x_{10} \le 20$. The generated
valid inequality is $x_1 + x_2 + x_3 + x_4 \le 3$, which is then extended to
$x_1 + x_2 + x_3 + x_4 + x_6 + x_8 + x_{10} \le 3$.
2.12.1.5 Extreme Coefficient Extended Cover Inequalities
Without loss of generality, we assume that, for each $i \in I$,
$a_{i2} \le a_{i4} \le \ldots \le a_{i|E \cup A|} \le a_{i(|E \cup A|-1)} \le a_{i(|E \cup A|-3)} \le \ldots \le a_{i1}$,
when $|E \cup A|$ is even, and
$a_{i2} \le a_{i4} \le \ldots \le a_{i(|E \cup A|-1)} \le a_{i|E \cup A|} \le a_{i(|E \cup A|-2)} \le \ldots \le a_{i1}$,
when $|E \cup A|$ is odd. For each constraint $\sum_{j \in E \cup A} a_{ij}\bar{x}_j \le \bar{b}_i$,
a new constraint $\sum_{j=1}^{k} \bar{x}_j \le k - 1$ is generated, where $k$ is such that
$\sum_{j=1}^{k} a_{ij} > \bar{b}_i$ but $\sum_{j=1}^{k-1} a_{ij} \le \bar{b}_i$. These are extended
cover inequalities. We add these extreme coefficient extended cover inequalities to P and
denote the formulation by PEI.
To illustrate, consider the 0-1 knapsack constraint given by
$10x_1 + 1x_2 + 9x_3 + 2x_4 + 8x_5 + 3x_6 + 7x_7 + 4x_8 + 6x_9 + 5x_{10} \le 20$. The generated
valid inequality is $x_1 + x_2 + x_3 + x_4 \le 3$.
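
The four cover constructions above differ only in the order in which the variables are scanned, so one routine can reproduce all four illustrations. The sketch below takes the coefficients in the given order, finds the shortest prefix whose sum exceeds the right-hand side, and then extends the cover with every later variable whose coefficient is at least the largest coefficient in the cover. That extension rule is the usual one for extended covers and is an assumption here, since the text does not spell it out, although it matches all four examples. The function name is hypothetical.

```python
def extended_cover(coeffs, rhs):
    """Build an extended cover inequality for sum(coeffs[j] * x[j]) <= rhs.

    Variables are scanned in the given order and reported with 1-based indices.
    Returns (cover, extension, bound), meaning that the sum of x over
    cover + extension is at most bound, or None if no prefix exceeds rhs.
    """
    total = 0
    k = 0
    for position, a in enumerate(coeffs, start=1):
        total += a
        if total > rhs:
            k = position
            break
    if k == 0:
        return None  # the whole constraint is redundant for binary variables
    cover = list(range(1, k + 1))
    largest = max(coeffs[:k])
    # Assumed extension rule: add later variables at least as heavy as the heaviest cover item.
    extension = [j for j in range(k + 1, len(coeffs) + 1) if coeffs[j - 1] >= largest]
    return cover, extension, k - 1


increasing = extended_cover([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 20)
decreasing = extended_cover([10, 9, 8, 7, 6, 5, 4, 3, 2, 1], 20)
middle = extended_cover([5, 6, 4, 7, 3, 8, 2, 9, 1, 10], 20)
extreme = extended_cover([10, 1, 9, 2, 8, 3, 7, 4, 6, 5], 20)
```

For the decreasing and extreme orderings the extension is empty, which is why the text describes those inequalities as already being extended cover inequalities.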
2.12.1.6 All Inequalities
In the formulation P, we include increasing coefficient extended cover inequal-
ities, decreasing coefficient extended cover inequalities, middle coefficient extended
cover inequalities, extreme coefficient extended cover inequalities, and coefficient
round-down inequalities and denote this formulation by PI. For each i ? I, five new
constraints are added to P to form PI.
2.12.2 Computational Results
We perform computational experiments that are designed to examine the performance
of eight models (L, P, PR, PDI, PCI, PMI, PEI, PI) and how performance
varies with respect to the size of the data set. Understanding the performance of
the various formulations of the Stage 1 IP on a large data set is important because
our CNG data set has 6,067 meters and 1,575 street segments.
We test the eight models for different values of the number of meters ($|I|$)
and the number of street segments ($|E \cup A|$), where the range of both values is
10 to 2000. The probabilities ($p_{ij}$) are generated randomly from a Uniform (0, 1)
distribution. The costs ($c_j$) of the street segments are generated from three different
distributions. In the first case, costs are generated randomly from a Uniform (0, 1)
distribution, and then the values are multiplied by 100. In the second case, unit
costs are considered for all street segments. In the third case, costs are generated
randomly from a truncated Normal (0, 1) distribution with the left tail truncated at
0 and the right tail truncated at 1, and then the values are multiplied by 100. We
use R software version 3.3.1 to generate the data for the computational experiments
and Gurobi version 7.0 to solve the models. We use an i7 CPU with 32 GB RAM
and a one-hour time limit. If an optimal value (denoted by v) is not found within
the time limit, the best feasible solution value (denoted by V) and the best available
bound (denoted by B) are reported. We ensure that the generated data are feasible
for all eight models and that all entries in the constraint matrix are non-zero
(the $p_{ij}$'s in the real world will be non-zero). The constraint matrices have dimension
$|I| \times |E \cup A|$ for models L and P, $2|I| \times |E \cup A|$ for models PR, PDI, PCI, PMI,
and PEI, and $6|I| \times |E \cup A|$ for model PI.
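
The data generation described above can be sketched as follows. The experiments in this section used R. The Python version below is only an illustration of the same recipe, with hypothetical function names, rejection sampling as an assumed way of obtaining the truncated normal costs, and a simple feasibility check that tests whether traversing every street segment reaches the specified likelihood for each meter.

```python
import math
import random


def open_unit_uniform(rng):
    """Uniform draw on the open interval (0, 1), so every p_ij is non-zero and below one."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def truncated_normal_01(rng):
    """Normal(0, 1) draw restricted to [0, 1], obtained by simple rejection sampling."""
    while True:
        z = rng.gauss(0.0, 1.0)
        if 0.0 <= z <= 1.0:
            return z


def generate_instance(n_meters, n_segments, cost_structure, seed=0):
    """Return (p, c): read probabilities p[i][j] and street segment costs c[j]."""
    rng = random.Random(seed)
    p = [[open_unit_uniform(rng) for _ in range(n_segments)] for _ in range(n_meters)]
    if cost_structure == "uniform":
        c = [100.0 * rng.random() for _ in range(n_segments)]
    elif cost_structure == "unit":
        c = [1.0] * n_segments
    elif cost_structure == "normal":
        c = [100.0 * truncated_normal_01(rng) for _ in range(n_segments)]
    else:
        raise ValueError("cost_structure must be 'uniform', 'unit', or 'normal'")
    return p, c


def is_feasible(p, likelihood=0.95):
    """True if selecting every street segment satisfies the covering constraint of each meter."""
    b = -math.log(1.0 - likelihood)
    return all(sum(-math.log(1.0 - pij) for pij in row) >= b for row in p)


p, c = generate_instance(n_meters=20, n_segments=10, cost_structure="normal", seed=1)
feasible = is_feasible(p)
```

An infeasible draw would simply be discarded and regenerated with a different seed.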
The comparison of the performance of the eight models using different values
of $|I|$ and $|E \cup A|$ based on the three cost structures is given in Tables 2.14 to 2.23.
For all these analyses, the value of the specified likelihood $L_i$ is taken to be 0.95
for all $i$. In Tables 2.14 to 2.23, the branch column gives the number of branches or
nodes created and the time column gives the running time in seconds. The v(LP)
column gives the LP relaxation values.
Table 2.14 gives the results for 20 meters and 10 street segments. All eight
models are solved to optimality for all three cost structures at the root node within
one-hundredth of a second. LP relaxation values indicate that formulations for PCI
and PI are stronger than the other six models for the uniform cost structure and
the normal cost structure. Formulations for PDI, PMI, and PI are stronger than
the other five models for the unit cost structure.
Table 2.15 gives the results for 10 meters and 20 street segments. All eight
models are solved to optimality for all three cost structures close to one-hundredth
of a second. For the unit cost structure, all eight models are solved at the root node.
LP relaxation values indicate that formulations for PCI and PI are stronger than
the other six models for all three cost structures.
Table 2.14: Results for 20 meters and 10 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 210.3068 v = 5 v = 185.8380
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 0 0.000 170.9240 0 0.016 3.8705 0 0.016 154.1182
P 0 0.001 170.9240 0 0.019 3.8705 0 0.000 154.1182
PR 0 0.016 170.9240 0 0.016 3.8705 0 0.016 154.1182
PDI 0 0.008 170.9240 0 0.009 4.0000 0 0.008 154.1182
PCI 0 0.004 175.3579 0 0.016 3.8800 0 0.000 156.4817
PMI 0 0.017 170.9240 0 0.000 4.0000 0 0.000 154.1182
PEI 0 0.006 170.9240 0 0.016 3.8705 0 0.016 154.1182
PI 0 0.016 175.3579 0 0.016 4.0000 0 0.000 156.4817
Table 2.16 gives the results for 100 meters and 20 street segments. All eight
models are solved to optimality for all three cost structures in nearly one-tenth of
a second. For the unit cost structure, all eight models required a larger number of
nodes than the number of nodes in the other two cost structures. LP relaxation
values indicate that formulations for PCI and PI are stronger than the other six
Table 2.15: Results for 10 meters and 20 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 90.9012 v = 4 v = 78.6606
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 9 0.047 65.3144 0 0.016 2.9883 8 0.031 56.4202
P 8 0.010 65.3144 0 0.000 2.9883 8 0.016 56.4202
PR 8 0.005 65.3144 0 0.000 2.9883 8 0.016 56.4202
PDI 8 0.010 65.3144 0 0.005 2.9883 8 0.031 56.4202
PCI 7 0.022 65.7364 0 0.016 3.0370 7 0.019 56.8099
PMI 8 0.016 65.3144 0 0.000 2.9883 8 0.031 56.4202
PEI 8 0.008 65.3144 0 0.000 2.9883 8 0.031 56.4202
PI 6 0.016 65.7364 0 0.000 3.0370 6 0.031 56.8099
models for the uniform cost structure and the normal cost structure.
Table 2.17 gives the results for 20 meters and 100 street segments. All eight
models are solved to optimality for all three cost structures within one-hundredth of
a second. For the unit cost structure, all eight models required a considerably larger
number of nodes than the number of nodes in the other two cost structures.
Table 2.16: Results for 100 meters and 20 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 185.6485 v = 6 v = 161.9885
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 19 0.190 142.3955 80 0.105 4.0408 18 0.111 125.7916
P 11 0.134 142.3955 57 0.098 4.0408 9 0.117 125.7916
PR 11 0.134 142.3955 57 0.077 4.0408 9 0.116 125.7916
PDI 11 0.151 142.3955 57 0.081 4.0408 9 0.112 125.7916
PCI 11 0.289 142.9973 80 0.097 4.0408 12 0.050 126.2463
PMI 14 0.117 142.3955 57 0.068 4.0408 16 0.157 125.7916
PEI 13 0.195 142.3955 81 0.085 4.0408 10 0.050 125.7916
PI 13 0.379 142.9973 80 0.113 4.0408 12 0.062 126.2463
LP relaxation values indicate that all eight formulations are equivalent for all three
cost structures.
Table 2.18 gives the results for 200 meters and 100 street segments. All eight
models are solved to optimality for the uniform cost structure and the normal cost
structure within one second, and for the unit cost structure within 125 seconds. For
Table 2.17: Results for 20 meters and 100 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 42.0435 v = 4 v = 36.0238
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 17 0.031 30.8630 244 0.081 2.2912 21 0.043 26.4746
P 16 0.047 30.8630 212 0.063 2.2912 16 0.031 26.4746
PR 16 0.031 30.8630 212 0.078 2.2912 16 0.031 26.4746
PDI 16 0.043 30.8630 212 0.067 2.2912 16 0.023 26.4746
PCI 18 0.050 30.8630 235 0.093 2.2912 18 0.038 26.4746
PMI 16 0.050 30.8630 212 0.085 2.2912 16 0.031 26.4746
PEI 16 0.028 30.8630 231 0.083 2.2912 16 0.047 26.4746
PI 18 0.047 30.8630 190 0.109 2.2912 18 0.047 26.4746
the unit cost structure, all eight models required a larger number of nodes than the
number of nodes in the other two cost structures. LP relaxation values indicate that
all eight formulations are equivalent for all three cost structures.
Table 2.19 gives the results for 100 meters and 200 street segments. All eight
models are solved to optimality for the uniform cost structure and the normal cost
Table 2.18: Results for 200 meters and 100 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 60.6152 v = 6 v = 51.9349
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 127 0.224 40.5822 70776 68.857 3.1836 117 0.237 34.7704
P 122 0.230 40.5822 72680 73.142 3.1836 107 0.243 34.7704
PR 122 0.245 40.5822 73244 71.907 3.1836 107 0.250 34.7704
PDI 122 0.275 40.5822 69211 70.114 3.1836 107 0.275 34.7704
PCI 125 0.432 40.5822 83158 69.134 3.1836 138 0.461 34.7704
PMI 122 0.247 40.5822 69211 70.002 3.1836 107 0.253 34.7704
PEI 125 0.391 40.5822 323249 125.900 3.1836 125 0.412 34.7704
PI 124 0.729 40.5822 83247 76.796 3.1836 101 0.723 34.7704
structure within a half a second, and for the unit cost structure within 160 seconds.
For the unit cost structure, all eight models required a larger number of nodes than
the number of nodes in the other two cost structures. LP relaxation values indicate
that all eight formulations are equivalent for all three cost structures.
Table 2.20 gives the results for 1000 meters and 200 street segments. All eight
Table 2.19: Results for 100 meters and 200 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 31.1048 v = 5 v = 26.6314
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 71 0.172 21.4948 28393 56.458 2.7859 71 0.158 18.4024
P 54 0.160 21.4948 28031 109.678 2.7859 70 0.181 18.4024
PR 54 0.173 21.4948 28031 109.976 2.7859 70 0.184 18.4024
PDI 54 0.211 21.4948 9234 7.190 2.7859 70 0.234 18.4024
PCI 62 0.382 21.4948 27823 159.750 2.7859 65 0.414 18.4024
PMI 54 0.219 21.4948 9234 7.364 2.7859 70 0.211 18.4024
PEI 54 0.275 21.4948 8598 11.496 2.7859 57 0.266 18.4024
PI 62 0.524 21.4948 8876 11.941 2.7859 65 0.525 18.4024
models are solved to optimality for the uniform cost structure and the normal cost
structure within six seconds. For the unit cost structure, none of the models are
solved to optimality within the one-hour time limit. LP relaxation values indicate
that all eight formulations are equivalent for all three cost structures.
Table 2.21 gives the analysis results for 200 meters and 1000 street segments.
Table 2.20: Results for 1000 meters and 200 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 50.0808 V = 8, B = 5 v = 42.8871
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 509 2.359 36.4209 573516 3600.025 3.2982 657 2.269 31.2172
P 416 2.451 36.4209 525333 3600.025 3.2982 579 2.532 31.2172
PR 416 2.538 36.4209 532719 3600.029 3.2982 579 2.552 31.2172
PDI 416 2.809 36.4209 540225 3600.039 3.2982 579 3.031 31.2172
PCI 429 5.370 36.4209 343069 3600.051 3.2982 551 5.388 31.2172
PMI 416 2.945 36.4209 532268 3600.043 3.2982 579 3.146 31.2172
PEI 416 5.386 36.4209 327526 3600.058 3.2982 580 5.692 31.2172
PI 429 5.961 36.4209 307213 3600.118 3.2982 551 5.963 31.2172
All eight models are solved to optimality for the uniform cost structure and the
normal cost structure within five seconds. For the unit cost structure, none of the
models are solved to optimality within the one-hour time limit. LP relaxation values
indicate that all eight formulations are equivalent for all three cost structures.
Table 2.22 gives the results for 2000 meters and 1000 street segments. All eight
Table 2.21: Results for 200 meters and 1000 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 5.0900 V = 6, B = 4 v = 4.3551
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 78 0.688 3.7080 1012646 3600.026 2.7328 78 0.762 3.1727
P 75 0.696 3.7080 732113 3600.029 2.7328 75 0.723 3.1727
PR 75 0.720 3.7080 721351 3600.025 2.7328 75 0.708 3.1727
PDI 75 0.898 3.7080 680384 3600.037 2.7328 75 0.873 3.1727
PCI 75 2.055 3.7080 1389916 3600.044 2.7328 75 2.194 3.1727
PMI 75 0.865 3.7080 715694 3600.034 2.7328 75 0.905 3.1727
PEI 75 1.247 3.7080 938173 3600.043 2.7328 75 1.271 3.1727
PI 75 4.104 3.7080 910063 3600.073 2.7328 75 3.903 3.1727
models are solved to optimality for the uniform cost structure and the normal cost
structure within 105 seconds. For the unit cost structure, none of the models are
solved to optimality within the one-hour time limit. LP relaxation values indicate
that all eight formulations are equivalent for all three cost structures.
Table 2.23 gives the results for 1000 meters and 2000 street segments. All
Table 2.22: Results for 2000 meters and 1000 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 7.4823 V = 9, B = 4 v = 6.4021
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 306 10.327 5.4391 4066 3600.063 3.0643 306 10.564 4.6540
P 292 11.130 5.4391 4506 3600.139 3.0643 292 11.404 4.6540
PR 292 11.613 5.4391 4505 3600.604 3.0643 292 11.720 4.6540
PDI 292 16.794 5.4391 4384 3600.106 3.0643 292 15.500 4.6540
PCI 455 41.847 5.4391 4156 3601.407 3.0643 450 39.208 4.6540
PMI 292 16.700 5.4391 4349 3600.131 3.0643 292 15.330 4.6540
PEI 292 42.715 5.4391 4255 3600.326 3.0643 292 40.500 4.6540
PI 404 102.281 5.4391 4195 3601.171 3.0643 403 104.360 4.6540
eight models are solved to optimality for the uniform cost structure and the normal
cost structure within 66 seconds. For the unit cost structure, none of the models are
solved to optimality within the one-hour time limit. LP relaxation values indicate
that all eight formulations are equivalent for all three cost structures.
Table 2.23: Results for 1000 meters and 2000 street segments.
Random Uniform Cost Unit Cost Random Normal Cost
v = 1.9747 V = 8, B = 4 v = 1.6896
Model Branch Time v(LP) Branch Time v(LP) Branch Time v(LP)
L 51 6.102 1.4088 7026 3600.249 2.9371 51 6.201 1.2054
P 33 6.229 1.4088 9345 3605.910 2.9371 33 6.630 1.2054
PR 33 6.441 1.4088 9345 3605.748 2.9371 33 6.805 1.2054
PDI 33 8.728 1.4088 9511 3604.998 2.9371 33 8.950 1.2054
PCI 47 26.887 1.4088 9341 3601.234 2.9371 47 27.510 1.2054
PMI 33 8.734 1.4088 9510 3606.107 2.9371 33 9.178 1.2054
PEI 33 11.500 1.4088 4731 3600.849 2.9371 33 11.580 1.2054
PI 47 60.986 1.4088 4431 3600.431 2.9371 47 65.302 1.2054
2.12.3 Observations from the Computational Experiments
All eight formulations performed similarly for the three respective cost structures
in terms of the number of nodes created and the LP relaxation value. The
formulations L and P had the smallest running times. Neither the affine transformation
of the decision variables in L into 0-1 knapsack constraints in P, nor the
addition of coefficient round-down inequalities and various other extended cover
inequalities to P made the formulation stronger. Most likely, this is due to the fact
that Gurobi already takes into account similar transformations and valid inequalities
during the pre-processing stages while solving the models. The addition of the
valid inequalities to P increased the running times. Based on these observations,
we should use formulation L (linear Stage 1 IP) to solve the Stage 1 IP when using
large data sets.
The uniform cost structure and the normal cost structure performed similarly
and were easy to solve in terms of the running times and the number of nodes
created. For each pair of |I| and |E ∪ A|, the number of branches created and the
running times are significantly larger for the unit cost structure compared to the
other two cost structures. With larger values of |I| and |E ∪ A|, none of the eight
models are solved to optimality within the one-hour time limit. This is probably
due to the inherent symmetry in the Stage 1 IP for the unit cost structure, since
each street segment has the same weight in the objective function.
2.12.4 Other Insights from the Computational Experiments
In Table 2.24, we summarize the performance for formulation L (linear Stage 1
IP), based on the results from Tables 2.14 to 2.23, for the three cost structures with
respect to the number of branches created, the running times (in seconds), and the
optimality gap (100(V − B)/B). The value of the specified likelihood Li is set at 0.95 for
Table 2.24: Comparison of the linear Stage 1 IP performance for the three different
cost structures.
all i. For each pair of |I| and |E ∪ A|, the linear Stage 1 IP had similar performance
for both the uniform cost structure and the normal cost structure in terms of the
                 Uniform Cost                Unit Cost                      Normal Cost
|I|    |E ∪ A|   Branch  Time (s)  Gap       Branch   Time (s)  Gap         Branch  Time (s)  Gap
20     10        0       0.000     Optimal   0        0.016     Optimal     0       0.016     Optimal
10     20        9       0.047     Optimal   0        0.016     Optimal     8       0.031     Optimal
100    20        19      0.190     Optimal   80       0.105     Optimal     18      0.111     Optimal
20     100       17      0.031     Optimal   244      0.081     Optimal     21      0.043     Optimal
200    100       127     0.224     Optimal   70776    68.857    Optimal     117     0.237     Optimal
100    200       71      0.172     Optimal   28393    56.458    Optimal     71      0.158     Optimal
1000   200       509     2.359     Optimal   573516   3600.025  60%         657     2.269     Optimal
200    1000      78      0.688     Optimal   1012646  3600.026  50%         78      0.762     Optimal
2000   1000      306     10.327    Optimal   4066     3600.063  125%        306     10.564    Optimal
1000   2000      51      6.102     Optimal   7026     3600.249  100%        51      6.201     Optimal
number of branches created, the running times, and the optimality gap. All the
models are solved to optimality with the running times ranging from one-hundredth
of a second for an |I| × |E ∪ A| value of 200 to 10 seconds for an |I| × |E ∪ A| value of
2,000,000. The number of branches created ranges from less than 10 to around 500.
For each pair of |I| and |E ∪ A|, the number of branches created and the running
times are significantly larger for the unit cost structure compared to the other two
cost structures. This is probably due to the inherent symmetry in the linear Stage
1 IP for the unit cost structure, since each street segment has the same weight in
the objective function. The models are not solved to optimality within the one-hour
time limit for |I| × |E ∪ A| values greater than 20,000. The optimality gap is more
than 50% and more than 100% for |I| × |E ∪ A| values of 200,000 and 2,000,000,
respectively.
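The gap values for the unit cost structure can be recomputed directly from the reported (V, B) pairs. The short sketch below assumes that V is the best objective value found and B is the best bound at the time limit, so that the gap is 100(V − B)/B; the function name is illustrative only.

```python
def optimality_gap_percent(v, b):
    """Relative optimality gap 100 * (V - B) / B, in percent."""
    return 100.0 * (v - b) / b


# (V, B) pairs reported for the unit cost structure with Li = 0.95,
# keyed by "|I| x |E u A|".
pairs = {
    "1000 x 200": (8, 5),
    "200 x 1000": (6, 4),
    "2000 x 1000": (9, 4),
    "1000 x 2000": (8, 4),
}
gaps = {size: optimality_gap_percent(v, b) for size, (v, b) in pairs.items()}
```

The resulting values (60%, 50%, 125%, and 100%) match the gap column for the unit cost structure.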
In Table 2.25, for each of the three cost structures we show the comparison of
the objective value for the linear Stage 1 IP for Li values of 0.95 and 0.75 for all i.
If the linear Stage 1 IP is solved to optimality, the objective value is smaller for Li
values of 0.75 compared to 0.95 for each pair of |I| and |E ∪ A| and for each of the
three cost structures. For the unit cost structure and for |I| × |E ∪ A| values greater
than 20,000, the interval [B, V] is tighter and shifted towards zero for Li values of
0.75 compared to 0.95. This demonstrates the fact that the smaller the specified
likelihood values for reading meters, the smaller is the Stage 1 IP objective value.
The Stage 1 IP selects the street segments with the lowest total cost (length)
that guarantee the specified likelihood of reading the meters. The Stage 2 IP adds
deadhead segments with the lowest total cost (length) to complete the full route,
Table 2.25: Comparison of the linear Stage 1 IP objective value for specified likeli-
hood values of 0.95 and 0.75 for all meters.
starting and ending at the depot. It might be the case that the street segments
selected by the Stage 1 IP are not the best selections for the full route. The street
segments selected by the Stage 1 IP might be farther away from each other. This
may lead to a larger total length for the full route. To help prevent this, an alternate
                 Uniform Cost          Unit Cost                       Normal Cost
|I|    |E ∪ A|   0.95       0.75       0.95           0.75             0.95       0.75
20     10        210.3068   90.1101    5              3                185.8380   77.9464
10     20        90.9012    44.0025    4              2                78.6606    37.7965
100    20        185.6485   101.2878   6              4                161.9885   88.9582
20     100       42.0435    23.7557    4              3                36.0238    20.3450
200    100       60.6152    27.9134    6              4                51.9349    23.9070
100    200       31.1048    16.9587    5              3                26.6314    14.5172
1000   200       50.0808    23.3395    B = 5, V = 8   B = 3, V = 5     42.8871    19.9795
200    1000      5.0900     2.7574     B = 4, V = 6   B = 3, V = 4     4.3551     2.3593
2000   1000      7.4823     3.8587     B = 4, V = 9   B = 2, V = 6     6.4021     3.3016
1000   2000      1.9747     0.9722     B = 4, V = 8   B = 2, V = 5     1.6896     0.8319
objective function for the Stage 1 IP may be minimizing the total number of street
segments (unit cost structure), i.e., cj = 1 for all j. For the unit cost structure,
our computational experiments showed that the linear Stage 1 IP with a one-hour
running time had an optimality gap of more than 100% for the |I| × |E ∪ A| value
of 2,000,000. Since the |I| × |E ∪ A| value for our data set (6,067 × 1,575) is an
order of magnitude larger, it is not possible to arrive at Stage 1 IP solutions with
smaller optimality gaps in a reasonable amount of time. So, rather than using the
unit cost structure as the only objective function in the Stage 1 IP, it might be
useful to explore the option of having the unit cost structure added to the Stage
1 IP as the second objective function. We compare the routes generated using the
single-objective Stage 1 IP (discussed in Section 2.4) and the bi-objective Stage 1
IP in the simulation experiments conducted in Section 2.14 with our data set. The
bi-objective Stage 1 IP is given below.
The objective function (2.19) is the same as the objective function (2.1) in
the single-objective Stage 1 IP. The objective function (2.20) minimizes the total
number of street segments selected. Constraints (2.21) and (2.22) are the same
as constraints (2.2) and (2.3), respectively, in the single-objective Stage 1 IP. A
lexicographic approach (establishing a pre-defined ordering between the competing
objective functions) is used to solve the bi-objective Stage 1 IP. First, the single-
objective Stage 1 IP is solved with the objective function (2.19). Then, the value of
the objective function (2.20) is improved without allowing the value of the objective
function (2.19) to increase.
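(Bi-objective Stage 1 IP)
\begin{align}
\min \quad & \sum_{j \in E \cup A} c_j x_j \tag{2.19} \\
\min \quad & \sum_{j \in E \cup A} x_j \tag{2.20} \\
\text{s.t.} \quad & \prod_{j \in E \cup A} (1 - p_{ij})^{x_j} \le (1 - L_i) \quad \forall i \in I \tag{2.21} \\
& x_j \in \{0, 1\} \quad \forall j \in E \cup A \tag{2.22}
\end{align}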
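To make the lexicographic approach concrete, the following is a minimal sketch that solves a tiny instance of the bi-objective Stage 1 IP by brute-force enumeration. It is not the solution method used in our experiments (which rely on an IP solver) and is only practical for a handful of street segments. The function name, the tolerance, and the example data are illustrative assumptions.

```python
from itertools import product
from math import prod


def lexicographic_stage1(costs, p, L, tol=1e-9):
    """Minimize total cost first, then the number of selected segments.

    costs[j]: cost of street segment j.
    p[i][j]:  probability that meter i is read from segment j.
    L[i]:     specified likelihood for meter i.
    Returns (x, total_cost, segment_count), or None if infeasible.
    """
    n = len(costs)

    def cost(x):
        return sum(c * xj for c, xj in zip(costs, x))

    def feasible(x):
        # Probability of missing meter i must not exceed 1 - L[i].
        return all(
            prod((1 - p_i[j]) ** x[j] for j in range(n)) <= 1 - L_i + tol
            for p_i, L_i in zip(p, L)
        )

    solutions = [x for x in product((0, 1), repeat=n) if feasible(x)]
    if not solutions:
        return None
    # Step 1: optimal value of the first objective (total cost).
    best_cost = min(cost(x) for x in solutions)
    # Step 2: among cost-optimal solutions, minimize the segment count.
    ties = [x for x in solutions if cost(x) <= best_cost + tol]
    best = min(ties, key=sum)
    return best, best_cost, sum(best)


# Hypothetical instance: segment 0 alone, or segments 1 and 2 together,
# both cost 2 and both satisfy Li = 0.95 for the two meters.
example = lexicographic_stage1(
    costs=[2.0, 1.0, 1.0],
    p=[[0.96, 0.8, 0.8], [0.96, 0.8, 0.8]],
    L=[0.95, 0.95],
)
```

In this instance the first objective cannot distinguish the two cost-optimal selections, and the second objective picks the one that uses a single street segment.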
2.13 Heuristics for the Stage 2 IP
Frederickson et al. (1978) showed that the mixed rural postman problem is
NP-complete. A mixed rural postman problem with a particular node (depot) that
is a part of the route is also NP-complete. This is the problem we solve in the Stage
2 IP. Therefore, for any realistic large data sets, similar to the size of our CNG data
set (6,067 meters, 1,575 street segments, and 1,072 nodes), any exact method of
solving the Stage 2 IP potentially has long running times with solutions that have
large optimality gaps. We discuss a fast metaheuristic to generate near-optimal
solutions in a short amount of time.
2.13.1 Route Generator
Given a set of required street segments, the goal is to find the shortest route
for the meter reading vehicle that starts and ends at the depot and traverses all
required street segments. By assuming that the vehicle always takes the shortest
Operator   Description
Reverse    Change the direction in which a required edge is traversed by the
           vehicle. Required arcs remain unaffected.
Relocate   Remove a required street segment from the current solution and
           insert it at a different location in the route.
2-opt      Reverse the order in which the vehicle visits the street segments in
           a subsequence of required street segments.
Table 2.26: Local search operators in the variable neighborhood descent metaheuristic.
path between any two required street segments, the aim of the route generator is
reduced to finding an optimal permutation of all required street segments and, in
the case of edges, the direction in which they should be traversed. The shortest
path between each pair of nodes is computed using Dijkstra's algorithm (Dijkstra
1959).
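As an illustration of the shortest-path computation, the sketch below gives a standard heap-based version of Dijkstra's algorithm. The adjacency-list representation and the function name are assumptions made for this example; an edge is listed in both directions and an arc only in its permitted direction.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path distances from source to every reachable node.

    adj maps each node to a list of (neighbor, length) pairs with
    non-negative lengths.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical network: three edges and one arc (b -> c).
network = {
    "depot": [("a", 2.0), ("b", 5.0)],
    "a": [("depot", 2.0), ("b", 1.0)],
    "b": [("depot", 5.0), ("a", 1.0), ("c", 2.0)],
    "c": [],
}
from_depot = dijkstra(network, "depot")
```

Running the function once from every node gives the all-pairs distances needed by the route generator.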
The route generator has two phases, the constructive phase and the improve-
ment phase. During the constructive phase, the route starts from the depot node
and, based on the nearest-neighbor principle, consecutively visits the closest re-
quired street segments. Since the edges can be traversed in both directions, the
node (among the two nodes representing an edge) closest to the preceding node
in the route is visited first. After a complete initial route is constructed, the im-
provement phase improves the solution using three different local search operators
embedded in a variable neighborhood descent metaheuristic. The three local search
operators are briefly described in Table 2.26. The variable neighborhood descent
metaheuristic framework is considered very effective for solving routing problems
(Defryn and Sörensen 2017, Hansen et al. 2017, Wassan et al. 2017).
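The constructive phase can be sketched as follows. This is a simplified illustration under stated assumptions, not our implementation: street segments are given as (u, v, directed) triples, dist(a, b) is assumed to return the shortest-path distance between two nodes, and the return trip to the depot and the improvement phase are omitted.

```python
def nearest_neighbor_order(depot, segments, dist):
    """Greedy visiting order of the required street segments.

    segments: list of (u, v, directed). An arc (directed=True) must be
    traversed from u to v; an edge may be entered at either end node.
    Returns a list of (entry_node, exit_node) pairs.
    """
    remaining = list(segments)
    current, order = depot, []
    while remaining:
        best = None
        for seg in remaining:
            u, v, directed = seg
            options = [(u, v)] if directed else [(u, v), (v, u)]
            for entry, exit_node in options:
                d = dist(current, entry)
                if best is None or d < best[0]:
                    best = (d, seg, entry, exit_node)
        _, seg, entry, exit_node = best
        remaining.remove(seg)
        order.append((entry, exit_node))
        current = exit_node  # continue from the end of the segment
    return order


# Hypothetical example with nodes on a line, so dist is |a - b|.
order = nearest_neighbor_order(
    depot=0,
    segments=[(5, 3, False), (1, 2, True), (8, 10, False)],
    dist=lambda a, b: abs(a - b),
)
```

Here the edge between nodes 5 and 3 is entered at node 3 because that end is closer to the preceding node in the route.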
2.13.2 Route Trimmer
The full route obtained from the route generator contains deadhead street segments
in addition to the required street segments. As a result, the probabilities of
successfully reading the meters from the full route are generally higher than the
specified likelihoods (Li), which are already guaranteed by the required street
segments alone. Although this further reduces the chance of missing meters, it comes
at the cost of increased route length. To account for this, the route trimmer aims
at decreasing the total route length while still ensuring feasibility, i.e., the
probability of successfully reading each meter i from the full route is at least the
specified likelihood Li.
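The feasibility condition used by the route trimmer can be checked as in the sketch below. It assumes, as in the product form of the Stage 1 IP constraint, that reads from different street segments are independent, and it counts each distinct segment on the route once. The data layout and function names are illustrative.

```python
from math import prod


def read_probability(p_i, route_segments):
    """Probability that a meter is read at least once along the route.

    p_i maps a segment id to the probability of reading the meter from
    that segment; segments absent from p_i contribute probability 0.
    """
    return 1.0 - prod(1.0 - p_i.get(j, 0.0) for j in route_segments)


def route_is_feasible(p, L, route_segments):
    """True if every meter i reaches its specified likelihood L[i]."""
    return all(read_probability(p[i], route_segments) >= L[i] for i in p)


# Hypothetical data: two meters, two street segments.
p = {"m1": {"s1": 0.8, "s2": 0.8}, "m2": {"s1": 0.96}}
L = {"m1": 0.95, "m2": 0.95}
```

With these numbers, the route containing both segments is feasible, while the route containing only segment s1 is not, because meter m1 is then read with probability 0.8.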
The route trimmer makes use of a remove and repair procedure to decrease
the number of required street segments and, therefore, the total route length. In
Table 2.27, we give the algorithm for the remove and repair procedure of the route
trimmer. Lines 1-4 are the initialization steps. HaveToBeRequired is the subset of
the required street segments that should be a part of the route. CandidateList is
the subset of the required street segments that could potentially be removed from
the route or replaced with other street segments. RequiredSegments is the set of
required street segments that will be used by the route generator to generate the
full route. BestRoute is the route that is feasible and shortest in length. Line 5
indicates that the algorithm will loop until the CandidateList is empty.
Lines 6-13 represent the remove procedure. Street segment smc in the Candi-
dateList with the highest marginal cost in the BestRoute is removed. CandidateList
and RequiredSegments are updated and route Rsmc is generated using the route gen-
 1: HaveToBeRequired ← MR;
 2: CandidateList ← (ER ∪ AR) \ MR;
 3: RequiredSegments ← ER ∪ AR;
 4: BestRoute ← Full route obtained using route generator;
 5: while CandidateList ≠ ∅ do
 6:     Find smc ∈ CandidateList which has the highest marginal cost in BestRoute;
 7:     CandidateList ← CandidateList \ smc;
 8:     RequiredSegments ← HaveToBeRequired ∪ CandidateList;
 9:     Generate full route Rsmc using route generator;
10:     if Rsmc meets the specified likelihood Li of reading each meter i then
11:         CountInfeasible ← 0;
12:         if Length of Rsmc < Length of BestRoute then
13:             BestRoute ← Rsmc;
14:     else
15:         Success ← FALSE;
16:         PossibleSegments ← ∅;
17:         for all sn ∉ RequiredSegments ∪ smc do
18:             if constraint (2.2) is feasible for all the meters with
                xj = 1 and j ∈ RequiredSegments ∪ sn then
19:                 PossibleSegments ← PossibleSegments ∪ sn;
20:         for all sp ∈ PossibleSegments do
21:             RequiredSegments ← RequiredSegments ∪ sp;
22:             Generate full route R^{sp}_{smc} using route generator;
23:             RequiredSegments ← RequiredSegments \ sp;
24:             if Length of R^{sp}_{smc} < Length of BestRoute then
25:                 BestRoute ← R^{sp}_{smc};
26:                 Success ← TRUE;
27:         if Success = FALSE then
28:             HaveToBeRequired ← HaveToBeRequired ∪ smc;
29:             CountInfeasible ← CountInfeasible + 1;
30:     if CountInfeasible > 10 then
31:         break;
Table 2.27: Algorithm for the remove and repair procedure of the route trimmer.
erator. If Rsmc is feasible, CountInfeasible, an infeasibility counter for a route, is set
to zero, and if the length of Rsmc is smaller than the length of the BestRoute, Rsmc
becomes the new BestRoute.
Lines 14-30 represent the repair procedure. If Rsmc is infeasible, the repair
procedure tries to add another street segment to remove the infeasibility. Success,
a binary variable representing the success of the repair procedure, is initialized to
FALSE. PossibleSegments is the subset of street segments outside of RequiredSegments
and smc, each of which, if added to RequiredSegments, has the potential of
generating a shorter route compared to the current BestRoute. If any street segment
sn, outside of the RequiredSegments and smc, satisfies constraint (2.2) of the Stage 1
IP along with the RequiredSegments, then sn is added to the PossibleSegments. For
each street segment sp in PossibleSegments, a route R^{sp}_{smc} is generated using
the route generator with the required street segments being the RequiredSegments
and sp. The routes R^{sp}_{smc}, for each sp, and the BestRoute are compared and the
shortest route is set as the new BestRoute. If the repair procedure improves the
BestRoute, Success is updated to TRUE. If Success is FALSE after the repair
procedure, i.e., both the remove and the repair procedures involving street segment
smc are unsuccessful, then smc is added to the HaveToBeRequired and the
CountInfeasible is increased by one. The loop repeats by choosing a new smc from the
current CandidateList, provided that it is non-empty, based on the current BestRoute,
unless CountInfeasible is greater than the stopping criterion, which is set to 10.
When the remove and repair procedure terminates, a second procedure of the
route trimmer looks for remaining redundancies in the list of required street segments
for the current BestRoute. When a required street segment lies on the shortest
path between its predecessor and successor required street segments, that particular
required street segment would be visited by the vehicle as a deadhead segment.
Therefore, it can be removed from the list of the required street segments without
affecting feasibility or the total route length. This simplifies the representation of
the vehicle route and speeds up the simulation experiments conducted in Section
2.14.
The route generator is used extensively in the route trimming procedure. This
strengthens our motivation for choosing a fast metaheuristic for the route generator.
To further speed up the route generator and to avoid solving the same instance
multiple times during our simulation experiments, a pool of solutions is maintained.
Before using the route generator, this pool is checked for instances that have been
solved already.
In Figure 2.15, we show how the route generator and the route trimmer work
for a small example. Figure 2.15a shows the street network. The route of the vehicle
starts from and ends at the depot denoted by the black square. The remaining
nodes in the network are denoted by black dots. The black lines are the edges in
the network (assume that there are no arcs). Figure 2.15b shows the Stage 1 IP
solution. The red lines are the required street segments that the meter reading
vehicle needs to traverse to guarantee the specified service levels (Li). Figure 2.15c
shows the route produced by the route generator. The blue lines are the deadhead
segments added by the route generator to connect the required street segments in
the shortest possible manner. Figure 2.15d shows the route produced by the route
trimmer. The route trimmer finds the required street segment that has the largest
marginal cost in the current route (Figure 2.15c), denoted by the yellow line, and
replaces it with another street segment, denoted by the green line, such that the new
route (Figure 2.15d) has a smaller length and the specified service levels (Li) are
[Figure 2.15 panels: (a) Street network; (b) Stage 1 IP solution; (c) Route generator solution; (d) Route trimmer solution]
Figure 2.15: (Color online) The route generator and the route trimmer applied to a
small example. Red lines denote the required street segments. Blue lines denote the
deadhead segments. The yellow line denotes the required street segment that is
removed. The green line denotes the new street segment added to the route as a
replacement for the yellow line.
still satisfied.
2.14 Simulation Experiments
On a regular basis, utility companies need to decide on the routes for their
meter reading vehicles. The quality of a route is determined by its length and ro-
bustness. Shorter routes are more cost effective as fuel and labor costs are lower.
The smaller the number of missed meters, the more robust the route is for reading
the uncertain RFID signals. To compare and quantify the performance of the three
Bayesian updating models on route quality, simulation experiments are conducted
using the CNG data set with 6,067 meters, 1,575 street segments, and 1,072 nodes.
As the meter reading vehicle makes more trips and collects more data, the param-
eters of the Bayesian updating models should get closer to the actual parameter
values. The actual parameter values depend on the street network and the
distribution of the meters in the street network. This is demonstrated by the way
the vehicle routes are adjusted over time to reduce the number of missed meters
while remaining cost-effective.
2.14.1 Actual Reading Probabilities
To calculate the probabilities pij's for the three Bayesian updating models, we
use the parameter values estimated in Section 2.11. The pij's obtained are considered
to be the actual probabilities, denoted by pij's, with which meter i is read from street
segment j at least once. This assumption is reasonable since the size of the data set
on which the model parameters are estimated is very large (N ≈ 5 million for the
logit model and the probit model, and m = 829 and n = 6,067 for the hierarchical
probit model). To determine whether or not meter i is successfully read from street
segment j, we consider a binomial random variable Yij ∼ Binomial(M, pij), where
M is the number of times street segment j is traversed in the route. Meter i
is considered to be missed from the route of the vehicle if Yij = 0 for all street
segments j in the route, i.e., meter i has not been read from any of the street
segments traversed by the vehicle.
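The rule for identifying a missed meter can be sketched as follows. This is a minimal illustration using Python's standard random module rather than the software used for our experiments; the dictionary layout and the function name are assumptions.

```python
import random


def meter_is_missed(p_i, traversals, rng):
    """Simulate whether one meter is missed on a route.

    p_i:        segment id -> probability of reading the meter on one
                traversal of that segment (0 if the segment is absent).
    traversals: segment id -> number of times M the segment is traversed.
    rng:        a random.Random instance.
    The meter is missed if the Binomial(M, p) draw is zero for every
    segment on the route.
    """
    for seg, m in traversals.items():
        prob = p_i.get(seg, 0.0)
        # A Binomial(M, p) draw is the sum of M Bernoulli(p) trials.
        y = sum(rng.random() < prob for _ in range(m))
        if y > 0:
            return False
    return True


rng = random.Random(0)
# Hypothetical check: one segment traversed twice with p = 0.5, so the
# meter should be missed in roughly a quarter of the replications.
trials = 20000
miss_rate = sum(
    meter_is_missed({"s1": 0.5}, {"s1": 2}, rng) for _ in range(trials)
) / trials
```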
For the logit model,
\[
\ln\!\left(\frac{p_{ij}}{1 - p_{ij}}\right) = -1.242 - 0.003 \times \text{Shortest Distance}_{ij} + 0.019 \times \text{No of Pulses}_{j} - 0.003 \times \text{No of Customers}_{i},
\]
where Shortest Distance_{ij} is in meters.
For the probit model,
\[
\Phi^{-1}(p_{ij}) = -1.024 - 0.001 \times \text{Shortest Distance}_{ij} + 0.007 \times \text{No of Pulses}_{j} - 0.002 \times \text{No of Customers}_{i},
\]
where Shortest Distance_{ij} is in meters.
For the hierarchical probit model, E(\beta_i) = E(\Delta^T) z_i for each i. Therefore,
\[
\begin{pmatrix} \beta_{i,1} \\ \beta_{i,2} \\ \beta_{i,3} \end{pmatrix}
=
\begin{pmatrix} -0.890 & -0.0002 \\ -0.002 & -0.000003 \\ 0.004 & 0.0000006 \end{pmatrix}
\begin{pmatrix} 1 \\ \text{No of Customers}_{i} \end{pmatrix}.
\]
So,
\[
\Phi^{-1}(p_{ij}) = (-0.890 - 0.0002 \times \text{No of Customers}_{i}) + (-0.002 - 0.000003 \times \text{No of Customers}_{i}) \times \text{Shortest Distance}_{ij} + (0.004 + 0.0000006 \times \text{No of Customers}_{i}) \times \text{No of Pulses}_{j},
\]
where Shortest Distance_{ij} is in meters.
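For reference, the three expressions above can be evaluated as in the sketch below, which uses only the Python standard library. The coefficient signs follow our reading of the formulas (negative for distance and number of customers, positive for number of pulses in the logit and probit models); the function and argument names are illustrative.

```python
from math import exp
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def p_logit(dist_m, pulses, customers):
    """Reading probability under the logit model (distance in meters)."""
    eta = -1.242 - 0.003 * dist_m + 0.019 * pulses - 0.003 * customers
    return 1.0 / (1.0 + exp(-eta))


def p_probit(dist_m, pulses, customers):
    """Reading probability under the probit model (distance in meters)."""
    return PHI(-1.024 - 0.001 * dist_m + 0.007 * pulses - 0.002 * customers)


def p_hier_probit(dist_m, pulses, customers):
    """Reading probability under the hierarchical probit model."""
    b1 = -0.890 - 0.0002 * customers      # intercept
    b2 = -0.002 - 0.000003 * customers    # coefficient of distance
    b3 = 0.004 + 0.0000006 * customers    # coefficient of pulses
    return PHI(b1 + b2 * dist_m + b3 * pulses)
```

For example, p_logit(50, 20, 10) and p_logit(200, 20, 10) can be compared to see how the reading probability falls as the shortest distance grows; the input values here are hypothetical.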
2.14.2 Simulation Model Overview
A simulation starts from an initial vehicle route. This is iteration zero or the
initialization step. To construct the initial route, pij is set to 1 if meter i is within
500 feet from street segment j and 0 otherwise. First, the linear Stage 1 IP is solved
to obtain the required street segments. The specified likelihood (Li) values do not
affect the Stage 1 IP solution in iteration zero because of the particular choice of
the pij's. Then the full route is produced using the route generator and the route
trimmer. This initial route is a deterministic CEVRP solution, which is currently
used by utility companies. Therefore, we use this initial route as a benchmark for our
experiments. In the first iteration, the Yij's are generated based on the initial route.
Since we do not have any information about the pij's during the first iteration (this
is analogous to time period 1 as described in Section 2.8), the prior distributions for
the parameters of the Bayesian updating models are set to the vague priors. Using
the priors and the meter reading data (Yij as the dependent variable and Shortest
Distanceij, No of Pulsesj, and No of Customersi as the independent variables), we
obtain the posterior distributions of the parameters, and thereby the updated pij's.
A new route is produced based on the updated pij's. For all subsequent iterations,
the Yij's are generated based on the current route; the posterior distributions from
the previous iteration are used as the prior distributions.
An iteration in the simulation experiment represents the generation of a new
route for the next meter reading day after updating the probabilities pij's using
the previous meter reading data. Therefore, our simulation model can be used by
a utility company as a decision-support tool to generate robust and cost-effective
routes. The only difference would be that the utility companies have access to the
actual dependent variable values (Read OR Notij) after the vehicle has traversed
the route, whereas for the simulations we generate the Yij's based on the pij's and
the route to identify the missed meters.
Figure 2.16: (Color online) A view of a portion of the actual street network with
meter locations in the UTM format serviced by Connecticut Natural Gas in our data
set from Hartford, Connecticut. The red dots represent the meters. This network
is used for our simulation experiments.
2.14.3 Generating the Network
Our data set is from Hartford, Connecticut. The OAR Bench software (Lum
et al. 2018) is used to extract the metadata of the street network (information about
nodes, edges, and arcs). The coordinate system used by OAR Bench is the World
Geodetic System (WGS) 84. The meter locations in the ArcGIS data are also in
WGS 84 format. WGS 84 is the reference coordinate system used by the Global
Positioning System (GPS). The Universal Transverse Mercator (UTM) conformal
projection (map projections that preserve angles locally) uses a 2-dimensional Carte-
sian coordinate system to give locations on the surface of the Earth by dividing the
Earth into sixty zones. The metadata of the street network and the meter locations
are converted from the WGS 84 format to the UTM format using the UTM zone
number of Hartford, Connecticut. The street network and the meter locations in the
UTM format are considered to be on a flat surface. Therefore, Euclidean geometry
can be used to find spatial distances between any two points in the network. Figure
2.16 gives a view of the actual street network with meter locations in the UTM
format.
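Two small steps of this procedure can be sketched in code: the standard rule for the UTM zone number from a longitude, and the planar distance between two projected points. The full WGS 84 to UTM projection is not reproduced here and would normally be delegated to a GIS library. The longitude used for Hartford (about −72.68 degrees) is an approximate value assumed for illustration, and the special zone exceptions near Norway and Svalbard are ignored.

```python
from math import floor, hypot


def utm_zone_number(longitude_deg):
    """Standard 6-degree UTM zone number for a longitude in [-180, 180)."""
    return int(floor((longitude_deg + 180.0) / 6.0)) + 1


def planar_distance(p, q):
    """Euclidean distance between two (easting, northing) points in meters."""
    return hypot(p[0] - q[0], p[1] - q[1])


hartford_zone = utm_zone_number(-72.68)  # approximate longitude (assumed)
```

Once all points are expressed as eastings and northings in the same zone, planar_distance gives the spatial distance between any two of them.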
2.14.4 Simulation Results
Simulations are conducted for each of the three Bayesian updating models, for
both the single-objective and the bi-objective linear Stage 1 IP, and for Li values of
0.95 and 0.75 for all i. In total, we perform 12 (3 × 2 × 2) simulation experiments.
Each simulation is run for nine iterations. The initial route in iteration zero depends
on the version of the Stage 1 IP used for that particular simulation experiment. A
simulation using the single-objective linear Stage 1 IP has a different initial route
compared to a simulation using the bi-objective linear Stage 1 IP.
The meter reading vehicles are driven at five miles per hour in residential
neighborhoods. Drivers are encouraged to drive at a slow speed to increase the
chances of reading the uncertain RFID signals. We assume five minutes to manually
read a meter (these are the meters that are infeasible with respect to constraints
(2.2) in the Stage 1 IP) after parking the vehicle on the closest street segment. The
route lengths are in miles. For each meter that is supposed to be manually read, we
add 0.42 miles to the full route length as a proxy for the distance that could have
been traversed in five minutes at five miles per hour. We use an i7 CPU with 32
GB RAM for the simulations. We use R software version 3.3.1 to run the Bayesian
updating models and Gurobi version 7.5 to solve the linear Stage 1 IP. Each time
the linear Stage 1 IP is solved or the route generator is used, a time limit of 10
minutes and two minutes, respectively, is imposed.
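The 0.42-mile proxy follows from the driving speed and the manual reading time, as the short sketch below shows. The function names and the example route are hypothetical.

```python
def manual_read_penalty_miles(speed_mph=5.0, minutes=5.0):
    """Distance that could be driven in the time taken to read one meter."""
    return speed_mph * minutes / 60.0


def adjusted_route_length(route_miles, n_manual, penalty_miles=0.42):
    """Route length plus the distance proxy for manually read meters."""
    return route_miles + n_manual * penalty_miles


penalty = manual_read_penalty_miles()          # 0.41666... miles
example_length = adjusted_route_length(20.0, 10)  # hypothetical route
```

Rounding the penalty to two decimals gives the 0.42 miles added per manually read meter.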
In Figures 2.17 and 2.18, we show the results of the simulation experiments for
the route length and the number of missed meters, respectively. Both figures show
the results of the three Bayesian updating models in four different scenarios, namely,
single-objective linear Stage 1 IP and likelihood values of 0.75 (scenario a), single-
objective linear Stage 1 IP and likelihood values of 0.95 (scenario b), bi-objective
linear Stage 1 IP and likelihood values of 0.75 (scenario c), and bi-objective linear
Stage 1 IP and likelihood values of 0.95 (scenario d). Figure 2.17 shows the results
from iteration 0 (initial route) through iteration 9, whereas Figure 2.18 shows the
results from iteration 1 through iteration 9. The missed meters (Figure 2.18) in
iteration k are based on the route (Figure 2.17) in iteration k − 1 and also lead to
the formation of the new route in iteration k.
In each of the four scenarios depicted in Figures 2.17 and 2.18, the logit model
and the probit model do not show any significant differences. The initial routes are
approximately 20 miles and there are 279 missed meters in scenarios a and b and
[Figure 2.17 panels: (a) Single-objective Stage 1 and likelihood values of 0.75; (b) Single-objective Stage 1 and likelihood values of 0.95; (c) Bi-objective Stage 1 and likelihood values of 0.75; (d) Bi-objective Stage 1 and likelihood values of 0.95. Each panel plots the route length (in miles) against the iteration for the logit, probit, and hierarchical probit models.]
Figure 2.17: (Color online) Simulation results for the route length.
379 missed meters in scenarios c and d. The logit model and the probit model, on
average, generated routes 17 miles long and missed 148 meters in scenarios a and
c, and routes 37 miles long and missed 60 meters in scenarios b and d. An increase
in the likelihood values from 0.75 to 0.95 increased the route length from 17 miles
to 37 miles and reduced the number of missed meters from 148 to 60. However, the
choice of the Stage 1 IP did not have any substantial impact on the results. For
the routes to be operationally impactful to the extent that there is no need to send
[Figure 2.18 panels: (a) Single-objective Stage 1 and likelihood values of 0.75; (b) Single-objective Stage 1 and likelihood values of 0.95; (c) Bi-objective Stage 1 and likelihood values of 0.75; (d) Bi-objective Stage 1 and likelihood values of 0.95. Each panel plots the number of missed meters against the iteration for the logit, probit, and hierarchical probit models.]
Figure 2.18: (Color online) Simulation results for the number of missed meters.
another vehicle at a later time to read the missed meters, the number of missed
meters needs to be reduced further.
The hierarchical probit model, in each of the four scenarios shown in Figure
2.17, shows a large increase in route lengths compared to the initial routes of 20 miles.
However, the route lengths gradually decreased to around 28 miles in scenarios a
and c, and around 52 miles in scenarios b and d. The hierarchical probit model
captures the uncertain behavior of each meter uniquely, and therefore it explores
the street network to learn the reading potential of the different street segments. As
a result, the hierarchical probit model, as shown in Figure 2.18, on average, missed
six meters in scenarios a and c and one meter in scenarios b and d. The choice of
the Stage 1 IP did not have any significant impact on the results. The probability
estimates of the hierarchical probit model get very close to the actual probabilities
within a few iterations. Therefore, even with likelihood values of 0.75, the routes
generated are of very high quality (only six meters are missed from these routes,
which are just 8 miles longer than the initial routes). These routes should be of high
operational and practical relevance for a utility company. Likelihood values greater
than 0.75 make the routes substantially longer, thereby increasing the cost.
In Bayesian updating, the posterior distribution of the model parameters is
a compromise between prior information and the information provided by the new
data. This helps in the inference on the parameters. If the new data are smaller in
size (this leads to model parameters with distributions of high variance), we want
to rely more on prior knowledge. Conversely, if the data are plentiful and contain
high-quality information, then we should not care much about the form of the prior
information. The Bayesian updating process automatically considers this trade-off.
For the hierarchical probit model, the meter reading data obtained from a route are
divided into small segments to learn about the behavior of each meter separately.
Therefore, the size of the data is effectively a few orders of magnitude smaller for
the hierarchical probit model compared to the logit model and the probit model.
This leads to more dependence on the priors for the posterior distributions of the
hierarchical probit model. When we estimated the parameters of the Bayesian up-
dating models in Section 2.11 using our real meter reading data, the logit model
Model                                               d1    h     d2      Total time
Initial (benchmark) route                           20    329   25.58   33.12
Logit or Probit for likelihood values of 0.75       17    148   24.62   17.37
Logit or Probit for likelihood values of 0.95       37    60    18.37   13.62
Hierarchical Probit for likelihood values of 0.75   28    6     8.35    6.65
Hierarchical Probit for likelihood values of 0.95   52    1     6.99    10.95
Table 2.28: Average comparison of the total time to read all the meters.
and the probit model used 6,067 × 829 (≈ 5 million) data points, whereas the hier-
archical probit model used 829 data points for each of the lower level probit models
and 6,067 data points for the higher level multivariate linear model. The simulation
experiments were started using vague priors because we did not have any prior in-
formation on the parameters to accurately capture the signal transmission behavior
of the meters in the street network. Therefore, to lower the variance of the posterior
distributions, the hierarchical probit model produced longer routes to gather more
meter reading data and, thereby, improved the quality of the probability estimates.
Finally, the logit and the probit model need to estimate only four parameters. How-
ever, the hierarchical probit model needs to estimate three lower level parameters
for each of the 6,067 meters and seven higher level parameters. Therefore, the hi-
erarchical probit model is able to capture the heterogeneous behavior of the meters
and build routes accordingly, unlike the logit and the probit model which capture
the average effect of all the meters.
For a utility company, the total cost of reading meters in each time period (it-
eration) is divided into two phases. The first phase is the cost of the CEVRP routes
to read the meters automatically. The second phase is the cost of reading the meters
that are missed from the first phase route. Typically, a public utility commission
does not allow billing cycles to shift more than two or three days. Therefore, it is
necessary to send out another vehicle to manually read the missed meters within a
few days after the completion of the first phase. We compare the quality of the initial
(benchmark) route and the routes generated by the three Bayesian updating models
taking into account the cost from both phases. Let d1 denote the length (in miles) of
the route in the first phase. A meter reading vehicle is driven at five miles per hour,
so the first phase route will take d1/5 hours to complete. Let h denote the number of
meters missed from the first phase route. During the second phase, the missed me-
ters are read manually to ensure success with probability one (a vehicle is driven at
a speed of around 15 miles per hour and stops at each missed meter to read it). The
second phase is a standard VRP route. A utility company knows which meters are
missed from the first phase route. It will have the exact standard VRP route length
to read the missed meters. In our simulations, we discuss the results averaged over
the nine iterations and need to estimate the route length for the second phase. Kwon
et al. (1995) showed that $\left(0.8326 - 0.0011(h+1) + 1.1147\,G/(h+1)\right)\sqrt{(h+1)D}$,
where $D$ is the area of the rectangular network and $G$ is the ratio of length to
breadth of the network such that $G \geq 1$, gives a reasonable estimate of the standard
VRP route length as a function of the number of customers (h) on the route. For
our data set, G = 1.5 and D = 8.8 square miles. Let d2 denote the estimate of the
length (in miles) of the route in the second phase; the route will take d2/15 hours
to complete. If we assume five minutes to manually read a meter, it will take h/12
hours to read h missed meters. The total time (in hours) required to read all meters
is the sum of the first phase time (d1/5) and the second phase time (d2/15 + h/12).
In Table 2.28, we show the average comparison of the total time for the benchmark
route and the routes generated by the three Bayesian updating models. All three
models took much less time than the benchmark route. For the logit and the probit
models, the total time was a few hours less for likelihood values of 0.95 compared to
likelihood values of 0.75. This is because the time required to read the missed me-
ters in the second phase was smaller even though the first phase routes were longer
for likelihood values of 0.95. The hierarchical probit model does considerably better
than the logit or the probit models. For likelihood values of 0.75, the hierarchical
probit model takes only about 6.65 hours to read all the meters. The hierarchical probit
model for likelihood values of 0.95 has a longer first phase route without significantly
reducing the second phase time compared to the likelihood values of 0.75.
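The two-phase time calculation described above can be checked with a short script. The following is a minimal Python sketch, not the code used in the simulations. The function names are ours (hypothetical), and the default values (G = 1.5, D = 8.8 square miles, speeds of 5 and 15 miles per hour, five minutes per manual read) are the ones stated in the text.

```python
import math

def second_phase_length(h, area=8.8, ratio=1.5):
    """Kwon et al. (1995) estimate of the route length (miles) to visit h missed meters."""
    n = h + 1  # missed meters plus the depot
    return (0.8326 - 0.0011 * n + 1.1147 * ratio / n) * math.sqrt(n * area)

def total_time(d1, h, auto_speed=5.0, manual_speed=15.0, minutes_per_read=5.0):
    """Total hours: first phase (d1/5) plus second phase (d2/15 + h/12)."""
    d2 = second_phase_length(h)
    first_phase = d1 / auto_speed
    second_phase = d2 / manual_speed + h * minutes_per_read / 60.0
    return first_phase + second_phase

# Benchmark route from Table 2.28: d1 = 20 miles, h = 329 missed meters.
print(round(second_phase_length(329), 2))
print(round(total_time(20, 329), 2))
```

For the benchmark route (d1 = 20, h = 329), the sketch gives d2 of about 25.58 miles and a total time of about 33.12 hours, in line with the first row of Table 2.28. The other rows of the table are averages over nine iterations, so applying the formula to the averaged h is only an approximation for them.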
We point out that there is an inherent trade-off between the first phase time
and the second phase time. Longer first phase routes would reduce the number
of missed meters. Therefore, longer route times in the first phase would generally
lead to shorter route times in the second phase and vice-versa. Beyond a certain
level, lengthening the first phase route would lead to a diminishing return on the
total time to read all meters. We should be able to find the optimal value for the
likelihood levels (unique for each statistical model and street network) of reading
the meters with respect to the total time. Likelihood levels greater than or less
than the optimal value would lead to an increase in the total time.
2.15 Conclusions and Future Directions
We developed an iterative methodology to read uncertain RFID signals from
utility meters using vehicles at some distance. Every time we get access to new meter
reading data, we learn about the probabilities $p_{ij}$. The two-stage IP, representing
the meter reading problem, is then re-solved to generate routes that are robust for
addressing the uncertainty. Even though the routes generated by our procedure are
a few miles longer than the benchmark route, the routes are cost-effective when
compared to the costs incurred by sending a vehicle at a later time to read the
missed meters. The two-stage IP formulation is deterministic even though the meter
reading problem has an inherent stochastic set up. The Stage 1 IP, which gives us
the required street segments to reach a specified service level for each meter, is linear
in the decision variables. Computational experiments showed that the linear Stage
1 IP can be optimally solved within a few seconds for large data sets. Since the
Stage 2 IP is NP-complete, we developed a fast metaheuristic that generated and
further improved the full route, and still maintained the specified service levels for
each meter. We developed three Bayesian updating models to learn from the new
incoming data in an efficient way and avoid the drawbacks faced in regression. We
cross-checked our choice for the priors in the Bayesian models by comparing the
parameter estimates of the logit model and the probit model with their regression
counterparts. We showed that the hierarchical probit Bayesian updating model
produces more accurate probability estimates for each meter compared to the other
two models. We conducted simulation experiments to compare the route qualities
for the three Bayesian models and the benchmark routes used by utility companies.
In our simulations, we used an actual street network and meter locations, which is
different from the artificial networks used in the literature. Our simulation results
showed that the routes generated by the hierarchical probit model with likelihood
values of 0.75 are operationally useful because almost all meters are read with only a
few miles of extra travel compared to the initial route and the total time in the two
phases to read all the meters is about 6.65 hours, which fits into a typical 8-hour workday
schedule for the drivers.
Typically, the drivers of service vehicles are more comfortable traveling through
the same neighborhoods on their routes. For a utility company, this leads to meter
reading routes that look similar in every period. Because of this, we do not have
any information about the actual meter reading potential from most of the street
segments in the network. The iterative framework proposed and discussed in this
chapter to generate robust routes will be more effective when we have actual meter
reading data from a larger set of street segments. In future work, using Bayesian
decision theory and route optimization, we would like to be able to identify new
street segments to traverse at each time period so that the information gain is
maximized without having to travel many extra miles.
Chapter 3: Data-Driven Analysis of the Variability of Routes in the
Capacitated Vehicle Routing Problem
3.1 Introduction
Every day, delivery and service companies dispatch many vehicles to deliver
customer products and provide services to customers in a city. These companies
generate vehicle routes using software and algorithms provided by third-party ven-
dors. These algorithms focus on minimizing the total route time or average route
time for the fleet of vehicles serving a city.
Typically, these companies need to maintain a well-defined workload balance
amongst the company drivers, i.e., drivers should have similar route times (inclusive
of customer service times). Large differences in workloads are not only unfair to the
drivers, but they might also lead to increased operating and delivery costs for these
companies. Other factors such as mileage, number of vehicles used, driver wages, and
traffic conditions during the actual service hours also contribute to the costs that are
often not considered when generating routes using these algorithms. It is important
to consider the difference between planned costs and operational costs. Third-party
routing software and algorithms consider the costs that are generally taken into
account while building routes. These costs form the planned costs. However, the
operational costs might have extra components in addition to the planned costs
depending on the specific delivery and service company. Special attention should
be given to the additional elements that form the operational costs while generating
the routes.
Statistic                         Program Alpha   Program Beta
Number of routes generated        25              23
Route time minimum                5.17            5.24
Route time 1st quartile           5.30            5.65
Route time median                 5.34            5.94
Route time 3rd quartile           5.39            6.12
Route time maximum                5.46            6.43
Route time inter-quartile range   0.09            0.47
Route time range                  0.29            1.19
Route time standard deviation     0.07            0.33
Route time mean                   5.33            5.85
Route time total                  133.27          135.33
Table 3.1: Summary statistics of route times (in hours) for routes generated by two
third-party software programs on an actual street network to serve customers.
In Table 3.1, we show the summary statistics of route times (in hours) for
routes generated by two third-party software programs on an actual street network
to serve customers. The company that markets Program Alpha was bidding for a
client that already uses Program Beta to generate its routes. The solution generated
by Program Alpha requires 25 vehicles to serve the city in a total of 133.27 hours.
The solution generated by Program Beta requires two fewer vehicles but about two more
hours in total than the solution produced by Program Alpha. The route times generated
by Program Beta have greater variability (greater workload imbalance) compared
to the route times generated by Program Alpha. Drivers might be more satisfied
with the routes produced by Program Alpha because they will be getting similar
wages due to the similar number of hours they are driving and serving customers on
a route. In contrast, it might be cheaper to operate two fewer vehicles with just two
extra hours for the company when using the routes produced by Program Beta. For
this example, the summary statistics of the route times are not enough to determine
which of the two solutions is better in terms of total operating and delivery costs. It
is important for the client to examine the effect of the workload imbalance created
by Program Beta. The planned costs to generate the route using Program Beta
do not include the cost of workload imbalance. However, in practice, the workload
imbalance might be an integral part of the operational costs.
Instance name   Number of customers   Vehicle capacity   Instance new name
E-n22-k4        21                    6000               E-n22-c6000
E-n23-k3        22                    4500               E-n23-c4500
E-n30-k3        29                    4500               E-n30-c4500
E-n33-k4        32                    8000               E-n33-c8000
E-n51-k5        50                    160                E-n51-c160
E-n76-k7        75                    220                E-n76-c220
E-n76-k8        75                    180                E-n76-c180
E-n76-k10       75                    140                E-n76-c140
E-n76-k14       75                    100                E-n76-c100
E-n101-k8       100                   200                E-n101-c200
E-n101-k14      100                   112                E-n101-c112
Table 3.2: Eleven CVRP instances (Christofides and Eilon 1969a).
3.2 Capacitated Vehicle Routing Problem Instances
We focus on the Capacitated Vehicle Routing Problem (CVRP) to understand
the importance of route balance in calculating the total operating and delivery costs
for delivery companies. A CVRP models a real-life scenario, where a vehicle has a
capacity constraint and each customer has a demand. A company needs to fulfill all
customer demands in a city using a fleet of vehicles. The number of vehicles required
to serve the city depends on the total demand, the capacity of each vehicle, and the
algorithm (see Table 3.1). In this chapter, we use the 11 CVRP instances given
by Christofides and Eilon (1969a) which have the node coordinates to perform our
analysis. In Table 3.2, we give the name of each instance, the number of customers,
the capacity of each vehicle, and the new name of each instance that we will use. For
example, in E-n22-k4, there are 22 nodes with 21 customers and 1 depot (denoted
by n22) and 4 vehicles (denoted by k4). The capacity of each vehicle is 6000. The
four instances starting with E-n76 have the same locations for the customers and
the depot and the same customer demands. They differ with respect to the number
of vehicles and the capacity of each vehicle. This also holds for the two instances
starting with E-n101. As the vehicle capacity decreases for these six instances, we
expect to use more vehicles to serve the customers.
The optimal or best-known solution to each of the 11 instances given in the
Capacitated Vehicle Routing Problem Library (2014) uses exactly the number of
vehicles given in the instance name shown in the left-most column in Table 3.2. In
our work, we will let our algorithm determine how many vehicles to use. We will use
the new names (that reflect only vehicle capacity) shown in the right-most column
of Table 3.2. For example, E-n22-k4 becomes E-n22-c6000.
3.3 Importance of Standard Deviation of Routes
We want to understand and compare the effects of the standard deviation of
the route lengths on the total route time, the total route time under random traffic
conditions, and the total operating and delivery costs for a company to serve its
customers.
We use a weighted savings modification of the Clarke and Wright (C&W)
algorithm (Clarke and Wright 1964) to generate routes for an instance. In the
initialization step of the standard C&W algorithm, each customer is served by its
own vehicle from the depot. If i and j are two customers that are served separately
from the depot (denoted by 0), then the savings obtained from merging the two
routes is $s_{ij} = c_{i0} + c_{0j} - c_{ij}$, where $c_{i0}$ and $c_{0j}$ are the costs of serving i from the
depot and j from the depot, respectively, and cij is the cost of going from i to j.
At each iteration of the standard C&W algorithm, routes are merged based on the
largest savings. At each iteration of our weighted savings modification algorithm,
we determine the three largest savings $s_1$, $s_2$, and $s_3$. Then we randomly choose one of
the three corresponding merges, selecting merge $k$ with probability
$p_k = s_k/(s_1 + s_2 + s_3)$, where $k \in \{1, 2, 3\}$. For
each instance, we generated 1000 solutions using our weighted savings modification
algorithm with the objective of minimizing the total route length. In Table 3.3,
we show the breakdown of the 1000 solutions for each instance in terms of the
number of vehicles required to serve the customers. To illustrate, in 1000 solutions
generated by the weighted savings modification algorithm for the instance E-n22-
c6000, 824 solutions required four vehicles and 176 solutions required five vehicles.
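The randomized selection step of the weighted savings modification can be sketched as follows. This is an illustrative Python sketch under our own assumptions, not the implementation used in this chapter: the function names and the small cost matrix are hypothetical, capacity feasibility checks and the route merging itself are omitted, and candidates with non-positive savings are discarded (an assumption the text does not state).

```python
import random

def savings(cost, i, j, depot=0):
    """Clarke and Wright savings s_ij = c_i0 + c_0j - c_ij."""
    return cost[i][depot] + cost[depot][j] - cost[i][j]

def top_three(candidates):
    """Return up to three (saving, i, j) tuples with the largest savings.

    Assumption: only merges with positive savings are considered.
    """
    positive = [c for c in candidates if c[0] > 0]
    return sorted(positive, key=lambda c: c[0], reverse=True)[:3]

def merge_probabilities(top):
    """p_k = s_k / (s_1 + s_2 + s_3) for the retained candidates."""
    total = sum(s for s, _, _ in top)
    return [s / total for s, _, _ in top]

def pick_merge(candidates, rng):
    """Randomly pick one candidate merge, weighted by its savings."""
    top = top_three(candidates)
    if not top:
        return None  # no merge with positive savings remains
    return rng.choices(top, weights=merge_probabilities(top), k=1)[0]

# Hypothetical 4-node cost matrix (node 0 is the depot); values are illustrative only.
cost = [
    [0, 4, 5, 6],
    [4, 0, 3, 7],
    [5, 3, 0, 2],
    [6, 7, 2, 0],
]
pairs = [(1, 2), (1, 3), (2, 3)]
candidates = [(savings(cost, i, j), i, j) for i, j in pairs]
rng = random.Random(0)  # fixed seed so that a run can be repeated
print(merge_probabilities(top_three(candidates)))
print(pick_merge(candidates, rng))
```

In this toy example the three savings are 9, 6, and 3, so the merges are chosen with probabilities 1/2, 1/3, and 1/6. Repeating the full algorithm with different random draws is what yields the 1000 distinct solutions per instance.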
The four instances starting with E-n76 and the two instances starting with E-n101
require increasingly more vehicles to serve the customers as the capacity of the
vehicles decreases.
Number of vehicles
Instance 3 4 5 6 7 8 9 10 11 14 15 16
E-n22-c6000 824 176
E-n23-c4500 1000
E-n30-c4500 1000
E-n33-c8000 997 3
E-n51-c160 500 500
E-n76-c220 1000
E-n76-c180 967 33
E-n76-c140 135 865
E-n76-c100 15 979 6
E-n101-c200 1000
E-n101-c112 999 1
Table 3.3: Breakdown of 1000 CVRP solutions for each of the 11 instances in terms
of the number of vehicles required to serve the customers.
Parameter                               Value
Fixed cost for using a vehicle          $50 per day
Mileage                                 10 miles per gallon
Gas price                               $2.5 per gallon
Speed of vehicle with no traffic        25 miles per hour
Regular work hours for a driver         6 hours per day
Total wage for regular work hours       $18 per hour for all 6 hours (18 x 6 = $108)
Per hour wage for overtime work hours   $27 per hour for each overtime hour
Service time for each customer          10 minutes
Table 3.4: Parameter values that are used to calculate the total operating and
delivery costs for a company to serve its customers (Levy 2018).
We build three linear regression models that can be represented by
$$y_{ij} = \bar{\beta}_1 + \bar{\beta}_2 \times \text{No of Routes}_{ij} + \bar{\beta}_3 \times \text{No of Customers}_{i} + \bar{\beta}_4 \times \text{Vehicle Capacity}_{i} + \bar{\beta}_5 \times \text{Standard Deviation}_{ij},$$
where $y_{ij} = E(Y_{ij})$, $\bar{\beta}_k = E(\beta_k)$, $k \in \{1, \dots, 5\}$, $i \in \{1, \dots, 11\}$
denotes the instance number, $j \in \{1, \dots, 1000\}$ denotes the solution number of an
instance, and $E(\cdot)$ denotes the expected value. No of Routes$_{ij}$ is specific for each
solution of an instance as shown in Table 3.3. No of Customers$_{i}$ and Vehicle Capacity$_{i}$
are the same for any solution given an instance as shown in Table 3.2. Standard
Deviation$_{ij}$ is the route length standard deviation calculated for each solution, and
so it is specific for each solution of an instance. The dependent variable $Y_{ij}$ is total
route time for Model 1, total route time under random traffic conditions for Model
2, and total cost under random traffic conditions for Model 3. All three dependent
variables are specific for each solution of an instance. In Table 3.4, we give the val-
ues of the parameters (Levy 2018) that are used to calculate the dependent variable
values. For each solution of an instance, we have the route length for each vehicle,
and we divide that by the speed of the vehicle with no traffic to get the route time
for each vehicle. We sum the route times for all vehicles for each solution to get the
total route times for Model 1. We randomly increase the route time for each vehicle
by $t\%$, where $t \sim \text{Uniform}(0, 10)$, to capture random traffic conditions during
actual service by the vehicles. This assumes independence of traffic patterns across
streets. There has been some work in the literature that looks at correlated traffic
patterns between streets (Laporte et al. 1992, Kenyon and Morton 2003, Rostami
et al. 2017) while generating the routes. However, the correlation of traffic patterns
will not have an effect on our understanding of the impact of route length standard
deviation on the total cost. We sum the increased route times for all vehicles for each
solution to get the total route times under random traffic conditions for Model 2.
The total cost under random traffic conditions for Model 3 has three components:
driving cost, fixed cost, and total wages. Driving cost and fixed cost are invariant under
random traffic conditions. For each solution of an instance, total route length $\times$ (gas
price/mileage) gives us the driving cost; number of vehicles $\times$ fixed cost for using
a vehicle gives us the fixed cost. We add the increased route time under random
traffic conditions and the customer service time (number of customers served on the
route $\times$ service time for each customer) for each vehicle to get the total working
hours under random traffic conditions for each driver. If the total working hours are
less than the regular work hours, then the driver is paid the total wage for regular
work hours. If the total working hours are greater than the regular work hours, an
additional per hour overtime wage is paid to the driver for each overtime hour. We
sum the wages for all drivers for each solution to get the total wages. The dependent
variables of Model 1 and Model 2 are in hours; the dependent variable of Model 3
is in dollars.
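The Model 3 cost calculation can be illustrated with a short sketch. The Python code below is a minimal example under stated assumptions, not the code used to produce the regression data. The constant and function names are ours (hypothetical), the parameter values come from Table 3.4, and we assume that overtime is paid per started hour (rounded up); the text does not state the rounding rule explicitly, so the sketch exposes it as a flag.

```python
import math

# Parameter values from Table 3.4.
FIXED_COST = 50.0                 # dollars per vehicle per day
MILES_PER_GALLON = 10.0
GAS_PRICE = 2.5                   # dollars per gallon
SPEED = 25.0                      # miles per hour with no traffic
REGULAR_HOURS = 6.0
REGULAR_WAGE = 18.0 * REGULAR_HOURS   # $108, paid even for shorter days
OVERTIME_RATE = 27.0              # dollars per overtime hour
SERVICE_HOURS = 10.0 / 60.0       # service time per customer

def driver_wage(working_hours, whole_hours=True):
    """Regular wage plus overtime; whole_hours=True rounds overtime up (assumption)."""
    overtime = max(0.0, working_hours - REGULAR_HOURS)
    if whole_hours:
        overtime = math.ceil(overtime)
    return REGULAR_WAGE + OVERTIME_RATE * overtime

def total_cost(route_miles, customers_per_route, traffic_pct):
    """Driving cost + fixed cost + total wages for one solution (set of routes)."""
    driving = sum(route_miles) * GAS_PRICE / MILES_PER_GALLON
    fixed = len(route_miles) * FIXED_COST
    wages = 0.0
    for miles, n_customers, t in zip(route_miles, customers_per_route, traffic_pct):
        route_hours = (miles / SPEED) * (1.0 + t / 100.0)  # traffic-inflated time
        wages += driver_wage(route_hours + n_customers * SERVICE_HOURS)
    return driving + fixed + wages

# Hypothetical two-route solution: 100 and 50 miles, 12 and 6 customers,
# with traffic increases of 10% and 0%.
print(total_cost([100.0, 50.0], [12, 6], [10.0, 0.0]))
```

In this made-up example, the first driver works about 6.4 hours and earns one overtime hour under the rounding assumption, while the second driver works 3 hours and still receives the full regular wage of $108. This asymmetry is what makes imbalanced routes costly.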
Coefficient                       Model 1        Model 2        Model 3
Intercept ($\beta_1$)             -5.266***      -5.530***      -257.014***
                                  (0.167)        (0.176)        (5.568)
No of Routes ($\beta_2$)          1.688***       1.774***       137.939***
                                  (0.011)        (0.012)        (0.382)
No of Customers ($\beta_3$)       0.272***       0.286***       9.111***
                                  (0.002)        (0.002)        (0.068)
Vehicle Capacity ($\beta_4$)      0.002***       0.002***       0.062***
                                  (0.00002)      (0.00002)      (0.0006)
Standard Deviation ($\beta_5$)    0.076***       0.080***       3.285***
                                  (0.001)        (0.001)        (0.034)
Adjusted $R^2$                    0.90           0.90           0.97
*** p < 0.001
Table 3.5: Linear regression results for three models.
In Table 3.5, we present the linear regression results for all three models. The
size of the data set is 11,000 ($11 \times 1000$) for each model. We give the means of the
coefficients of the independent variables and their standard errors in parentheses. All
five coefficients for all three models are significant at the 0.1% level. The adjusted
R2 values indicate very good model fits. The coefficient of No of Routes is very
large for Model 3 compared to Models 1 and 2. An additional route leads to an
increased fixed cost (for using an additional vehicle) and increased total wages (at
least for the regular work hours of an additional driver) in the calculation of the
total operating and delivery costs. However, it reduces the total overtime work
hours, thereby, reducing the cost for the total overtime wages. Comparing the
coefficient of Standard Deviation across the three models, we see that the effect of
the route length standard deviation increases for the total route time under random
traffic conditions compared to the total route time with no traffic. It is largest for
the total operating and delivery costs under random traffic conditions. Since the
driving cost and fixed cost are invariant under random traffic conditions, the route
length standard deviation affects the total wages because drivers having to work
more than the regular working hours will have a large impact on the cost. When
routing algorithms are used to generate routes, it is important to consider the traffic
conditions during the time of actual service and, thereby, also consider the effect on
the total operating and delivery costs.
The regression results show that route balance is not just of secondary im-
portance due to perhaps union regulations. Route balance directly affects total
operating and delivery costs. Under random traffic conditions, route times increase
from the solution that was originally produced by the routing algorithm. When
starting with routes that are already imbalanced, longer routes will take even more
time to complete in the presence of heavy traffic. Imbalanced routes directly and
significantly affect the total wages component in the total cost calculation because
longer routes lead to a larger chance of a driver working more than the regular
work hours. Thus, when routes are imbalanced, a delivery company may pay more
overtime wages to its drivers. Our regression analysis shows that when we consider
a real-world CVRP (in terms of complexity and random traffic conditions), as op-
posed to benchmark problems found in the literature (e.g., a simple extension of the
TSP), route balance is an important determinant of low-cost solutions.
We can interpret the regression analysis in a practical setting as follows. Sup-
pose we seek to solve a real-world CVRP (using data like those presented in Table
3.4) and obtain a high-quality solution in terms of total cost. We can apply a sim-
ple algorithm, such as the weighted savings modification of the C&W algorithm,
repeatedly to obtain many solutions. We can then select a solution with a small to-
tal route length and a small route length standard deviation to achieve low operating
and delivery costs.
3.4 Effect of Reducing Standard Deviation on Cost
After we select a high-quality solution among the many solutions generated by
the C&W savings algorithm, we might still be able to reduce the total operating and
delivery costs under random traffic conditions by decreasing the standard deviation
of the routes. We implement fast and easy modifications to the solutions of the
C&W savings algorithm in order to achieve this reduction.
We consider random traffic versions of the instances given in Table 3.2. For
each instance, we divide the distance between each pair of customers (including the
depot) by the speed of the vehicle with no traffic. This gives us the travel time with
no traffic between each pair of customers. We generate 1000 different pseudo versions
of each instance by randomly increasing the travel time between customer pairs by
$t\%$, where $t \sim \text{Uniform}(0, 10)$. This captures random traffic conditions during actual
service by the vehicles. For each pseudo version (random traffic induced instance),
we use the C&W savings algorithm to generate the solution (set of routes) with the
objective of minimizing the total increased route times including service times. For
each instance, we have 1000 C&W solutions under random traffic conditions (Group
A). Group A is our benchmark because it is obtained by minimizing the total route
times including service times which is typical of any route generating algorithm.
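The construction of the random traffic induced instances can be sketched in a few lines. The Python code below is only an illustration under our own assumptions: the function names are hypothetical, travel times are taken to be symmetric, and one independent Uniform(0, 10) draw is made for each unordered pair of locations. The toy distance matrix is not one of the benchmark instances.

```python
import random

def travel_times(dist, speed=25.0):
    """Travel time (hours) with no traffic: distance divided by vehicle speed."""
    return [[d / speed for d in row] for row in dist]

def perturb(times, rng, max_pct=10.0):
    """Increase each pairwise travel time by t%, with t ~ Uniform(0, max_pct)."""
    n = len(times)
    out = [row[:] for row in times]
    for i in range(n):
        for j in range(i + 1, n):
            t = rng.uniform(0.0, max_pct)
            out[i][j] = out[j][i] = times[i][j] * (1.0 + t / 100.0)
    return out

def pseudo_versions(dist, count=1000, seed=0):
    """Generate `count` random traffic versions of one instance."""
    rng = random.Random(seed)
    base = travel_times(dist)
    return [perturb(base, rng) for _ in range(count)]

# Toy 3-location distance matrix in miles (location 0 is the depot).
dist = [
    [0.0, 25.0, 50.0],
    [25.0, 0.0, 30.0],
    [50.0, 30.0, 0.0],
]
versions = pseudo_versions(dist, count=5)
print(versions[0][0][1])  # between 1.0 and 1.1 hours
```

Each perturbed matrix would then be passed to the C&W savings algorithm to obtain one Group A solution.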
We apply three different modifications of the record-to-record (RTR) travel
algorithm (Golden et al. 1998) to each solution. Route time now includes the time
to service customers. A record is the total route time of the C&W solution. A
deviation is $r\%$ of the record, where $r \in \{0, \dots, 10\}$. The RTR travel algorithm
will accept a new solution (set of routes) if the total route time is less than the sum
of record and deviation. The objective of the RTR travel algorithm is to minimize
the standard deviation of the route time. In the first modification (Scenario X) of
the RTR travel algorithm, we sort the routes of a solution from the longest to the
shortest in terms of the route time. We try to move a customer from a longer route
to a shorter route so that the new total route time is less than the sum of record and
deviation, and the route time standard deviation decreases. The sequence of trials
for moving a customer from a long route starts with the longest route followed by
the second longest route and so on. The sequence of trials for moving a customer
to a short route starts with the shortest route followed by the second shortest route
and so on. After every customer move, we update the sequence of routes. In the
second modification (Scenario Y) of the RTR travel algorithm, we use a total route
time constraint in addition to using the sorted sequence as described in Scenario
X. The total route time is segmented in the following way: the segment value for
anything less than six hours is considered as six hours, for anything between six and
seven hours is considered as seven hours, and so on. This segmentation is in line
with the wage structure for the drivers described in Table 3.4. A customer move is
allowed under Scenario Y using the sorted sequence of routes if the new total route
time is less than the sum of record and deviation and less than the segment value for
the current total route time, and the route time standard deviation decreases. After
every customer move, we update the sequence of routes and the total route time
segment value. In the third modification (Scenario Z) of the RTR travel algorithm,
we use a route time constraint for each route in addition to using the sorted sequence
as described in Scenario X. The route times are segmented in the same way as in
Scenario Y. A customer move is allowed under Scenario Z using the sorted sequence
of routes if the new total route time is less than the sum of record and deviation, the
new route times for each route are less than the respective segment values for the
current route times, and the route time standard deviation decreases. After every
customer move, we update the sequence of routes and the route time segment values
for each route.
Figure 3.1: Example to illustrate an iteration of the RTR travel algorithm explaining
the three scenarios.
In Figure 3.1, we show an example to illustrate an iteration of the RTR travel
algorithm explaining the three scenarios. Suppose the C&W savings algorithm gen-
erates three routes a0, b, and c0 for a CVRP instance with the route times as 4.5
hours, 6 hours, and 7.5 hours, respectively. The total route time is 18 hours, and
the route time standard deviation is 1.5 hours. Let us look at an iteration of the
RTR travel algorithm with the deviation parameter r as 3%. Therefore, the sum of
record and deviation for the routes a0, b, and c0 is 18.54 ($18 + 0.03 \times 18$) hours. The
segment value for the total route time is 18 hours. The segment values for routes
a0, b, and c0 are 6 hours, 6 hours, and 8 hours, respectively. Suppose a customer
move from c0 to a0 changes the two routes to a1 and c1 with the route times as
6.7 hours and 5.5 hours, respectively. The new total route time is 18.2 hours, and
the new route time standard deviation is 0.60 hours. This is a valid customer move
under Scenario X because the new total route time is less than the sum of record
and deviation, and the route time standard deviation decreases. However, this is
not a valid customer move for Scenarios Y and Z. The new total route time is more
than the total route time segment value violating Scenario Y, and the route time for
a1 is more than the segment value for a0 violating Scenario Z. Suppose a customer
move from c0 to a0 changes the two routes to a2 and c2 with the route times as
6.2 hours and 5.6 hours, respectively. The new total route time is 17.8 hours, and
the new route time standard deviation is 0.31 hours. This is a valid customer move
under Scenarios X and Y because the new total route time is less than the sum of
record and deviation and less than the total route time segment value, and the route
time standard deviation decreases. However, this is not a valid customer move for
Scenario Z. The route time for a2 is more than the segment value for a0 violating
Scenario Z. Suppose a customer move from c0 to a0 changes the two routes to a3 and
c3 with the route times as 6 hours and 6.5 hours, respectively. The new total route
time is 18.5 hours, and the new route time standard deviation is 0.29 hours. This is
a valid customer move under Scenarios X and Z because the new total route time
is less than the sum of record and deviation, the route times for a3 and c3 are less
than the segment values for a0 and c0, respectively, and the route time standard
deviation decreases. However, this is not a valid customer move for Scenario Y.
The new total route time is more than the total route time segment value violating
Scenario Y.
Figure 3.2: A flowchart showing the relation between the four groups and three
scenarios.
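The three acceptance tests in the worked example can be checked mechanically. The following Python sketch is our own illustration, with hypothetical function names, and it covers only the acceptance test for a single candidate move, not the full RTR travel algorithm. Two assumptions are made to match the example: the standard deviation is the sample standard deviation, and a route time equal to its segment value is accepted (route a3 at exactly 6 hours is treated as valid under Scenario Z).

```python
import math
import statistics

def segment(hours, regular=6.0):
    """Segment value: 6 for anything up to 6 hours, otherwise the next whole hour."""
    return max(regular, math.ceil(hours))

def valid_moves(current, proposed, record, r_pct):
    """Return which scenarios (X, Y, Z) accept the proposed route times.

    current, proposed: route times in hours, listed in the same route order.
    record: total route time of the initial C&W solution.
    r_pct: deviation parameter r (percent of the record).
    """
    limit = record * (1.0 + r_pct / 100.0)     # record + deviation
    new_total = sum(proposed)
    improves = statistics.stdev(proposed) < statistics.stdev(current)
    ok_x = new_total < limit and improves
    ok_y = ok_x and new_total <= segment(sum(current))
    ok_z = ok_x and all(p <= segment(c) for p, c in zip(proposed, current))
    return {"X": ok_x, "Y": ok_y, "Z": ok_z}

current = [4.5, 6.0, 7.5]            # routes a0, b, c0
moves = {
    "a1/c1": [6.7, 6.0, 5.5],
    "a2/c2": [6.2, 6.0, 5.6],
    "a3/c3": [6.0, 6.0, 6.5],
}
for name, proposed in moves.items():
    print(name, valid_moves(current, proposed, record=18.0, r_pct=3))
```

Under these assumptions, the first move is accepted only by Scenario X, the second by Scenarios X and Y, and the third by Scenarios X and Z, which agrees with the discussion of the example.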
We apply all three modified versions of the RTR travel algorithm to each of the
1000 C&W solutions of every instance under random traffic conditions (Group A).
For each modified version of the RTR travel algorithm, each C&W solution produces
11 different solution sequences, one for each value (0 through 10) of the deviation
parameter r. The RTR travel algorithm stops when it is not possible to find a
solution that decreases the route time standard deviation and satisfies all criteria of
the particular modification (scenario) of the RTR travel algorithm. Out of the 11
sequences starting from any C&W solution under any particular scenario, we retain
three solutions. First, we retain the best intermediate solution (any solution other
than the initial C&W solution and the final solution) in terms of the shortest total
route time. For each instance and each scenario, we have 1000 best intermediate
total route time solutions (Group B). Second, we retain the best final solution in
terms of the shortest total route time. For each instance and each scenario, we have
1000 best final total route time solutions (Group C). Third, we retain the best final
solution in terms of the smallest route time standard deviation. For each instance
and each scenario, we have 1000 best final route time standard deviation solutions
(Group D). We do not retain the best intermediate solution in terms of the smallest
route time standard deviation because the standard deviation will always be the
smallest in the final solution of each sequence. The reason is that the objective
of the RTR travel algorithm is to minimize the route time standard deviation. In
Figure 3.2, we show the relation between the four groups and three scenarios using a
flowchart. The C&W solutions on the random traffic induced instances form Group
A. Each of the three Scenarios X, Y, and Z of the RTR travel algorithm is applied
to each solution from Group A to obtain three solutions, one each in Groups B,
C, and D. For each solution, we then calculate the total operating and delivery
cost under random traffic conditions using the parameters given in Table 3.4. To
calculate the total cost for each solution, we need the route times of each driver
Instance Group A Group B Group C Group D
E-n22-c6000 740.98 737.46 737.19 747.78
E-n23-c4500 1051.56 1000.87 1004.88 1043.74
E-n30-c4500 1007.10 977.38 970.48 983.18
E-n33-c8000 1395.63 1388.76 1392.75 1521.67
E-n51-c160 1194.81 1170.38 1166.95 1175.44
E-n76-c220 1545.11 1538.11 1531.57 1554.61
E-n76-c180 1630.70 1626.05 1623.57 1678.71
E-n76-c140 2021.21 2005.65 1997.69 1975.17
E-n76-c100 2653.86 2649.10 2646.53 2671.40
E-n101-c200 1863.45 1850.06 1836.63 1935.02
E-n101-c112 2594.27 2590.75 2587.77 2553.66
Bold indicates best solution
Table 3.6: Average total cost under random traffic conditions for Scenario X.
and the total route length. We already have the route times for each driver from
the C&W savings algorithm for Group A, and from modifications of the RTR travel
algorithm for Groups B, C, and D. The total route length is computed from the
original instances without the traffic modifications based on the solution. Finally,
for each instance, each scenario, and each group, we find the average of the total
cost under random traffic conditions of the 1000 solutions. The total cost value is
given in dollars.
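The rule for retaining three solutions per C&W starting solution (Groups B, C, and D) can be sketched as follows. This is only an illustrative sketch: the `Solution` record and the function `retain_three` are hypothetical names, and we assume each of the 11 sequences is stored as a list whose first element is the initial C&W solution and whose last element is the final solution.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Solution:
    total_time: float  # total route time of the solution
    time_std: float    # standard deviation of the route times


def retain_three(
    sequences: List[List[Solution]],
) -> Tuple[Optional[Solution], Solution, Solution]:
    """Pick the Group B, C, and D solutions from the sequences of one C&W solution.

    sequences[r] is the sequence obtained with deviation parameter r.
    Assumption: seq[0] is the initial C&W solution, seq[-1] the final one.
    """
    # Intermediate solutions exclude the initial and the final solution.
    intermediates = [s for seq in sequences for s in seq[1:-1]]
    finals = [seq[-1] for seq in sequences]

    # Group B: best intermediate solution by total route time (may not exist).
    group_b = min(intermediates, key=lambda s: s.total_time) if intermediates else None
    # Group C: best final solution by total route time.
    group_c = min(finals, key=lambda s: s.total_time)
    # Group D: best final solution by route time standard deviation.
    group_d = min(finals, key=lambda s: s.time_std)
    return group_b, group_c, group_d
```

Under this representation, no intermediate standard deviation candidate is needed because the last element of each sequence already has the smallest standard deviation in that sequence.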
In Tables 3.6, 3.7, and 3.8, we present the average total cost under random
traffic conditions of the 1000 solutions for each instance and each group for Scenarios
X, Y, and Z, respectively. A scenario represents the modified version of the RTR
travel algorithm applied to the solution of the C&W algorithm to reduce the route
time standard deviation. Group A corresponds to the C&W solutions; the average
total cost values for Group A are the same in all three scenarios. For all three
scenarios, the attempt to reduce the route time standard deviation pays off because
Instance Group A Group B Group C Group D
E-n22-c6000 740.98 737.46 737.22 746.78
E-n23-c4500 1051.56 997.57 989.91 992.19
E-n30-c4500 1007.10 977.88 967.33 972.92
E-n33-c8000 1395.63 1386.97 1384.72 1402.00
E-n51-c160 1194.81 1171.39 1167.62 1133.95
E-n76-c220 1545.11 1535.47 1523.73 1520.45
E-n76-c180 1630.70 1626.09 1623.24 1629.33
E-n76-c140 2021.21 2007.44 1998.10 1973.95
E-n76-c100 2653.86 2649.38 2647.04 2671.25
E-n101-c200 1863.45 1848.39 1828.70 1859.06
E-n101-c112 2594.27 2590.82 2587.91 2553.27
Bold indicates best solution
Table 3.7: Average total cost under random traffic conditions for Scenario Y.
Instance Group A Group B Group C Group D
E-n22-c6000 740.98 737.45 737.19 747.09
E-n23-c4500 1051.56 993.15 988.66 1015.55
E-n30-c4500 1007.10 976.30 963.77 972.64
E-n33-c8000 1395.63 1386.97 1388.63 1471.47
E-n51-c160 1194.81 1167.26 1164.33 1161.32
E-n76-c220 1545.11 1535.97 1525.45 1543.83
E-n76-c180 1630.70 1625.86 1623.17 1672.28
E-n76-c140 2021.21 2005.45 1997.26 1974.16
E-n76-c100 2653.86 2649.10 2646.53 2671.29
E-n101-c200 1863.45 1847.69 1829.93 1905.80
E-n101-c112 2594.27 2590.70 2587.58 2554.12
Bold indicates best solution
Table 3.8: Average total cost under random traffic conditions for Scenario Z.
none of the average total cost values from Group A is the best for any instance.
After we have obtained an initial solution by reducing the total route time using a
fast and easy to implement algorithm such as the C&W savings algorithm, there
is still room to improve a solution in terms of the total cost by reducing the route
time standard deviation.
In Table 3.9, we present the best average total cost under random traffic
Instance Group A Best Scenario Percent Savings
E-n22-c6000 740.98 737.19 X and Z 0.51
E-n23-c4500 1051.56 988.66 Z 5.98
E-n30-c4500 1007.10 963.77 Z 4.30
E-n33-c8000 1395.63 1384.72 Y 0.78
E-n51-c160 1194.81 1133.95 Y 5.09
E-n76-c220 1545.11 1520.45 Y 1.60
E-n76-c180 1630.70 1623.17 Z 0.46
E-n76-c140 2021.21 1973.95 Y 2.34
E-n76-c100 2653.86 2646.53 X and Z 0.28
E-n101-c200 1863.45 1828.70 Y 1.86
E-n101-c112 2594.27 2553.27 Y 1.58
Table 3.9: Best average total cost under random traffic conditions across all three
scenarios and percent savings compared to Group A.
conditions for each instance by comparing the best values across all three scenarios.
We also present the percent savings of the best average total cost compared to the
average total cost of Group A, which is our benchmark. In six of eleven instances,
Scenario Y provided the best solution. In the remaining five instances, Scenario
Z provided the best solution. Scenario X provided the best solution in two of the
five instances. For the two instances where Scenarios X and Z gave the same best
solution, the additional constraints on the route times of each vehicle in the modified
RTR travel algorithm did not have any effect. The percent savings is calculated as
100 × (C&W Cost - Best Cost)/C&W Cost. The maximum percent savings is 5.98%
for instance E-n23-c4500, and the minimum percent savings is 0.28% for instance
E-n76-c100. The average percent savings for the 11 instances is 2.25%.
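As a quick check of the percent savings formula, the short sketch below recomputes the last column of Table 3.9 from the Group A and best average total costs reported in that table. The function and variable names are illustrative only.

```python
def percent_savings(cw_cost: float, best_cost: float) -> float:
    """100 x (C&W Cost - Best Cost) / C&W Cost."""
    return 100.0 * (cw_cost - best_cost) / cw_cost


# (Group A cost, best cost) pairs copied from Table 3.9.
table_3_9 = {
    "E-n22-c6000": (740.98, 737.19),
    "E-n23-c4500": (1051.56, 988.66),
    "E-n30-c4500": (1007.10, 963.77),
    "E-n33-c8000": (1395.63, 1384.72),
    "E-n51-c160": (1194.81, 1133.95),
    "E-n76-c220": (1545.11, 1520.45),
    "E-n76-c180": (1630.70, 1623.17),
    "E-n76-c140": (2021.21, 1973.95),
    "E-n76-c100": (2653.86, 2646.53),
    "E-n101-c200": (1863.45, 1828.70),
    "E-n101-c112": (2594.27, 2553.27),
}

# Percent savings per instance, rounded to two decimals as in the table.
savings = {
    name: round(percent_savings(cw, best), 2)
    for name, (cw, best) in table_3_9.items()
}
average_savings = sum(savings.values()) / len(savings)
```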
3.5 Contribution of Standard Deviation to Cost
We have shown that total route length and route length standard deviation
are both important in reducing the total operating and delivery costs under random
traffic conditions. We now want to understand the separate contributions of these
two factors. We want to determine whether a higher fraction of solutions with the
best total cost are the ones with lower total route length or lower route length
standard deviation.
We use the weighted savings modification of the C&W algorithm with the
top three savings as described in Section 3.3 to generate 1000 solutions for each
instance with the objective of minimizing the total length. The breakdown of the
1000 solutions for each instance in terms of the number of vehicles required to serve
the customers is given in Table 3.3. We choose the 100 best solutions with the lowest
total route length (Bucket RL) and 100 best solutions with the lowest route length
standard deviation (Bucket SD). In total, we have chosen at most 200 solutions out of
1000 because some solutions will appear in both Bucket RL and Bucket SD. For each
of the 200 solutions, we divide the route length for each vehicle by the speed of the
vehicle with no traffic to get the route time for each vehicle. We randomly increase
the route time for each vehicle by t%, where t ∼ Uniform(0, 10), to capture random
traffic conditions during actual service by a vehicle. We repeat the randomization
1000 times for each of the 200 solutions. For each of the 1000 randomized versions,
we calculate the total operating and delivery costs under random traffic conditions
using the parameters given in Table 3.4. Driving cost and fixed cost are invariant
Instance Bucket RL Bucket SD Both Buckets
E-n22-c6000 20 12 12
E-n23-c4500 18 20 18
E-n30-c4500 19 18 17
E-n33-c8000 20 2 2
E-n51-c160 19 20 19
E-n76-c220 13 18 11
E-n76-c180 18 14 12
E-n76-c140 10 20 10
E-n76-c100 10 13 3
E-n101-c200 11 12 3
E-n101-c112 3 18 1
Table 3.10: Number of solutions from the respective buckets for the 20 best total
cost solutions under random traffic conditions.
for the randomized versions of each solution. The total wages will be specific to
each randomized version. For each of the 200 solutions, we find the average total
cost among the 1000 randomized versions. We then choose the 20 best solutions
with the lowest total cost under random traffic conditions among the 200 solutions
and determine whether these are from Bucket RL or Bucket SD.
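The bucket selection and the traffic randomization described above can be sketched as follows. This is a minimal sketch under stated assumptions: we assume t is drawn independently for each vehicle in each randomized version, and `toy_wage_cost` is a hypothetical stand-in for the wage component, since the actual cost parameters are those of Table 3.4 and are not reproduced here. All function names are illustrative.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Set, Tuple


def select_buckets(
    total_lengths: Sequence[float], length_stds: Sequence[float], k: int = 100
) -> Tuple[Set[int], Set[int]]:
    """Indices of the k solutions with the lowest total route length (Bucket RL)
    and of the k solutions with the lowest route length std. dev. (Bucket SD)."""
    by_length = sorted(range(len(total_lengths)), key=lambda i: total_lengths[i])
    by_std = sorted(range(len(length_stds)), key=lambda i: length_stds[i])
    return set(by_length[:k]), set(by_std[:k])


def inflate_route_times(
    route_times: Sequence[float], rng: random.Random, max_increase_pct: float = 10.0
) -> List[float]:
    """Increase each vehicle's route time by t%, t ~ Uniform(0, max_increase_pct).
    Assumption: t is drawn independently for every vehicle."""
    return [t * (1.0 + rng.uniform(0.0, max_increase_pct) / 100.0) for t in route_times]


def average_total_cost(
    route_times: Sequence[float],
    cost_fn: Callable[[Sequence[float]], float],
    n_reps: int = 1000,
    seed: int = 0,
) -> float:
    """Average cost over n_reps randomized versions of one solution."""
    rng = random.Random(seed)
    return mean(cost_fn(inflate_route_times(route_times, rng)) for _ in range(n_reps))


def toy_wage_cost(
    route_times: Sequence[float],
    regular_hours: float = 8.0,
    wage: float = 20.0,
    overtime_wage: float = 30.0,
) -> float:
    """HYPOTHETICAL wage rule with overtime; not the parameters of Table 3.4."""
    return sum(
        wage * min(t, regular_hours) + overtime_wage * max(0.0, t - regular_hours)
        for t in route_times
    )
```

For example, `average_total_cost([7.6, 7.9, 6.5], toy_wage_cost)` would return the average wage cost of one solution over 1000 randomized versions. The union of the two index sets returned by `select_buckets` contains at most 2k solutions, which corresponds to the "at most 200" noted in the text.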
In Table 3.10, we give the number of solutions for each instance that are
from Bucket RL, Bucket SD, or both buckets among the 20 best solutions with
the lowest total cost under random traffic conditions. In four of eleven instances,
there are more solutions from Bucket RL than from Bucket SD. In the remaining
seven instances, there are more solutions from Bucket SD than from Bucket RL. For
these seven instances, a greater number of the best solutions in terms of the total cost
under random traffic conditions are also the best in terms of route length standard
deviation. For nine of eleven instances, the number of solutions in both buckets is
close to the maximum possible, which is the minimum number of solutions among
the two buckets. However, for instances E-n76-c100 and E-n101-c200, the number of
solutions in both buckets is only three compared to the maximum possible number
of 10 and 11, respectively. Important observations can be made from Table 3.10 and
the last column of Table 3.9. The percent savings in total cost from the RTR travel
algorithm tend to be larger for those instances where we have a greater number of
solutions from Bucket SD and the number of solutions in both buckets is close to the
maximum possible. This observation makes sense because the RTR travel algorithm
is able to substantially reduce the total cost under random traffic conditions for those
instances where the total cost is driven by the standard deviation of the routes with
smaller total route lengths.
3.6 Conclusions and Future Directions
It is not always possible to accurately judge the quality of a route with respect
to total cost by only assessing the total route length or the total route time. We used
standard CVRP instances from the literature to show that, under random traffic
conditions, the standard deviation of routes has an impact on total route time and
on total operating and delivery costs. For delivery companies, it is important to
understand the objective function that third-party routing software programs are
using. We implemented fast and easy modifications to the solutions of the C&W
savings algorithm. We used modified versions of the RTR travel algorithm to reduce
the standard deviation of routes. We showed that the improved solutions in terms
of standard deviation have also improved in terms of total cost. Finally, we showed
that the RTR travel algorithm could make a substantial impact if the total cost of
a solution is driven by the standard deviation.
There are big gaps in the literature in assessing the impact of the standard
deviation of routes on overall route quality. There could be improvements in heuris-
tics that further improve the total cost savings by adjusting the standard deviation
of the routes in a smarter way. We need to understand the properties of those
instances that have the potential of producing solutions with lower total cost and
smaller standard deviation. It might be useful for the delivery companies if we
could quantify such instances using some easy to understand metrics. These met-
rics would potentially indicate when it would be worthwhile to try to reduce the
standard deviation in order to produce substantial cost reductions.
Chapter 4: Data-Driven Estimation of the Route Length for the Close-
Enough Traveling Salesman Problem
4.1 Introduction
The Close-Enough Traveling Salesman Problem (CETSP) is a variant of the
Traveling Salesman Problem (TSP). Both are defined on a Euclidean plane. The
TSP requires a salesman to visit customers at their exact locations starting and
ending at the depot. In the CETSP, every customer has a service region and is
considered visited when the salesman visits any point in the customer's service
region. A service region is assumed to be a circular disk centered at the customer
location with a specified radius. Similar to the TSP, the objective of the CETSP is to
visit all customers in the shortest distance traveled starting and ending at the depot.
The TSP is a special case of the CETSP when all customer radii are zero, making
the CETSP at least as difficult to solve as the TSP. In order to solve an instance
of the CETSP, it is not enough to determine the sequence in which the customers
are visited. We must also determine the locations at which these customers are
visited within their respective service regions. In Figure 4.1, we show an example
of a CETSP with 12 customers denoted by C1, . . . , C12 and a depot denoted by C0.
Figure 4.1: An example of a CETSP with 12 customers.
The service region is given by a circle centered at the customer's location. A feasible
CETSP tour is shown by the solid lines with arrows. The tour passes through at
least one point in the service region of each customer. In practice, applications such
as meter reading by utility companies using radio frequency identification (RFID)
technology and surveillance by a pilot in an airplane or an unmanned drone can
be modeled as a CETSP. In both of these applications, it is sufficient to get close
enough to the target and not exactly visit the target.
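To make the feasibility condition concrete, the sketch below checks whether a closed tour, given as a sequence of turning points starting at the depot, passes through the circular service region of every customer, and computes the tour length. This is an illustrative sketch only; the function names are hypothetical, and the small tolerance `tol` is an assumption added to absorb floating-point error.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the line segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def closed_segments(tour: Sequence[Point]) -> List[Tuple[Point, Point]]:
    """Consecutive segments of the tour, including the return to the start."""
    return [(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour))]


def tour_length(tour: Sequence[Point]) -> float:
    return sum(math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in closed_segments(tour))


def is_feasible(
    tour: Sequence[Point],
    centers: Sequence[Point],
    radii: Sequence[float],
    tol: float = 1e-9,
) -> bool:
    """True if every service disk (center, radius) is touched by the tour."""
    segments = closed_segments(tour)
    return all(
        any(point_segment_distance(c, a, b) <= r + tol for a, b in segments)
        for c, r in zip(centers, radii)
    )
```

Setting all radii to zero reduces the check to requiring the tour to pass through each customer location exactly, which mirrors the TSP being a special case of the CETSP.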
Over the years, many exact and heuristic algorithms have been developed to
solve the CETSP. Gulczynski et al. (2006) and Dong et al. (2007) developed several
heuristics that first selected a set of supernodes such that each customer service
region contains at least one supernode. Then a TSP tour was generated through
these supernodes. Mennell (2009), Mennell et al. (2011), and Wang et al. (2019)
developed several heuristics based on Steiner zones. Silberholz and Golden (2007),
Yuan et al. (2007), and Yang et al. (2018) developed heuristics based on genetic
algorithms. Carrabs et al. (2017) provided tight lower and upper bounds. Behdani
and Smith (2014) and Coutinho et al. (2016) developed exact approaches using a
discretization scheme and a branch-and-bound algorithm, respectively.
In some applications, it may not be necessary to find the routes of the CETSP
using exact or heuristic algorithms. For example, routing companies participating
in competitive bidding might need to respond to a large number of requests regard-
ing route costs in a very short amount of time. In such cases, it is sufficient to
estimate the route lengths using information about the actual instances. Also, dur-
ing post-disaster aerial surveillance planning or using drones to deliver emergency
medical supplies, route length estimation would quickly need to assess whether the
duration to cover a region of interest would exceed the drone battery life. Route-
length estimation for an instance would approximate the route length generated by a
particular heuristic and not necessarily approximate the optimal or the best-known
solution. The variables in the estimation model would capture the features of the
instance that would be exploited by a specific heuristic. For practical purposes,
routing companies need to know the actual costs that would be incurred using a
specific algorithm and not the optimal costs even if the optimal costs are lower than
the actual costs. The estimation model would be unique to the algorithm that was
applied even though the general framework would apply to any routing algorithm.
Estimating TSP route lengths has been studied in the operations research
literature since the late 1950s. Beardwood et al. (1959) were among the first to
estimate route lengths using analytically derived formulas. These formulas were
improved by Christofides and Eilon (1969b) using empirically estimated parameters.
Golden and Alt (1979) provided interval estimates of the optimal solution. Chien
(1992), Kwon et al. (1995), Hindle and Worthington (2004), and Cavdar and Sokol
(2015) used parameters for the shape of the area covering the customers and the
depot, the distance between customers, and the coordinates of the customers. Nicola
et al. (2019) provided a detailed regression-based estimation method. However,
estimating the route length for a CETSP has not been addressed in the literature.
We will use a regression model to estimate the CETSP route lengths.
4.2 The Regression Model
The Steiner zone variable neighborhood search (SZVNS) heuristic given in
Wang et al. (2019) finds high-quality solutions to instances of the CETSP. There
are three steps in the SZVNS heuristic. First, Steiner zones of degree three and less
that are not dominated by other Steiner zones of degree three and less are detected.
Second, a set covering problem is solved to choose a subset of Steiner zones such that
Independent Variable   Definition
n                      Number of nodes
A                      Area of the smallest rectangle covering all nodes
MinP                   Minimum distance across all pairs of nodes
MaxP                   Maximum distance across all pairs of nodes
VarP                   Variance of distances across all pairs of nodes
SumMinP                Sum of distances to the nearest neighbor of each node
SumMaxP                Sum of distances to the farthest neighbor of each node
MinM                   Minimum distance to the average node
MaxM                   Maximum distance to the average node
SumM                   Sum of distances to the average node
VarM                   Variance of distances to the average node
VarX×VarY              Product of variances of the nodes across two axes
AvgR                   Average radius of the customer service regions
VarR                   Variance of the radii of the customer service regions
SZ                     Number of Steiner zones of degree three and less that are not
                       dominated by other Steiner zones of degree three and less
Table 4.1: Definitions of the independent variables for the linear regression model.
each customer service region has at least one selected Steiner zone. Third, different
search operators are incorporated into a variable neighborhood search framework to
improve solutions.
We build a linear regression model to estimate the route length generated
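by the SZVNS heuristic for the CETSP. The model can be represented by
$$
y_i = \hat{\beta}_1 + \hat{\beta}_2 \times n_i + \hat{\beta}_3 \times A_i
    + \hat{\beta}_4 \times \mathrm{MinP}_i + \hat{\beta}_5 \times \mathrm{MaxP}_i
    + \hat{\beta}_6 \times \mathrm{VarP}_i + \hat{\beta}_7 \times \mathrm{SumMinP}_i
    + \hat{\beta}_8 \times \mathrm{SumMaxP}_i + \hat{\beta}_9 \times \mathrm{MinM}_i
    + \hat{\beta}_{10} \times \mathrm{MaxM}_i + \hat{\beta}_{11} \times \mathrm{SumM}_i
    + \hat{\beta}_{12} \times \mathrm{VarM}_i
    + \hat{\beta}_{13} \times (\mathrm{VarX} \times \mathrm{VarY})_i
    + \hat{\beta}_{14} \times \mathrm{AvgR}_i + \hat{\beta}_{15} \times \mathrm{VarR}_i
    + \hat{\beta}_{16} \times \mathrm{SZ}_i,
$$
where $y_i = E(Y_i)$, $\hat{\beta}_k = E(\beta_k)$, $k \in \{1, \ldots, 16\}$, $i$ denotes a CETSP instance, and $E(\cdot)$ denotes the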
expected value. The dependent variable Yi is the route length generated by the
SZVNS heuristic. In Table 4.1, we give the definitions of the independent variables
for the linear regression model. Nodes represent customers and the depot. n and
Figure 4.2: Node locations of instance d493 from the second group of 62 instances.
A capture the size of the instance. MinP, MaxP, VarP, SumMinP, and SumMaxP
capture the distances between nodes. MinM, MaxM, SumM, and VarM capture
the distances to the average node. VarX×VarY captures the spread of the instance
across the two axes. AvgR and VarR capture the mean and variability of the radii
of the customer service regions. The service region radius of a depot is always zero.
SZ captures the feature of the instance that is exploited by the SZVNS heuristic.
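Most of the independent variables in Table 4.1 can be computed directly from the node coordinates and the radii. The sketch below illustrates one way to do so, under several assumptions that the text does not fix: the smallest covering rectangle is taken to be axis-aligned, all variances are population variances, the average node is the centroid of all nodes, and AvgR and VarR are computed over the customers only (excluding the depot). SZ is omitted because it requires detecting Steiner zones. The function name is hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pvariance
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def instance_features(
    nodes: Sequence[Point], customer_radii: Sequence[float]
) -> Dict[str, float]:
    """Geometric features of a CETSP instance (nodes = depot and customers)."""
    xs = [float(x) for x, _ in nodes]
    ys = [float(y) for _, y in nodes]

    def dist(a: Point, b: Point) -> float:
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Distances across all unordered pairs of nodes.
    pair = [dist(a, b) for a, b in combinations(nodes, 2)]
    # Distance from each node to its nearest and farthest neighbor.
    others = [
        [dist(a, b) for j, b in enumerate(nodes) if j != i]
        for i, a in enumerate(nodes)
    ]
    # Distances to the average node (assumed to be the centroid).
    centroid = (mean(xs), mean(ys))
    to_mean = [dist(p, centroid) for p in nodes]

    return {
        "n": float(len(nodes)),
        # Assumption: axis-aligned bounding rectangle.
        "A": (max(xs) - min(xs)) * (max(ys) - min(ys)),
        "MinP": min(pair),
        "MaxP": max(pair),
        "VarP": pvariance(pair),
        "SumMinP": sum(min(d) for d in others),
        "SumMaxP": sum(max(d) for d in others),
        "MinM": min(to_mean),
        "MaxM": max(to_mean),
        "SumM": sum(to_mean),
        "VarM": pvariance(to_mean),
        "VarXVarY": pvariance(xs) * pvariance(ys),
        # Assumption: radii statistics over customers only.
        "AvgR": mean(customer_radii),
        "VarR": pvariance(customer_radii),
    }
```

If sample variances were intended instead, `pvariance` could be replaced by `statistics.variance`; the choice only rescales the corresponding regression coefficients when n is fixed, but it does matter when instances differ in size.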
4.3 Regression Data and Model Fit Measures
We use the 842 CETSP benchmark instances and their route lengths generated
using the SZVNS heuristic given in Wang et al. (2019). The instances can be
divided into two groups. The first group has 780 instances with the node locations
generated randomly. All customers in an instance have the same radius for the
service regions. The second group has 62 instances with the node locations generated
in various structured ways. Some instances have different radii for the customer
service regions. In Figure 4.2, we show the node locations of instance d493 from
the second group. This instance has nodes forming concentric rectangles and nodes
distributed randomly.
We use mean percentage error (MPE) and mean absolute percentage error
(MAPE) to assess the quality of the approximation of the CETSP route lengths
from the regression model (yi) with respect to the route lengths from the SZVNS
heuristic (Yi). MPE and MAPE are defined as
$100 \times \left( \sum_{i=1}^{N} (Y_i - y_i)/Y_i \right) / N$ and
$100 \times \left( \sum_{i=1}^{N} |Y_i - y_i|/Y_i \right) / N$, respectively, where N denotes the number of instances.
A value of MPE close to zero indicates that there is almost an equal distribution
of instances with route lengths being overestimated (Yi < yi) and underestimated
(Yi > yi). The value of MAPE is always positive, and a low value indicates that
the route length estimates are close to the SZVNS route lengths for most of the
instances.
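The two error measures translate directly into code. The sketch below follows the definitions given above, with `actual` holding the SZVNS route lengths (Yi) and `estimated` holding the regression estimates (yi); the function names are illustrative.

```python
from typing import Sequence


def mpe(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean percentage error: 100 * (sum_i (Y_i - y_i) / Y_i) / N."""
    return 100.0 * sum((Y - y) / Y for Y, y in zip(actual, estimated)) / len(actual)


def mape(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean absolute percentage error: 100 * (sum_i |Y_i - y_i| / Y_i) / N."""
    return 100.0 * sum(abs(Y - y) / Y for Y, y in zip(actual, estimated)) / len(actual)
```

For instance, with route lengths 100 and 200 and estimates 110 and 180, the overestimate and the underestimate cancel in MPE (0%), while MAPE reports the 10% average magnitude of the error. This illustrates why a near-zero MPE alone does not indicate accurate estimates.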
We use R2, adjusted R2, Studentized residuals, outliers, normal probability
plots, the Shapiro-Wilk hypothesis test for normality, Mallows's Cp, and the Bayesian
information criterion (BIC) to assess the quality of the model fits. R2 never
decreases as extra variables are added to the model. Adjusted R2 can decrease if the penalty for
adding an extra variable to the model is more than the improvement to the model.
Residuals (Yi − yi) have a mean of zero. Studentized residuals are scaled residuals
with unit variance. The Studentized residual plot should show a horizontal band of
points around the horizontal axis at zero to denote the absence of any
heteroscedasticity in the model. The plot also shows whether the linear form of the model is
adequate to capture all underlying patterns. Any data point (instance) with a
Studentized residual value greater than 2 or less than −2 can be considered an
outlier and may be removed from the model for a better fit. The normal probability
plot matches the quantiles of the Studentized residuals with the quantiles of the
standard normal distribution. The Shapiro-Wilk hypothesis test indicates whether
(null hypothesis) or not (alternative hypothesis) the Studentized residuals follow
a standard normal distribution. Mallows's Cp and BIC are used in the context of
model selection where the goal is to find the best model involving a subset of the
independent variables. Mallows's Cp addresses the issue of model overfitting by
penalizing for adding extra variables. The value of Mallows's Cp should be greater than but close
to the number of independent variables in the model (p) to indicate the absence
of overfitting. Mallows's Cp is equivalent to the Akaike information criterion (AIC) for
linear regression. Both AIC and BIC penalize a model for having more independent
variables. However, BIC uses a larger penalty as the number of instances increases.
The lower the value of BIC, the better is the model fit.
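Several of these fit measures can be computed from the observed and fitted route lengths alone. The sketch below is illustrative and rests on assumptions not stated in the text: adjusted R2 uses the standard formula with p independent variables excluding the intercept, and BIC uses one common form for a Gaussian linear model with additive constants dropped, so its values are only comparable across models fitted to the same instances. The outlier rule applies the threshold of 2 to Studentized residuals that are assumed to have been computed elsewhere. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def r_squared(actual: Sequence[float], fitted: Sequence[float]) -> float:
    """R2 = 1 - RSS/TSS."""
    y_bar = sum(actual) / len(actual)
    rss = sum((Y - y) ** 2 for Y, y in zip(actual, fitted))
    tss = sum((Y - y_bar) ** 2 for Y in actual)
    return 1.0 - rss / tss


def adjusted_r_squared(r2: float, n_instances: int, p: int) -> float:
    """Adjusted R2 with p independent variables (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n_instances - 1) / (n_instances - p - 1)


def bic_gaussian(actual: Sequence[float], fitted: Sequence[float], k: int) -> float:
    """BIC up to an additive constant: N * ln(RSS / N) + k * ln(N),
    where k is the number of estimated coefficients (assumed form)."""
    n = len(actual)
    rss = sum((Y - y) ** 2 for Y, y in zip(actual, fitted))
    return n * math.log(rss / n) + k * math.log(n)


def flag_outliers(studentized: Sequence[float], threshold: float = 2.0) -> List[int]:
    """Indices of instances whose Studentized residual is > 2 or < -2."""
    return [i for i, r in enumerate(studentized) if abs(r) > threshold]
```

The `k * ln(N)` term shows why the BIC penalty per variable grows with the number of instances, whereas the AIC penalty of 2 per variable does not.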
4.4 Regression Results
4.4.1 Results on all 842 Instances
In Table 4.2, we present the regression results for all 842 instances with and
without outliers. The regression model with outliers (842 instances) has an adjusted
R2 value of 0.978 which might indicate a very good model fit. VarR is insignificant
Coefficient            With outliers   Without outliers
Intercept (β1)         −49.226***      −18.970***
n (β2)                 0.197***        0.059***
A (β3)                 −0.006*         0.006***
MinP (β4)              −7.580***       −0.053
MaxP (β5)              7.942***        4.696***
VarP (β6)              −0.362***       −0.263***
SumMinP (β7)           0.190***        0.190***
SumMaxP (β8)           −0.005*         0.006***
MinM (β9)              4.742***        0.343
MaxM (β10)             −3.259*         −1.353***
SumM (β11)             0.014*          −0.012***
VarM (β12)             0.357***        0.389***
VarX×VarY (β13)        0.00002***      0.000003***
AvgR (β14)             −17.039***      −12.876***
VarR (β15)             0.056           0.156***
SZ (β16)               −0.009***       −0.005***
Number of instances    842             811
Adjusted R2            0.978           0.999
MPE                    1.530%          0.327%
MAPE                   21.141%         9.821%
.p<0.1; *p<0.05; **p<0.01; ***p<0.001
Table 4.2: Regression results for the 842 instances with and without outliers.
at the 10% level. A, SumMaxP, MaxM, and SumM are significant at the 5% level.
All other variables are significant at the 0.1% level. In Figure 4.3, we give the
Studentized residual plot of 842 instances. The lines show the Studentized residuals
at values of 2 and −2. The Studentized residual plot shows some linear trend which
may be due to outliers. There are 31 instances from the second group of 62 instances
with Studentized residual values greater than 2 or less than −2. The Studentized
residuals have values from 18 to −21. In Figure 4.4, we give the histogram of
Studentized residuals of 842 instances. The histogram shows that there is almost
an equal distribution of instances with positive and negative residuals. This is also
Figure 4.3: Studentized residual plot of 842 instances. The lines indicate the
Studentized residual values of 2 and −2.
Figure 4.4: Histogram of Studentized residuals of 842 instances.
indicated by the MPE value of 1.530% which is close to zero. In Figure 4.5, we
give the normal probability plot of Studentized residuals of 842 instances. Both the
histogram and the normal probability plot show that the Studentized residuals do
Figure 4.5: Normal probability plot of Studentized residuals of 842 instances.
not follow a standard normal distribution. The non-normality of the Studentized
residuals is indicated by the Shapiro-Wilk hypothesis test which rejects the null
hypothesis at the 0.1% significance level. The MAPE value indicates that the route-
length estimates from the regression model differ by an average of 21.141% from the
SZVNS route lengths. Even though the model has a very high adjusted R2 value,
MAPE and the result of the Shapiro-Wilk hypothesis test indicate that the model
does not perform well, specifically for the second group of 62 instances with 31
outliers.
We now examine the results of the model without the outliers (Table 4.2). The
regression model without outliers (811 instances) has an adjusted R2 value of 0.999
which might indicate a very good model fit. MinP and MinM are not significant at
the 10% level. All other variables are significant at the 0.1% level. The MPE value
of 0.327% is close to zero which indicates that there is almost an equal distribution
of instances with positive and negative residuals. The MAPE value indicates that
the route-length estimates from the regression model differ by an average of 9.821%
from the SZVNS route lengths. Even though the average route-length prediction
error dropped from 21.141% to 9.821% after removing the 31 outlier instances, the
prediction error of about 10% might still be above an acceptable limit for a routing
manager. Therefore, it is worthwhile to examine the performance of the regression
model separately on the first group of 780 instances and on the second group of 62
instances.
Nicola et al. (2019) provided detailed regression-based estimation models for
the TSP and similar routing problems. The estimation models use around 400
instances. The quality of the estimation models was assessed using MPE and MAPE,
in addition to adjusted R2. For many of the regression models they studied, the
adjusted R2 values were in the range 0.97 to 0.99 with MAPE values greater than
10%. These observations indicated that instances with different geometric properties
can have large errors in a practical sense if their route lengths are estimated using
the same model even if the model has a very high adjusted R2. However, splitting up
the instances according to their geometric properties and building specific estimation
models significantly lowered the MAPE values.
4.4.2 Results on the Second Group of 62 Instances
In Table 4.3, we present the regression results for the second group of 62 in-
stances with and without outliers. The regression model with outliers (62 instances)
Coefficient            With outliers   Without outliers
Intercept (β1)         49.407          60.461
n (β2)                 0.099           0.029
A (β3)                 0.003           0.0008
MinP (β4)              −19.059         −20.600*
MaxP (β5)              8.129.          9.245**
VarP (β6)              −0.112          −0.099
SumMinP (β7)           0.184*          0.166**
SumMaxP (β8)           0.008           0.010
MinM (β9)              26.047.         12.840
MaxM (β10)             −7.918          −9.189
SumM (β11)             −0.020          −0.024
VarM (β12)             0.112           0.019
VarX×VarY (β13)        0.000005        0.000007
AvgR (β14)             −17.349***      −19.728***
VarR (β15)             −0.038          0.202
SZ (β16)               −0.010***       −0.008***
Number of instances    62              57
Adjusted R2            0.954           0.978
MPE                    −1.237%         1.184%
MAPE                   26.546%         25.638%
.p<0.1; *p<0.05; **p<0.01; ***p<0.001
Table 4.3: Regression results for the second group of 62 instances with and without
outliers.
has an adjusted R2 value of 0.954 which might indicate a very good model fit. AvgR
and SZ are significant at the 0.1% level. SumMinP is significant at the 5% level.
MaxP and MinM are significant at the 10% level. All other variables are not signif-
icant at the 10% level. In Figure 4.6, we give the Studentized residual plot for the
second group of 62 instances. The lines show the Studentized residuals at values of
2 and −2. The Studentized residual plot shows some linear trend which may be due
to outliers. There are five instances with Studentized residual values greater than
2 or less than −2. The Studentized residuals have values from 5 to −6. In Figure
4.7, we give the histogram of Studentized residuals for the second group of 62 in-
Figure 4.6: Studentized residual plot for the second group of 62 instances. The lines
indicate the Studentized residual values of 2 and −2.
Figure 4.7: Histogram of Studentized residuals for the second group of 62 instances.
stances. The histogram shows that there is almost an equal distribution of instances
with positive and negative residuals. This is also indicated by the MPE value of
Figure 4.8: Normal probability plot of Studentized residuals for the second group
of 62 instances.
−1.237% which is close to zero. In Figure 4.8, we give the normal probability plot
of Studentized residuals for the second group of 62 instances. Both the histogram
and the normal probability plot show that the Studentized residuals do not follow
a standard normal distribution. The non-normality of the Studentized residuals is
indicated by the Shapiro-Wilk hypothesis test which rejects the null hypothesis at
the 0.1% significance level. The MAPE value indicates that the route-length esti-
mates from the regression model differ by an average of 26.546% from the SZVNS
route lengths. Even though the model has a very high adjusted R2 value, MAPE
and the result of the Shapiro-Wilk hypothesis test indicate that the model does not
perform well.
We now examine the results of the model without the outliers (Table 4.3).
The regression model without outliers (57 instances) has an adjusted R2 value of
Figure 4.9: Studentized residual plot for the first group of 780 instances. The lines
indicate the Studentized residual values of 2 and −2.
0.978 which might indicate a very good model fit. AvgR and SZ are significant
at the 0.1% level. MaxP and SumMinP are significant at the 1% level. MinP is
significant at the 5% level. All other variables are not significant at the 10% level.
The MPE value of 1.184% is close to zero which indicates that there is almost an
equal distribution of instances with positive and negative residuals. The MAPE
value indicates that the route-length estimates from the regression model differ by
an average of 25.638% from the SZVNS route lengths. The average route-length
prediction error did not change much after removing the five outlier instances. The
prediction error of nearly 26% is very high for practical purposes.
Coefficient            With outliers
Intercept (β1)         15.231***
n (β2)                 0.225
A (β3)                 0.048***
MinP (β4)              −0.654***
MaxP (β5)              0.224*
VarP (β6)              0.106.
SumMinP (β7)           0.361***
SumMaxP (β8)           0.036*
MinM (β9)              0.459**
MaxM (β10)             −0.418*
SumM (β11)             −0.067*
VarM (β12)             0.683***
VarX×VarY (β13)        0.014***
AvgR (β14)             −9.092***
SZ (β16)               −0.026
Adjusted R2            0.921
MPE                    −0.192%
MAPE                   3.984%
.p<0.1; *p<0.05; **p<0.01; ***p<0.001
Table 4.4: Regression results for the first group of 780 instances with outliers.
Figure 4.10: Histogram of Studentized residuals for the first group of 780 instances.
Figure 4.11: Normal probability plot of Studentized residuals for the first group of
780 instances.
4.4.3 Results on the First Group of 780 Instances
In Table 4.4, we present the regression results for the first group of 780 instances.
The regression model has an adjusted R² value of 0.921, which indicates a very good
model fit. VarR is not present in the model because the variance of the radii for the
customer service regions is zero for all instances in this group. n and SZ are not
significant at the 10% level. VarP is significant at the 10% level. MaxP, SumMaxP,
MaxM, and SumM are significant at the 5% level. MinM is significant at the 1% level.
All other variables are significant at the 0.1% level. In Figure 4.9, we give the
Studentized residual plot for the first group of 780 instances. The lines show the
Studentized residuals at values of 2 and −2. The Studentized residual plot shows a
horizontal band of points. There are 40 instances with Studentized residual values
greater than 2 or less than −2. The Studentized residuals range from −4 to 3. Even
though there are outliers, their deviations are not large in magnitude. In Figure 4.10,
we give the histogram of Studentized residuals for the first group of 780 instances.
The histogram shows that there is almost an equal distribution of instances with
positive and negative residuals. This is also indicated by the MPE value of −0.192%,
which is close to zero. In Figure 4.11, we give the normal probability plot of
Studentized residuals for the first group of 780 instances. Both the histogram and
the normal probability plot suggest that the Studentized residuals approximately
follow a standard normal distribution. This is supported by the Shapiro-Wilk
hypothesis test, which does not reject the null hypothesis of normality at the 10%
significance level. The MAPE value indicates that the route-length estimates from
the regression model differ by an average of 3.984% from the SZVNS route lengths.
The adjusted R² value, MAPE, and the result of the Shapiro-Wilk hypothesis test
indicate that the model performs well in predicting the SZVNS route lengths.
Therefore, it would be useful to generalize the use of a linear regression model in
predicting the SZVNS route lengths using the variables given in Table 4.1 for the
CETSP. Before relying on this generalization for instances similar to this group, we
need to validate the regression model on out-of-sample data.
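The two error measures quoted above can be computed as in the following minimal Python sketch. It assumes that the percentage error of an instance is (estimate − SZVNS length) / SZVNS length × 100; this sign convention, the function names, and the four route lengths are illustrative assumptions and are not taken from the experiments.

```python
def mpe(actual, predicted):
    """Mean percentage error (signed), in percent.

    Assumed convention: a positive value means overestimation on average.
    """
    errors = [(p - a) / a * 100.0 for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(p - a) / a * 100.0 for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Made-up route lengths: two overestimates of 4% and two underestimates of 5%.
actual = [100.0, 200.0, 50.0, 80.0]
predicted = [104.0, 190.0, 52.0, 76.0]

print(round(mpe(actual, predicted), 3))   # signed errors largely cancel
print(round(mape(actual, predicted), 3))  # absolute errors do not cancel
```

In this toy example the signed errors nearly cancel while the absolute errors do not, which is why an MPE close to zero and a MAPE of a few percent can occur together.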
Figure 4.12: Studentized residual plot for the first set of 520 training instances. The
lines indicate the Studentized residual values of 2 and −2.
4.4.4 Cross-validation for the First Group of 780 Instances
We perform 3-fold cross-validation to test the performance of the linear re-
gression model on out-of-sample data. We randomly partition the first group of 780
instances into three equal groups of size 260 each. First, we train the model on the
second and the third groups and test the model on the first group. Second, we train
the model on the first and the third groups and test the model on the second group.
Third, we train the model on the first and the second groups and test the model on
the third group. We look at the performance of the regression model in each of the
three scenarios with the training data sets of size 520, and the testing data set of
size 260.
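The partitioning step described above can be sketched as follows. This is a minimal illustration, not the code used for the experiments: the function name and the random seed are assumptions, and the model fitting on each training set is left out.

```python
import random


def three_fold_splits(n_instances, seed=0):
    """Randomly partition instance indices into three equal folds and
    return one (train_indices, test_indices) pair per fold."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    fold_size = n_instances // 3
    folds = [indices[k * fold_size:(k + 1) * fold_size] for k in range(3)]
    splits = []
    for k in range(3):
        test = folds[k]
        # The two remaining folds together form the training set.
        train = [i for m in range(3) if m != k for i in folds[m]]
        splits.append((train, test))
    return splits


splits = three_fold_splits(780)
for train, test in splits:
    print(len(train), len(test))  # 520 training and 260 testing instances
```

Each instance appears in exactly one testing set, so every out-of-sample error is measured on data that the corresponding model did not see during training.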
In Table 4.5, we present the regression results for three training sets. The
regression model on the first training set has an adjusted R2 value of 0.919 which
Coefficient              Training set 1   Training set 2   Training set 3
Intercept (β1)           16.436***        11.987***        17.691***
n (β2)                   0.009            0.628***         −0.041
A (β3)                   0.061***         0.035***         0.043***
MinP (β4)                −0.635**         −0.608**         −0.972***
MaxP (β5)                0.152            0.178            0.301*
VarP (β6)                0.131*           0.198**          −0.039
SumMinP (β7)             0.379***         0.400***         0.331***
SumMaxP (β8)             0.061**          −0.008           0.050**
MinM (β9)                0.543**          0.463*           0.410*
MaxM (β10)               −0.533*          −0.185           −0.505*
SumM (β11)               −0.094*          −0.035           −0.047
VarM (β12)               0.591***         0.820***         0.739***
VarX×VarY (β13)          0.011***         0.016***         0.015***
AvgR (β14)               −9.145***        −9.101***        −9.014***
SZ (β16)                 −0.025           −0.034           −0.025
Adjusted R²              0.919            0.927            0.920
MPE (in-sample)          −0.193%          −0.179%          −0.190%
MAPE (in-sample)         4.075%           3.817%           3.948%
MPE (out-of-sample)      −0.712%          0.315%           −0.246%
MAPE (out-of-sample)     3.945%           4.388%           4.183%
. p<0.1; * p<0.05; ** p<0.01; *** p<0.001
Table 4.5: Regression results on each training set of 520 instances from the first
group of 780 instances.
indicates a very good model fit. VarR is not present in the model. n, MaxP, and
SZ are not significant at the 10% level. VarP, MaxM, and SumM are significant at
the 5% level. MinP, SumMaxP, and MinM are significant at the 1% level. All other
variables are significant at the 0.1% level. In Figure 4.12, we give the Studentized
residual plot for the first set of 520 training instances. The lines show the Studentized
residuals at values of 2 and −2. The Studentized residual plot shows a horizontal
band of points. There are 23 instances with Studentized residual values greater
than 2 or less than −2. The Studentized residuals have values from 3 to −4. Even
though there are outliers, the magnitude of the deviations of the outliers is not
Figure 4.13: Histogram of Studentized residuals for the first set of 520 training
instances.
Figure 4.14: Normal probability plot of Studentized residuals for the first set of 520
training instances.
very high. In Figure 4.13, we give the histogram of Studentized residuals for the
first set of 520 training instances. The histogram shows that there is almost an
equal distribution of instances with positive and negative residuals. This is also
indicated by the MPE value of −0.193%, which is close to zero. In Figure 4.14,
we give the normal probability plot of Studentized residuals for the first set of 520
training instances. Both the histogram and the normal probability plot suggest that
the Studentized residuals approximately follow a standard normal distribution. This
is supported by the Shapiro-Wilk hypothesis test, which does not reject the null
hypothesis of normality at the 10% significance level. The MAPE value indicates
that the route-length estimates from the regression model differ by an average of
4.075% from the SZVNS route lengths. The adjusted R² value, MAPE, and the result
of the Shapiro-Wilk hypothesis test indicate that the model performs well in
predicting the SZVNS route lengths. The out-of-sample MPE and MAPE values are
−0.712% and 3.945%, respectively, which indicate that the regression model
performs well on new instances from this group.
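The count of instances outside the ±2 band can be illustrated with the following minimal Python sketch. It is a simplification under stated assumptions: it uses a regression with a single predictor (so that the leverage has a closed form), it uses internally Studentized residuals, and the data are made up, with one artificial outlier at index 9. The source does not state which variant of Studentized residual was used.

```python
import math


def studentized_residuals(x, y):
    """Internally Studentized residuals for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance with two fitted parameters (intercept and slope).
    s2 = sum(e * e for e in resid) / (n - 2)
    # Leverage of each observation (diagonal of the hat matrix).
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [e / math.sqrt(s2 * (1.0 - h)) for e, h in zip(resid, leverage)]


# Made-up data: y = 3 + 2x with small alternating noise and one outlier.
x = list(range(1, 21))
y = [3 + 2 * xi + (0.5 if xi % 2 else -0.5) for xi in x]
y[9] += 10  # artificial outlier

res = studentized_residuals(x, y)
flagged = [i for i, r in enumerate(res) if abs(r) > 2]
print(flagged)  # indices with Studentized residual outside [-2, 2]
```

For a model with several predictors, the leverage values would instead come from the diagonal of the hat matrix of the full design matrix.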
The regression model on the second training set has an adjusted R2 value
of 0.927 which indicates a very good model fit. VarR is not present in the model.
MaxP, SumMaxP, MaxM, SumM, and SZ are not significant at the 10% level. MinM
is significant at the 5% level. MinP and VarP are significant at the 1% level. All
other variables are significant at the 0.1% level. In Figure 4.15, we give the Studen-
tized residual plot for the second set of 520 training instances. The lines show the
Studentized residuals at values of 2 and −2. The Studentized residual plot shows a
horizontal band of points. There are 24 instances with Studentized residual values
greater than 2 or less than −2. The Studentized residuals have values from 4 to
−3. Even though there are outliers, the magnitude of the deviations of the outliers
Figure 4.15: Studentized residual plot for the second set of 520 training instances.
The lines indicate the Studentized residual values of 2 and −2.
Figure 4.16: Histogram of Studentized residuals for the second set of 520 training
instances.
is not very high. In Figure 4.16, we give the histogram of Studentized residuals for
the second set of 520 training instances. The histogram shows that there is almost
Figure 4.17: Normal probability plot of Studentized residuals for the second set of
520 training instances.
an equal distribution of instances with positive and negative residuals. This is also
indicated by the MPE value of −0.179%, which is close to zero. In Figure 4.17, we
give the normal probability plot of Studentized residuals for the second set of 520
training instances. Both the histogram and the normal probability plot suggest that
the Studentized residuals approximately follow a standard normal distribution. This
is supported by the Shapiro-Wilk hypothesis test, which does not reject the null
hypothesis of normality at the 1% significance level. The MAPE value indicates
that the route-length estimates from the regression model differ by an average of
3.817% from the SZVNS route lengths. The adjusted R² value, MAPE, and the result
of the Shapiro-Wilk hypothesis test indicate that the model performs well in
predicting the SZVNS route lengths. The out-of-sample MPE and MAPE values
are 0.315% and 4.388%, respectively, which indicate that the regression model
Figure 4.18: Studentized residual plot for the third set of 520 training instances.
The lines indicate the Studentized residual values of 2 and −2.
Figure 4.19: Histogram of Studentized residuals for the third set of 520 training
instances.
performs well on new instances from this group.
The regression model on the third training set has an adjusted R2 value of
Figure 4.20: Normal probability plot of Studentized residuals for the third set of
520 training instances.
0.920, which indicates a very good model fit. VarR is not present in the model.
n, VarP, SumM, and SZ are not significant at the 10% level. MaxP, MinM, and
MaxM are significant at the 5% level. SumMaxP is significant at the 1% level.
All other variables are significant at the 0.1% level. In Figure 4.18, we give the
Studentized residual plot for the third set of 520 training instances. The lines show
the Studentized residuals at values of 2 and −2. The Studentized residual plot shows
a horizontal band of points. There are 30 instances with Studentized residual values
greater than 2 or less than −2. The Studentized residuals range from −4 to 3. Even
though there are outliers, their deviations are not large in magnitude. In Figure 4.19,
we give the histogram of Studentized residuals for the third set of 520 training
instances. The histogram shows that there is almost an equal distribution of
instances with positive and negative residuals. This is also indicated by the MPE
value of −0.190%, which is close to zero. In Figure 4.20, we give the normal
probability plot of Studentized residuals for the third set of 520 training instances.
Both the histogram and the normal probability plot suggest that the Studentized
residuals approximately follow a standard normal distribution. This is supported by
the Shapiro-Wilk hypothesis test, which does not reject the null hypothesis of
normality at the 10% significance level. The MAPE value indicates that the
route-length estimates from the regression model differ by an average of 3.948%
from the SZVNS route lengths. The adjusted R² value, MAPE, and the result of the
Shapiro-Wilk hypothesis test indicate that the model performs well in predicting
the SZVNS route lengths. The out-of-sample MPE and MAPE values are −0.246%
and 4.183%, respectively, which indicate that the regression model performs well
on new instances from this group.
The average MPE and MAPE values on the three training data sets (in-sample)
of size 520 are −0.188% and 3.947%, respectively. The average MPE and MAPE
values on the three testing data sets (out-of-sample) of size 260 are −0.196% and
4.172%, respectively. The average MAPE values for in-sample and out-of-sample
data are very close. The performance of the regression model on out-of-sample data
indicates that the linear model can be applied to predict SZVNS route lengths on
CETSP instances with properties similar to the first group of 780 instances. Given
the robustness of the linear regression model, it would be useful to find the best
subset model without compromising the performance.
Number of variables
Variable 1 2 3 4 5 6 7 8 9 10 11 12 13 14
n * * *
A * * * * * * * * * * * *
MinP * * * * * * *
MaxP * * * * * *
VarP * * *
SumMinP * * * * * * * * * * * *
SumMaxP * * * * * * * * * * *
MinM * * * * * * * *
MaxM * * * * *
SumM * * * * *
VarM * * * * * * * * * *
VarX×VarY * * * * * * * * *
AvgR * * * * * * * * * * * * *
SZ *
Table 4.6: Best subset models based on R2 for the first group of 780 instances.
4.4.5 Model Selection for the First Group of 780 Instances
In Table 4.6, we present the best subset models based on R² for the first group
of 780 instances. VarR is not present in any of the models. We select the best
1-variable model, best 2-variable model, and so on based on the highest R² values. As
we increase the number of variables from k to k + 1, the best (k + 1)-variable model
might drop some variables that were in the best k-variable model. Out of the 14
best subset models shown, we select three models based on adjusted R², Mallows's
Cp, and BIC. In Figures 4.21, 4.22, and 4.23, we show the 14 best subset models
for the first group of 780 instances according to adjusted R², Mallows's Cp, and
BIC, respectively. The 14 models are arranged in decreasing order of performance
according to the respective criterion, with the first row indicating the best model
and the last row indicating the worst model. We select the best models (first row)
Figure 4.21: Plot showing the best subset models for the first group of 780 instances
arranged according to adjusted R2.
Figure 4.22: Plot showing the best subset models for the first group of 780 instances
arranged according to Mallows's Cp.
according to each of the three criteria for further analysis of model performance.
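The selection procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used for the experiments: the data are synthetic (five made-up predictors, of which only two matter), the helper names are hypothetical, and the formulas for Mallows's Cp and BIC follow one common convention (BIC up to an additive constant).

```python
import itertools
import math
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def rss_for_subset(X, y, subset):
    """Residual sum of squares of the least-squares fit with an intercept."""
    rows = [[1.0] + [row[j] for j in subset] for row in X]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def best_subsets(X, y):
    n, m = len(X), len(X[0])
    y_bar = sum(y) / n
    tss = sum((yi - y_bar) ** 2 for yi in y)
    # Error variance estimated from the full model, as used in Mallows's Cp.
    sigma2_full = rss_for_subset(X, y, range(m)) / (n - m - 1)
    table = []
    for k in range(1, m + 1):
        # Best k-variable model: smallest RSS, i.e., highest R^2.
        subset = min(itertools.combinations(range(m), k),
                     key=lambda s: rss_for_subset(X, y, s))
        rss = rss_for_subset(X, y, subset)
        r2 = 1.0 - rss / tss
        table.append({
            "k": k, "subset": subset, "r2": r2,
            "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
            "cp": rss / sigma2_full - n + 2 * (k + 1),
            "bic": n * math.log(rss / n) + (k + 1) * math.log(n),
        })
    return table


rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
y = [5 + 3 * row[0] - 2 * row[2] + rng.gauss(0, 0.1) for row in X]

table = best_subsets(X, y)
best_by_bic = min(table, key=lambda t: t["bic"])
for t in table:
    print(t["k"], t["subset"], round(t["adj_r2"], 4))
```

Adjusted R² is maximized, whereas Mallows's Cp and BIC are minimized. Because BIC penalizes each additional variable more heavily than adjusted R² does, it tends to select smaller models, which is consistent with the model sizes reported in Table 4.7.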
In Table 4.7, we present the regression results for three best subset models
Figure 4.23: Plot showing the best subset models for the first group of 780 instances
arranged according to BIC.
Figure 4.24: Studentized residual plot of 780 instances for the best adjusted R²
model. The lines indicate the Studentized residual values of 2 and −2.
based on adjusted R², Mallows's Cp, and BIC. The regression model for the best
subset based on adjusted R² has 13 variables and an adjusted R² value of 0.921
Coefficient              Best adjusted R²   Best Mallows's Cp   Best BIC
Intercept (β1)           15.320***          15.485***           16.334***
n (β2)                   0.188
A (β3)                   0.048***           0.046***            0.044***
MinP (β4)                −0.668***          −0.734***           −0.674***
MaxP (β5)                0.224*             0.230*
VarP (β6)                0.101.
SumMinP (β7)             0.359***           0.362***            0.362***
SumMaxP (β8)             0.036*             0.025***            0.027***
MinM (β9)                0.455**            0.508***            0.498***
MaxM (β10)               −0.410*            −0.273*
SumM (β11)               −0.064*
VarM (β12)               0.685***           0.805***            0.797***
VarX×VarY (β13)          0.014***           0.011***            0.012***
AvgR (β14)               −9.059***          −9.059***           −9.059***
SZ (β16)
Number of variables      13                 10                  8
Adjusted R²              0.921              0.921               0.920
MPE                      −0.192%            −0.194%             −0.202%
MAPE                     3.983%             3.995%              4.008%
. p<0.1; * p<0.05; ** p<0.01; *** p<0.001
Table 4.7: Regression results on the best subset models based on R2 for the first
group of 780 instances.
which indicates a very good model fit. SZ is not in the model. n is not significant at
the 10% level. VarP is significant at the 10% level. MaxP, SumMaxP, MaxM, and
SumM are significant at the 5% level. MinM is significant at the 1% level. All other
variables are significant at the 0.1% level. In Figure 4.24, we give the Studentized
residual plot of 780 instances for the best adjusted R2 model. The lines show the
Studentized residuals at values of 2 and −2. The Studentized residual plot shows a
horizontal band of points. There are 38 instances with Studentized residual values
greater than 2 or less than −2. The Studentized residuals have values from 3 to
−4. Even though there are outliers, the magnitude of the deviations of the outliers
Figure 4.25: Histogram of Studentized residuals of 780 instances for the best ad-
justed R2 model.
Figure 4.26: Normal probability plot of Studentized residuals of 780 instances for
the best adjusted R2 model.
is not very high. In Figure 4.25, we give the histogram of Studentized residuals of
780 instances for the best adjusted R2 model. The histogram shows that there is
Figure 4.27: Studentized residual plot of 780 instances for the best Mallows's Cp
model. The lines indicate the Studentized residual values of 2 and −2.
almost an equal distribution of instances with positive and negative residuals. This
is also indicated by the MPE value of −0.192%, which is close to zero. In Figure
4.26, we give the normal probability plot of Studentized residuals of 780 instances
for the best adjusted R² model. Both the histogram and the normal probability
plot suggest that the Studentized residuals approximately follow a standard normal
distribution. This is supported by the Shapiro-Wilk hypothesis test, which does not
reject the null hypothesis of normality at the 10% significance level. The
MAPE value indicates that the route-length estimates from the regression model
differ by an average of 3.983% from the SZVNS route lengths. The adjusted R²
value, MAPE, and the result of the Shapiro-Wilk hypothesis test indicate that the
model performs well in predicting the SZVNS route lengths.
Figure 4.28: Histogram of Studentized residuals of 780 instances for the best
Mallows's Cp model.
Figure 4.29: Normal probability plot of Studentized residuals of 780 instances for
the best Mallows's Cp model.
The regression model for the best subset based on Mallows's Cp has 10 variables
and an adjusted R2 value of 0.921 which indicates a very good model fit. n, VarP,
SumM, and SZ are not in the model. MaxP and MaxM are significant at the 5%
level. All other variables are significant at the 0.1% level. In Figure 4.27, we give the
Studentized residual plot of 780 instances for the best Mallows's Cp model. The lines
show the Studentized residuals at values of 2 and −2. The Studentized residual plot
shows a horizontal band of points. There are 41 instances with Studentized residual
values greater than 2 or less than −2. The Studentized residuals range from −4 to 3.
Even though there are outliers, their deviations are not large in magnitude. In
Figure 4.28, we give the histogram of Studentized residuals of 780 instances for the
best Mallows's Cp model. The histogram shows that there is almost an equal
distribution of instances with positive and negative residuals. This is also indicated
by the MPE value of −0.194%, which is close to zero. In Figure 4.29, we give the
normal probability plot of Studentized residuals of 780 instances for the best
Mallows's Cp model. Both the histogram and the normal probability plot suggest
that the Studentized residuals approximately follow a standard normal distribution.
This is supported by the Shapiro-Wilk hypothesis test, which does not reject the
null hypothesis of normality at the 10% significance level. The MAPE value
indicates that the route-length estimates from the regression model differ by an
average of 3.995% from the SZVNS route lengths. The adjusted R² value, MAPE,
and the result of the Shapiro-Wilk hypothesis test indicate that the model performs
well in predicting the SZVNS route lengths.
The regression model for the best subset based on BIC has eight variables and
an adjusted R2 value of 0.920 which indicates a very good model fit. n, MaxP, VarP,
MaxM, SumM, and SZ are not in the model. All other variables are significant at
the 0.1% level. In Figure 4.30, we give the Studentized residual plot of 780 instances
Figure 4.30: Studentized residual plot of 780 instances for the best BIC model. The
lines indicate the Studentized residual values of 2 and −2.
Figure 4.31: Histogram of Studentized residuals of 780 instances for the best BIC
model.
for the best BIC model. The lines show the Studentized residuals at values of 2 and
−2. The Studentized residual plot shows a horizontal band of points. There are
Figure 4.32: Normal probability plot of Studentized residuals of 780 instances for
the best BIC model.
39 instances with Studentized residual values greater than 2 or less than −2. The
Studentized residuals range from −4 to 3. Even though there are outliers, their
deviations are not large in magnitude. In Figure 4.31, we give the histogram of
Studentized residuals of 780 instances for the best BIC model. The histogram shows
that there is almost an equal distribution of instances with positive and negative
residuals. This is also indicated by the MPE value of −0.202%, which is close to
zero. In Figure 4.32, we give the normal probability plot of Studentized residuals
of 780 instances for the best BIC model. Both the histogram and the normal
probability plot suggest that the Studentized residuals approximately follow a
standard normal distribution. This is supported by the Shapiro-Wilk hypothesis
test, which does not reject the null hypothesis of normality at the 10% significance
level. The MAPE value indicates that the route-length estimates from the regression
model differ by an average of 4.008% from the SZVNS route lengths. The adjusted
R² value, MAPE, and the result of the Shapiro-Wilk hypothesis test indicate that
the model performs well in predicting the SZVNS route lengths.
All three best subset models based on adjusted R², Mallows's Cp, and BIC
performed similarly, as indicated by their adjusted R² values, MAPE values, and the
results of the Shapiro-Wilk hypothesis test. The MAPE value was 3.983% for the best
subset model on 13 variables according to adjusted R², 3.995% for the best subset
model on 10 variables according to Mallows's Cp, and 4.008% for the best subset
model on eight variables according to BIC. The linear model with eight variables
predicts the SZVNS route lengths for CETSP instances with properties similar to
the first group of 780 instances almost as well as the larger linear models
considered. All eight variables in the model are significant at the 0.1% level,
which indicates that these variables are essential in predicting the SZVNS route
lengths.
4.5 Conclusions and Future Directions
We showed that it is possible to have a fast and accurate method of predicting
CETSP route lengths using a linear regression model without generating the actual
routes. A fitted model predicts the route lengths generated by the specific routing
algorithm on which it was trained. However, the overall regression framework
can be adapted to any algorithm or heuristic. We showed that the SZVNS route
lengths for the CETSP could be predicted with an average error of about 4% using
a linear regression model. Therefore, we recommend using the linear regression
model with A, MinP, SumMinP, SumMaxP, MinM, VarM, VarX×VarY, and AvgR
as the independent variables for predicting the SZVNS route lengths on CETSP
instances with random node locations and with all customers having the same radius
for the service regions. However, the linear regression model did not perform well
for instances with node locations generated in different structured ways.
Route-length predictions for CETSP instances whose node locations are not
generated randomly could be improved with methods that are also fast to compute.
We need to represent these instances in the regression models using more
appropriate variables.
Chapter 5: Intersection Inspection Rural Postman Problem on a Mixed
Graph
5.1 Introduction
The Rural Postman Problem (RPP) is an important arc routing problem. In
an RPP, we need to find the shortest way of connecting a given set of required
street segments to form a full route. It has many real-world applications including
street sweeping, meter reading, postal delivery, and snow plowing. Frederickson
et al. (1978) showed that the RPP on a mixed graph is NP-complete. Corberán
et al. (2014) described problem formulations for directed, mixed, and windy graphs.
There has also been interest in the RPP with turn penalties which was introduced
by Benavent and Soler (1999). The idea is that the quality of a tour is determined
not only by the length of the tour but also by the types of turns that are made at
street intersections. For example, most truck drivers prefer to travel straight ahead
for as long as possible. Turning left or even turning right can be dangerous and time
consuming. U-turns are impossible for long trucks. In snow plowing in the U.S.,
left turns and U-turns are often discouraged because they take more time and push
snow into an intersection. So, along with solving an RPP, there is a cost associated
Figure 5.1: (Color online) An intersection with two left turns.
with each turn that needs to be taken into account. Clossey et al. (2001) used a
two-stage approach to deal with turn penalties. In the first stage, the problem was
solved as an RPP to obtain a Eulerian graph. In the second stage, an end-pairing
algorithm was used to generate a Eulerian tour taking into consideration the turn
penalties. Cerrone et al. (2019) further improved this two-stage approach using
heuristics.
We introduce another important variant of the RPP involving turns. City
governments and highway authorities carry out road inspections to decide which
street segments to repair by taking videos using a camera mounted on a vehicle.
This process is similar to Google generating street view images. The vehicle taking
Figure 5.2: (Color online) An intersection with two right turns.
the videos needs to proceed straight or take a left turn to cover an intersection fully.
A right turn does not always capture an intersection fully and a U-turn does not
cross an intersection. In Figure 5.1, we show an intersection with two left turns. A
blue dot on a street segment or an intersection represents that a proper pass by the
vehicle is required through that region to cover them. One left turn is going from
east to south, covering the street segment to the east and the left side of the street
segment to the south. The other left turn is going from south to west, covering the
street segment to the west and the right side of the street segment to the south.
Only one pass in either direction is required to cover a street segment unless there
is a concrete barrier through the middle of the street segment that blocks the view
Figure 5.3: (Color online) Map of Dupont Circle in Washington, DC.
of the camera mounted on the vehicle. Each of these two left turns covers the
intersection. However, the street segment to the north is not covered. In Figure 5.2,
we show an intersection with two right turns. One right turn is going from south to
east, covering the street segment to the east and the right side of the street segment
to the south. The other right turn is going from north to west, covering the street
segment to the north and the street segment to the west. These two right turns do
not cover the intersection and the left side of the street segment to the south. In
Figure 5.3, we show the map of Dupont Circle in Washington, DC. Even though
there are multiple lanes in the circle, just one pass is required through any lane to
cover the circle as denoted by the red line. The red dots denote the various street
Figure 5.4: (Color online) The RPP and the IIRPP solutions on a small grid-like
street network.
segments connected to the circle that need to be traversed since they are not covered
during the pass through the circle. It is not required to cover the circle in one pass,
rather segments of the circle can be covered by different sections of the route.
The Intersection Inspection Rural Postman Problem (IIRPP) is a hybrid of
arc routing and node routing problems. We address the problem on a mixed graph,
the most general case. In addition to solving an RPP, we have to make sure there
is at least one left turn or straight turn at an intersection that is required to be
inspected. We consider the RPP as a special case of the IIRPP when there are no
intersections to be inspected. Therefore, the IIRPP is at least as difficult to solve as
the RPP. Unlike the RPP with turn penalties, there is no route cost associated with
the turns in the IIRPP. In Figure 5.4, we show the RPP and the IIRPP solutions on
a small grid-like street network. Figure 5.4a shows the street network. The blue lines
are the edges (two-way street segments), and the blue arrows are the arcs (one-way
street segments). The red lines are the required edges, and the red arrows are the
required arcs. The green node is the starting and ending point of the route. The red
nodes are the intersections to be inspected. The length of each edge or arc is 1 unit.
Figure 5.4b shows the optimal RPP solution (a → b → c → d → e → f) with a route
length of 6 units. Figure 5.4c shows the optimal IIRPP solution (a → b → c → d →
e → f → g → h → i → j) with a route length of 10 units. Figure 5.4d shows the
optimal IIRPP solution (a → b → c → d → e → f → g → h → i → j → k → l) with
a route length of 12 units. A route that is feasible for the RPP might not be feasible
for the IIRPP. A feasible IIRPP route is always feasible for the RPP. Therefore, the
optimal IIRPP objective value cannot be less than the optimal RPP objective value.
In Figure 5.4, we demonstrate that the optimal IIRPP objective value, for a street
network with a given set of required edges, required arcs, and intersections to be
inspected, is a non-decreasing function as we add more intersections for inspection.
Figure 5.5: A mixed graph G (street network) with four nodes. Costs are shown
adjacent to the two arcs and one edge.
5.2 Problem Formulations on a Mixed Graph
Consider a street network as a mixed graph denoted by $G = (V, E \cup A)$, where
$E$ denotes the set of edges (two-way street segments), $A$ denotes the set of
arcs (one-way street segments), and $V$ denotes the set of nodes (intersections). Let
$c_{ij} \geq 0$ be the traversal cost (length) of street segment $(i, j)$, where $i \in V$ and $j \in V$.
It is possible to travel from $i$ to $j$ and also from $j$ to $i$ on an edge $(i, j) \in E$. However,
it is only possible to travel from $i$ to $j$ and not from $j$ to $i$ on an arc $(i, j) \in A$. In
Figure 5.5, we show a mixed graph $G$ (street network) with four nodes, two arcs,
one edge, and traversal costs.
Let $E_R \subseteq E$ and $A_R \subseteq A$ be the sets of required edges and arcs, respectively,
that need to be video recorded.
$V_R = \{i \in V \mid \exists j \in V, (i, j) \in A_R \text{ or } (j, i) \in A_R \text{ or } (i, j) \in E_R \text{ or } (j, i) \in E_R\} \subseteq V$
is the set of nodes defining the required edges and arcs. $G_R = (V_R, E_R \cup A_R)$ is
the graph induced by the required edges and arcs. The vertex sets of the connected
components of $G_R$ are denoted by $V_1, \ldots, V_p$.
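The definitions of the required node set and of the connected components of the required graph can be illustrated with the following minimal Python sketch. The function name and the small example are hypothetical, and the sketch assumes that arc directions are ignored when determining connectivity.

```python
def required_components(req_edges, req_arcs):
    """Return the required node set V_R and the vertex sets V_1, ..., V_p
    of the connected components of the required graph G_R."""
    adjacency = {}
    # Assumption: every required segment links its two end nodes,
    # regardless of the direction of travel.
    for i, j in list(req_edges) + list(req_arcs):
        adjacency.setdefault(i, set()).add(j)
        adjacency.setdefault(j, set()).add(i)
    v_r = set(adjacency)
    components, seen = [], set()
    for start in sorted(v_r):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first search from an unvisited node
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adjacency[v] - component)
        seen |= component
        components.append(component)
    return v_r, components


# Hypothetical example: two required edges and one required arc.
v_r, components = required_components(req_edges=[(1, 2), (5, 6)],
                                      req_arcs=[(2, 3)])
print(sorted(v_r))                      # nodes touched by a required segment
print([sorted(c) for c in components])  # here p = 2 components
```

When the required graph has more than one component, a route has to use non-required street segments to connect them, which is what makes the RPP harder than a postman problem on a connected required graph.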
Let $\delta^+(i) = \{j \in V \mid (i, j) \in A\}$ be the set of nodes connected to node $i$ by an
outgoing arc from $i$. Let $\delta^-(i) = \{j \in V \mid (j, i) \in A\}$ be the set of nodes connected
to node $i$ by an incoming arc to $i$. Let $\delta(i) = \{j \in V \mid (i, j) \in E \text{ or } (j, i) \in E\}$ be the
set of nodes connected to node $i$ by an edge. $V_I \subseteq V$ is the set of nodes that
need to be recorded. For practicality, we assume that $V_I \subseteq V_R$. Let $o \in V_R \setminus V_I$
denote the starting and ending node of the route. At node $i$, the sets
\begin{align*}
RT(i) &= \{(k, i, j) \mid k \in \delta^-(i) \cup \delta(i),\ j \in \delta^+(i) \cup \delta(i),\ (k, i)(i, j) \text{ is a right turn}\}, \\
LT(i) &= \{(k, i, j) \mid k \in \delta^-(i) \cup \delta(i),\ j \in \delta^+(i) \cup \delta(i),\ (k, i)(i, j) \text{ is a left turn}\}, \\
ST(i) &= \{(k, i, j) \mid k \in \delta^-(i) \cup \delta(i),\ j \in \delta^+(i) \cup \delta(i),\ (k, i)(i, j) \text{ is a straight turn}\}
\end{align*}
denote the right turns, left turns, and straight turns, respectively, using 3-tuples.
We define the non-negative integer decision variable $x_{ij}$ to be the number of times
street segment $(i, j)$ is traversed from $i$ to $j$ in the route. The RPP formulation on
the graph $G$ is given below.
The objective function (5.1) minimizes the total cost (length) of the route.
Constraint (5.2) ensures that the depot is a part of the route. Constraints (5.3)
ensure that the required edges are a part of the route. Constraints (5.4) are the
flow conservation constraints. Constraints (5.5) are the disjoint subtour elimination
constraints. Constraints (5.6) define the decision variables for the required arcs.
Constraints (5.7) define the decision variables for the non-required arcs. Constraints
(5.8) define the decision variables for the edges.
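\begin{align}
\text{(RPP)} \quad \min \quad & \sum_{(i,j) \in E \cup A} c_{ij} x_{ij} + \sum_{(i,j) \in E} c_{ij} x_{ji} \tag{5.1} \\
\text{s.t.} \quad & \sum_{j \in \delta^+(o) \cup \delta(o)} x_{oj} \geq 1 \tag{5.2} \\
& x_{ij} + x_{ji} \geq 1 && \forall (i, j) \in E_R \tag{5.3} \\
& \sum_{j \in \delta^+(i) \cup \delta(i)} x_{ij} - \sum_{j \in \delta^-(i) \cup \delta(i)} x_{ji} = 0 && \forall i \in V \tag{5.4} \\
& \sum_{i \in S} \sum_{j \in (\delta^+(i) \cup \delta(i)) \setminus S} x_{ij} \geq 1 && \forall S = \bigcup_{l \in Q} V_l,\ Q \subset \{1, \ldots, p\} \tag{5.5} \\
& x_{ij} \geq 1 \text{ and integer} && \forall (i, j) \in A_R \tag{5.6} \\
& x_{ij} \geq 0 \text{ and integer} && \forall (i, j) \in A \setminus A_R \tag{5.7} \\
& x_{ij}, x_{ji} \geq 0 \text{ and integer} && \forall (i, j) \in E \tag{5.8}
\end{align}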
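To make the roles of the constraints concrete, the following minimal Python sketch evaluates a candidate solution against the objective (5.1) and constraints (5.2), (5.3), (5.4), and (5.6). It is an illustration under stated assumptions, not a solver: the function name, the three-node street network, and the unit costs are hypothetical, and the subtour elimination constraints (5.5) are not checked.

```python
def rpp_check(x, edges, arcs, req_edges, req_arcs, depot, cost):
    """x maps a directed traversal (i, j) to the number of times it is used.
    Returns (route cost, []) if no checked constraint is violated,
    otherwise (None, list of violated constraints)."""
    violations = []
    for (i, j), times in x.items():
        allowed = (i, j) in arcs or (i, j) in edges or (j, i) in edges
        if times > 0 and not allowed:
            violations.append(f"traversal {(i, j)} is not allowed")
    if sum(t for (i, _), t in x.items() if i == depot) < 1:
        violations.append("(5.2) the route does not leave the depot")
    for i, j in req_edges:
        if x.get((i, j), 0) + x.get((j, i), 0) < 1:
            violations.append(f"(5.3) required edge {(i, j)} is not traversed")
    nodes = {v for segment in list(edges) + list(arcs) for v in segment}
    for v in nodes:
        out_flow = sum(t for (i, _), t in x.items() if i == v)
        in_flow = sum(t for (_, j), t in x.items() if j == v)
        if out_flow != in_flow:
            violations.append(f"(5.4) flow is not conserved at node {v}")
    for i, j in req_arcs:
        if x.get((i, j), 0) < 1:
            violations.append(f"(5.6) required arc {(i, j)} is not traversed")
    if violations:
        return None, violations
    # Objective (5.1): an edge costs the same in either direction.
    total = sum((cost[(i, j)] if (i, j) in cost else cost[(j, i)]) * t
                for (i, j), t in x.items())
    return total, violations


# Hypothetical network: edges (1,2) and (2,3), arc (3,1), unit costs.
edges, arcs = {(1, 2), (2, 3)}, {(3, 1)}
cost = {(1, 2): 1, (2, 3): 1, (3, 1): 1}
tour = {(1, 2): 1, (2, 3): 1, (3, 1): 1}   # 1 -> 2 -> 3 -> 1
out_and_back = {(1, 2): 1, (2, 1): 1}      # 1 -> 2 -> 1

print(rpp_check(tour, edges, arcs, {(1, 2)}, {(3, 1)}, 1, cost))
print(rpp_check(out_and_back, edges, arcs, {(1, 2)}, {(3, 1)}, 1, cost))
```

The out-and-back route is cheaper but misses the required arc, so it violates (5.6); the three-segment tour satisfies all of the checked constraints.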
5.2.1 IIRPP Formulation using Node Transformations
Let $G^1 = (V^1, A^1)$ denote the transformed graph, where $V^1 = V^1_E \cup V^1_A$ is the
set of nodes and $A^1 = A^1_E \cup A^1_A \cup A^1_T$ is the set of arcs.
$V^1_A = \{n_{i,j,i}, n_{i,j,j} \mid (i, j) \in A\}$ and
$V^1_E = \{n_{i,j,i}, n_{j,i,i}, n_{i,j,j}, n_{j,i,j} \mid (i, j) \in E\}$ are the sets of nodes in $G^1$
corresponding to each arc $(i, j) \in A$ and each edge $(i, j) \in E$, respectively, in the
original graph $G$.
$A^1_E = \{(n_{i,j,i}, n_{i,j,j}), (n_{j,i,j}, n_{j,i,i}), (n_{j,i,i}, n_{i,j,i}), (n_{i,j,j}, n_{j,i,j}) \mid (i, j) \in E\}$ and
$A^1_A = \{(n_{i,j,i}, n_{i,j,j}) \mid (i, j) \in A\}$ are the sets of arcs in $G^1$ corresponding to
each edge $(i, j) \in E$ and each arc $(i, j) \in A$, respectively, in the original graph $G$.
The cost of the arc $(n_{i,j,i}, n_{i,j,j}) \in A^1_A$ is equal to the cost of the arc $(i, j) \in A$.
The costs of the arcs $(n_{i,j,i}, n_{i,j,j}) \in A^1_E$ and $(n_{j,i,j}, n_{j,i,i}) \in A^1_E$ are equal to the
cost of the edge $(i, j) \in E$. Arcs $(n_{j,i,i}, n_{i,j,i}) \in A^1_E$ and $(n_{i,j,j}, n_{j,i,j}) \in A^1_E$
represent the U-turns in $G$ and have a cost of zero.
$A^1_T = \{(n_{k,i,i}, n_{i,j,i}) \mid i \in V, (k, i, j) \in RT(i) \cup LT(i) \cup ST(i)\}$ is the
set of arcs in $G^1$ representing the right turns, left turns, and straight turns in the
original graph $G$, each with a cost of zero.
Figure 5.6: Transformed graph $G^1$ from the original graph $G$ shown in Figure 5.5.
In Figure 5.6, we show the transformed graph $G^1$ from the original graph $G$
shown in Figure 5.5. Node 1 in $G$ is split into nodes $n_{121}$, $n_{211}$, $n_{141}$,
and $n_{311}$ in $G^1$. Node 2 in $G$ is split into nodes $n_{122}$ and $n_{212}$ in
$G^1$. Nodes 3 and 4 in $G$ are represented as nodes $n_{313}$ and $n_{144}$,
respectively, in $G^1$. Arcs (1,4) and (3,1) in $G$ are represented as arcs
$(n_{141}, n_{144})$ with a cost of $c_{14}$ and $(n_{313}, n_{311})$ with a cost of
$c_{31}$, respectively, in $G^1$. Edge (1,2) in $G$ is represented as arcs
$(n_{121}, n_{122})$ and $(n_{212}, n_{211})$, each with a cost of $c_{12}$, and U-turn
arcs $(n_{211}, n_{121})$ and $(n_{122}, n_{212})$, each with a cost of zero, in $G^1$.
The right turn (3,1)(1,2), left turn (3,1)(1,4), and straight turn (2,1)(1,4) in $G$
are represented as arcs $(n_{311}, n_{121})$, $(n_{311}, n_{141})$, and
$(n_{211}, n_{141})$, respectively, in $G^1$, each with a cost of zero.
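The node transformation can be sketched in a few lines of Python. This is only an illustrative construction under stated assumptions: a node $n_{a,b,c}$ is encoded as the tuple (a, b, c), the turn sets are supplied as input rather than derived from street geometry, and the function name and the numeric costs are hypothetical. The example reproduces the graph of Figure 5.5 as described in the text.

```python
def node_transform(edges, arcs, cost, turns):
    """Build the node-transformed graph from a mixed graph.

    edges: undirected pairs (i, j); arcs: directed pairs (i, j);
    cost: cost of each edge or arc, keyed as given;
    turns: 3-tuples (k, i, j) taken from RT(i), LT(i), and ST(i).
    Returns (nodes, arc_cost), where arc_cost maps each new arc to its cost.
    """
    nodes, arc_cost = set(), {}
    for (i, j) in arcs:
        nodes |= {(i, j, i), (i, j, j)}
        arc_cost[((i, j, i), (i, j, j))] = cost[(i, j)]
    for (i, j) in edges:
        nodes |= {(i, j, i), (j, i, i), (i, j, j), (j, i, j)}
        arc_cost[((i, j, i), (i, j, j))] = cost[(i, j)]  # traverse i -> j
        arc_cost[((j, i, j), (j, i, i))] = cost[(i, j)]  # traverse j -> i
        arc_cost[((j, i, i), (i, j, i))] = 0             # U-turn at i
        arc_cost[((i, j, j), (j, i, j))] = 0             # U-turn at j
    for (k, i, j) in turns:
        arc_cost[((k, i, i), (i, j, i))] = 0             # turn (k,i)(i,j)
    return nodes, arc_cost


# Graph of Figure 5.5 as described in the text; the costs are made up.
edges = [(1, 2)]
arcs = [(1, 4), (3, 1)]
cost = {(1, 2): 10, (1, 4): 7, (3, 1): 5}
turns = [(3, 1, 2), (3, 1, 4), (2, 1, 4)]  # right, left, straight

g1_nodes, g1_arcs = node_transform(edges, arcs, cost, turns)
```

For this input the sketch yields the eight nodes and nine arcs listed in the description of Figure 5.6: two arcs from the original arcs, four from the edge (including two U-turns), and three turn arcs.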
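\begin{align}
\text{(IIRPP-F1)} \quad \min \quad & \sum_{(i,j) \in E \cup A} c_{ij} x_{n_{i,j,i} n_{i,j,j}} + \sum_{(i,j) \in E} c_{ij} x_{n_{j,i,j} n_{j,i,i}} \tag{5.9} \\
\text{s.t.} \quad & \sum_{j \in \delta^+(o) \cup \delta(o)} x_{n_{o,j,o} n_{o,j,j}} \ge 1 \tag{5.10} \\
& x_{n_{i,j,i} n_{i,j,j}} + x_{n_{j,i,j} n_{j,i,i}} \ge 1 && \forall (i,j) \in E_R \tag{5.11} \\
& x_{n_{i,j,i} n_{i,j,j}} - \sum_{k \in \delta^-(i) \cup \delta(i)} x_{n_{k,i,i} n_{i,j,i}} = 0 && \forall n_{i,j,i} \in V^1_A \tag{5.12} \\
& \sum_{k \in \delta^+(j) \cup \delta(j)} x_{n_{i,j,j} n_{j,k,j}} - x_{n_{i,j,i} n_{i,j,j}} = 0 && \forall n_{i,j,j} \in V^1_A \tag{5.13} \\
& x_{n_{i,j,i} n_{i,j,j}} - x_{n_{j,i,i} n_{i,j,i}} - \sum_{k \in \delta^-(i) \cup \delta(i) \setminus \{j\}} x_{n_{k,i,i} n_{i,j,i}} = 0 && \forall n_{i,j,i} \in V^1_E \tag{5.14} \\
& \sum_{k \in \delta^+(j) \cup \delta(j) \setminus \{i\}} x_{n_{i,j,j} n_{j,k,j}} + x_{n_{i,j,j} n_{j,i,j}} - x_{n_{i,j,i} n_{i,j,j}} = 0 && \forall n_{i,j,j} \in V^1_E \tag{5.15} \\
& \sum_{(k,i,j) \in LT(i) \cup ST(i)} x_{n_{k,i,i} n_{i,j,i}} \ge 1 && \forall i \in V_I \tag{5.16} \\
& \sum_{n_{k,i,i} \in S^1} \ \sum_{(n_{k,i,i}, n_{i,j,i}) \in A^1_T :\, n_{i,j,i} \notin S^1} x_{n_{k,i,i} n_{i,j,i}} \ge 1 && \forall S^1 = \cup_{l \in Q} V^1_l,\ Q \subset \{1, \dots, p\} \tag{5.17} \\
& x_{n_{i,j,i} n_{i,j,j}} \ge 1 \text{ and integer} && \forall (i,j) \in A_R \tag{5.18} \\
& x_{n_{i,j,i} n_{i,j,j}} \ge 0 \text{ and integer} && \forall (i,j) \in A \setminus A_R \tag{5.19} \\
& x_{n_{i,j,i} n_{i,j,j}},\ x_{n_{j,i,j} n_{j,i,i}},\ x_{n_{j,i,i} n_{i,j,i}},\ x_{n_{i,j,j} n_{j,i,j}} \ge 0 \text{ and integer} && \forall (i,j) \in E \tag{5.20} \\
& x_{n_{k,i,i} n_{i,j,i}} \ge 0 \text{ and integer} && \forall (n_{k,i,i}, n_{i,j,i}) \in A^1_T \tag{5.21}
\end{align}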
Let $V^1_R = V^1_{E_R} \cup V^1_{A_R} \subseteq V^1$.
$V^1_{E_R} = \{n_{i,j,i}, n_{j,i,i}, n_{i,j,j}, n_{j,i,j} \mid (i,j) \in E_R\} \subseteq V^1_E$
is the set of nodes in $G^1$ corresponding to the required edges in $G$.
$V^1_{A_R} = \{n_{i,j,i}, n_{i,j,j} \mid (i,j) \in A_R\} \subseteq V^1_A$ is the set of
nodes in $G^1$ corresponding to the required arcs in $G$. Let
$A^1_R = A^1_{E_R} \cup A^1_{A_R} \cup A^1_{T_R} \subseteq A^1$.
$A^1_{A_R} = \{(n_{i,j,i}, n_{i,j,j}) \mid (i,j) \in A_R\} \subseteq A^1_A$ is the set of
arcs in $G^1$ corresponding to the required arcs in $G$.
$A^1_{E_R} = \{(n_{i,j,i}, n_{i,j,j}), (n_{j,i,j}, n_{j,i,i}), (n_{j,i,i}, n_{i,j,i}), (n_{i,j,j}, n_{j,i,j}) \mid (i,j) \in E_R\} \subseteq A^1_E$
is the set of arcs in $G^1$ corresponding to the required edges in $G$.
$A^1_{T_R} = \{(n_{k,i,i}, n_{i,j,i}) \mid i \in V,\ (k,i,j) \in RT(i) \cup LT(i) \cup ST(i),\ n_{k,i,i} \in V^1_R,\ n_{i,j,i} \in V^1_R\} \subseteq A^1_T$
is the set of arcs in $G^1$ corresponding to the turns between required arcs or
edges in $G$. Let $G^1_R = (V^1_R, A^1_R)$ be the graph induced from $G^1$. The vertex
sets of the connected components of $G^1_R$ are denoted by $V^1_1, \dots, V^1_p$. We
define the non-negative integer decision variable $x_{n_{a,b,c} n_{d,e,f}}$ to be the
number of times arc $(n_{a,b,c}, n_{d,e,f})$ is traversed in the route on $G^1$. The
IIRPP formulation on the transformed graph $G^1$ (IIRPP-F1) based on the original
graph $G$ is given above.
The objective function (5.9) minimizes the total cost (length) of the route.
Constraint (5.10) ensures that the depot in G is a part of the route. Constraints
(5.11) ensure that the required edges in G are a part of the route. Constraints (5.12),
(5.13), (5.14), and (5.15) are the flow conservation constraints for the nodes in G1.
Constraints (5.16) ensure that the nodes in G that need to be video recorded are
covered by at least one left turn or straight turn. Constraints (5.17) are the disjoint
subtour elimination constraints. Constraints (5.18) define the decision variables
corresponding to the required arcs in G. Constraints (5.19) define the decision
variables corresponding to the non-required arcs in G. Constraints (5.20) define the
decision variables corresponding to the edges in G. Constraints (5.21) define the
decision variables for the arcs in G1 representing the right turns, left turns, and
straight turns in G. The formulation is not restricted to covering an intersection
with only left turns and straight turns. If the field of view of the camera improves
in the future so that two right turns can cover an intersection (the two right turns
shown in Figure 5.2 might be able to cover the intersection with an improved camera),
we would modify constraints (5.16) to
$\sum_{(k,i,j) \in LT(i) \cup ST(i)} x_{n_{k,i,i} n_{i,j,i}} + \frac{1}{2} \sum_{(k,i,j) \in RT(i)} x_{n_{k,i,i} n_{i,j,i}} \ge 1$
for all $i \in V_I$.
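To make the effect of the one-half weight concrete, the short Python sketch below evaluates the left-hand side of the coverage constraint for a single intersection. The function name, its arguments, and the turn counts are hypothetical. The turn sets follow the Figure 5.5 example, where (3,1)(1,4) is a left turn, (2,1)(1,4) is a straight turn, and (3,1)(1,2) is a right turn at node 1.

```python
def is_covered(i, turn_counts, left, straight, right, right_weight=0.0):
    """Return True if intersection i satisfies the coverage constraint.

    turn_counts maps a turn (k, i, j) to the number of times it is made.
    right_weight = 0.0 corresponds to constraint (5.16) as stated;
    right_weight = 0.5 corresponds to the modified constraint in which
    two right turns together cover an intersection.
    """
    score = 0.0
    for (k, node, j), count in turn_counts.items():
        if node != i:
            continue
        if (k, node, j) in left or (k, node, j) in straight:
            score += count
        elif (k, node, j) in right:
            score += right_weight * count
    return score >= 1.0


left, straight, right = {(3, 1, 4)}, {(2, 1, 4)}, {(3, 1, 2)}
two_right_turns = {(3, 1, 2): 2}   # hypothetical route making the right turn twice
one_left_turn = {(3, 1, 4): 1}

covered_original = is_covered(1, two_right_turns, left, straight, right)
covered_modified = is_covered(1, two_right_turns, left, straight, right, 0.5)
covered_left = is_covered(1, one_left_turn, left, straight, right)
```

Under the original constraint the two right turns contribute nothing, so node 1 is not covered. With the weight of one half they sum to exactly one and the constraint is met.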
5.2.2 IIRPP Formulation using Path Transformations
Let $G^2 = (V^2, A^2)$ denote the transformed graph, where
$V^2 = V \cup V^2_E \cup V^2_A$ is the set of nodes and $A^2 = A^2_E \cup A^2_A$ is the
set of arcs. $V^2_A = \{n_{i,j} \mid (i,j) \in A\}$ and
$V^2_E = \{n_{i,j}, n_{j,i} \mid (i,j) \in E\}$ are the sets of nodes in $G^2$
corresponding to each arc $(i,j) \in A$ and each edge $(i,j) \in E$, respectively, in
the original graph $G$. $A^2_A = \{(i, n_{i,j}), (n_{i,j}, j) \mid (i,j) \in A\}$ and
$A^2_E = \{(i, n_{i,j}), (n_{i,j}, j), (j, n_{j,i}), (n_{j,i}, i) \mid (i,j) \in E\}$ are
the sets of arcs in $G^2$ corresponding to each arc $(i,j) \in A$ and each edge
$(i,j) \in E$, respectively, in the original graph $G$. The cost of each of the arcs
$(i, n_{i,j}) \in A^2_A$ and $(n_{i,j}, j) \in A^2_A$ is half the cost of the arc
$(i,j) \in A$. The cost of each of the arcs $(i, n_{i,j}) \in A^2_E$,
$(n_{i,j}, j) \in A^2_E$, $(j, n_{j,i}) \in A^2_E$, and $(n_{j,i}, i) \in A^2_E$ is half
the cost of the edge $(i,j) \in E$.
In Figure 5.7, we show the transformed graph $G^2$ from the original graph $G$
shown in Figure 5.5. All four nodes in $G$ are also in $G^2$. Nodes $n_{12}$,
$n_{21}$, $n_{14}$, and $n_{31}$ are added in $G^2$. Arc (1,4) in $G$ is replaced by
arcs $(1, n_{14})$ and $(n_{14}, 4)$, each with a cost of $c_{14}/2$ in $G^2$. Arc
(3,1) in $G$ is replaced by arcs $(3, n_{31})$ and $(n_{31}, 1)$, each with a cost of
$c_{31}/2$ in $G^2$. Edge (1,2) in $G$ is replaced by arcs $(1, n_{12})$,
$(n_{12}, 2)$, $(2, n_{21})$, and $(n_{21}, 1)$, each with a cost of $c_{12}/2$ in
$G^2$.

Figure 5.7: Transformed graph $G^2$ from the original graph $G$ shown in Figure 5.5.
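The path transformation is simpler to build than the node transformation, as the following Python sketch suggests. It is an illustration only: the added node $n_{i,j}$ is encoded as the tuple ("n", i, j), and the function name and the numeric costs are hypothetical. The input is again the graph of Figure 5.5 as described in the text.

```python
def path_transform(nodes, edges, arcs, cost):
    """Split every arc, and every direction of every edge, at a midpoint node.

    Each resulting arc carries half the cost of the original edge or arc.
    Returns (new_nodes, arc_cost), where arc_cost maps each new arc to its cost.
    """
    new_nodes = set(nodes)
    arc_cost = {}
    for (i, j) in arcs:
        mid = ("n", i, j)
        new_nodes.add(mid)
        arc_cost[(i, mid)] = cost[(i, j)] / 2
        arc_cost[(mid, j)] = cost[(i, j)] / 2
    for (i, j) in edges:
        # An edge is split once per direction of travel.
        for (a, b) in ((i, j), (j, i)):
            mid = ("n", a, b)
            new_nodes.add(mid)
            arc_cost[(a, mid)] = cost[(i, j)] / 2
            arc_cost[(mid, b)] = cost[(i, j)] / 2
    return new_nodes, arc_cost


# Graph of Figure 5.5 as described in the text; the costs are made up.
nodes = {1, 2, 3, 4}
edges = [(1, 2)]
arcs = [(1, 4), (3, 1)]
cost = {(1, 2): 10, (1, 4): 7, (3, 1): 5}

g2_nodes, g2_arcs = path_transform(nodes, edges, arcs, cost)
```

The result has eight nodes and eight arcs, compared with eight nodes and nine arcs for the node transformation of the same graph. Turns are not represented by arcs here, which is why the formulation on this graph needs separate turn variables.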
Let $V^2_R = V_R \cup V^2_{E_R} \cup V^2_{A_R} \subseteq V^2$.
$V^2_{E_R} = \{n_{i,j}, n_{j,i} \mid (i,j) \in E_R\} \subseteq V^2_E$ is the set of nodes
in $G^2$ corresponding to the required edges in $G$.
$V^2_{A_R} = \{n_{i,j} \mid (i,j) \in A_R\} \subseteq V^2_A$ is the set of nodes in $G^2$
corresponding to the required arcs in $G$. Let
$A^2_R = A^2_{E_R} \cup A^2_{A_R} \subseteq A^2$.
$A^2_{A_R} = \{(i, n_{i,j}), (n_{i,j}, j) \mid (i,j) \in A_R\} \subseteq A^2_A$ is the set
of arcs in $G^2$ corresponding to the required arcs in $G$.
$A^2_{E_R} = \{(i, n_{i,j}), (n_{i,j}, j), (j, n_{j,i}), (n_{j,i}, i) \mid (i,j) \in E_R\} \subseteq A^2_E$
is the set of arcs in $G^2$ corresponding to the required edges in $G$. Let
$G^2_R = (V^2_R, A^2_R)$ be the graph induced from $G^2$. The vertex sets of the
connected components of $G^2_R$ are denoted by $V^2_1, \dots, V^2_p$. We define the
binary decision variable $z_{n_{a,b} b n_{b,c}}$ denoting the first time arcs
$(n_{a,b}, b)$ and $(b, n_{b,c})$ are traversed consecutively in the route on $G^2$,
i.e., the first time the turn $(a,b)(b,c)$ is made in $G$. We define the non-negative
integer decision variables $y_{n_{a,b} b}$ and $y_{b n_{b,c}}$ to be the number of
times arcs $(n_{a,b}, b)$ and $(b, n_{b,c})$ are traversed non-consecutively or
consecutively after the first time, respectively, in the route on $G^2$. The IIRPP
formulation on the transformed graph $G^2$ (IIRPP-F2) based on the original graph
$G$ is given below.
The objective function (5.22) minimizes the total cost (length) of the route.
Constraint (5.23) ensures that the depot in G is a part of the route. Constraints
(5.24) ensure that the required arcs in G are a part of the route. Constraints (5.25)
ensure that the required edges in G are a part of the route. Constraints (5.26) and
(5.27) are the flow conservation constraints for the nodes in G2. Constraints (5.28)
ensure that the nodes in G that need to be video recorded are covered by at least one
left turn or straight turn. Constraints (5.29) are the disjoint subtour elimination
constraints. Constraints (5.30) define the decision variables corresponding to the
arcs in G. Constraints (5.31) define the decision variables corresponding to the
edges in G. Constraints (5.32) define the decision variables corresponding to the
right turns, left turns, and straight turns in G. The formulation is not restricted
to covering an intersection with only left turns and straight turns. If the field
of view of the camera improves in the future so that two right turns can cover an
intersection, we would modify constraints (5.28) to
$\sum_{(k,i,j) \in LT(i) \cup ST(i)} z_{n_{k,i} i n_{i,j}} + \frac{1}{2} \sum_{(k,i,j) \in RT(i)} z_{n_{k,i} i n_{i,j}} \ge 1$
for all $i \in V_I$.
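\begin{align}
\text{(IIRPP-F2)} \quad \min \quad & \sum_{(i,j) \in E} \frac{1}{2} c_{ij} \Big( y_{j n_{j,i}} + \sum_{k \in \delta^-(j) \cup \delta(j) \setminus \{i\}} z_{n_{k,j} j n_{j,i}} + y_{n_{j,i} i} + \sum_{k \in \delta^+(i) \cup \delta(i) \setminus \{j\}} z_{n_{j,i} i n_{i,k}} \Big) \notag \\
& + \sum_{(i,j) \in E \cup A} \frac{1}{2} c_{ij} \Big( y_{i n_{i,j}} + \sum_{k \in \delta^-(i) \cup \delta(i) \setminus \{j\}} z_{n_{k,i} i n_{i,j}} + y_{n_{i,j} j} + \sum_{k \in \delta^+(j) \cup \delta(j) \setminus \{i\}} z_{n_{i,j} j n_{j,k}} \Big) \tag{5.22} \\
\text{s.t.} \quad & \sum_{j \in \delta^+(o) \cup \delta(o)} y_{o n_{o,j}} \ge 1 \tag{5.23} \\
& y_{i n_{i,j}} + \sum_{k \in \delta^-(i) \cup \delta(i)} z_{n_{k,i} i n_{i,j}} + y_{n_{i,j} j} + \sum_{k \in \delta^+(j) \cup \delta(j)} z_{n_{i,j} j n_{j,k}} \ge 2 && \forall (i,j) \in A_R \tag{5.24} \\
& y_{i n_{i,j}} + \sum_{k \in \delta^-(i) \cup \delta(i) \setminus \{j\}} z_{n_{k,i} i n_{i,j}} + y_{n_{i,j} j} + \sum_{k \in \delta^+(j) \cup \delta(j) \setminus \{i\}} z_{n_{i,j} j n_{j,k}} \notag \\
& + y_{j n_{j,i}} + \sum_{k \in \delta^-(j) \cup \delta(j) \setminus \{i\}} z_{n_{k,j} j n_{j,i}} + y_{n_{j,i} i} + \sum_{k \in \delta^+(i) \cup \delta(i) \setminus \{j\}} z_{n_{j,i} i n_{i,k}} \ge 2 && \forall (i,j) \in E_R \tag{5.25} \\
& y_{n_{i,j} j} + \sum_{k \in \delta^+(j) \cup \delta(j) \setminus \{i\}} z_{n_{i,j} j n_{j,k}} - y_{i n_{i,j}} - \sum_{k \in \delta^-(i) \cup \delta(i) \setminus \{j\}} z_{n_{k,i} i n_{i,j}} = 0 && \forall n_{i,j} \in V^2_E \cup V^2_A \tag{5.26} \\
& \sum_{j \in \delta^+(i) \cup \delta(i)} y_{i n_{i,j}} - \sum_{j \in \delta^-(i) \cup \delta(i)} y_{n_{j,i} i} = 0 && \forall i \in V \tag{5.27} \\
& \sum_{(k,i,j) \in LT(i) \cup ST(i)} z_{n_{k,i} i n_{i,j}} \ge 1 && \forall i \in V_I \tag{5.28} \\
& \sum_{i \in S^2} \ \sum_{j \in (\delta^+(i) \cup \delta(i)) \setminus S^2} \Big( y_{i n_{i,j}} + \sum_{k \in (\delta^-(i) \cup \delta(i)) \cap S^2} z_{n_{k,i} i n_{i,j}} \Big) \ge 1 && \forall S^2 = \cup_{l \in Q} V^2_l,\ Q \subset \{1, \dots, p\} \tag{5.29} \\
& y_{i n_{i,j}},\ y_{n_{i,j} j} \ge 0 \text{ and integer} && \forall (i,j) \in A \tag{5.30} \\
& y_{i n_{i,j}},\ y_{n_{i,j} j},\ y_{j n_{j,i}},\ y_{n_{j,i} i} \ge 0 \text{ and integer} && \forall (i,j) \in E \tag{5.31} \\
& z_{n_{k,i} i n_{i,j}} \in \{0, 1\} && \forall i \in V,\ (k,i,j) \in RT(i) \cup LT(i) \cup ST(i) \tag{5.32}
\end{align}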
Figure 5.8: (Color online) RPP-H route on a small grid-like street network.
5.2.3 Heuristics
We develop three heuristics for the IIRPP. The first heuristic (RPP-H) starts
by solving the RPP optimally on G. For each $i \in V_I$ that is not covered by at
least one left turn or straight turn, the RPP route is locally modified to add the
cheapest possible left turn or straight turn at $i$ without affecting other parts of
the route. Therefore, the RPP objective value is a lower bound for the RPP-H
objective value. The second heuristic (IIRPP-F1-H) starts by solving the IIRPP-F1
optimally on G1 without the disjoint subtour elimination constraints (5.17). The
third heuristic (IIRPP-F2-H) starts by solving the IIRPP-F2 optimally on G2 without
the disjoint subtour elimination constraints (5.29). In both cases, the resulting
disjoint subtours are connected by solving a generalized traveling salesman problem
(Kara et al. 2012) optimally on G1 and G2 to obtain the IIRPP-F1-H and IIRPP-F2-H
solutions, respectively.
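The first step of RPP-H, finding the intersections that the RPP route leaves uncovered, can be sketched as follows. This is a hypothetical helper rather than the dissertation's implementation: the route is assumed to be given as a sequence of nodes that starts and ends at the depot, and the left-turn and straight-turn sets are assumed to be available as 3-tuples. The local repair step that inserts the cheapest covering turn is not shown. The turn made at the depot when the route closes is ignored, since the depot is not in the set of nodes to be recorded.

```python
def uncovered_nodes(route, inspect, left, straight):
    """Return the nodes in `inspect` not covered by a left or straight turn.

    route: node sequence of a closed walk (first and last entries are the depot).
    inspect: nodes that need to be recorded.
    left, straight: sets of turns encoded as 3-tuples (k, i, j).
    """
    covering = left | straight
    covered = set()
    # Consecutive triples (k, i, j) along the route are the turns that are made.
    for k, i, j in zip(route, route[1:], route[2:]):
        if (k, i, j) in covering:
            covered.add(i)
    return [i for i in inspect if i not in covered]


# Turn sets at node 1 from the Figure 5.5 example; node 5 is a made-up depot.
left, straight = {(3, 1, 4)}, {(2, 1, 4)}
missed = uncovered_nodes([5, 3, 1, 2, 5], [1], left, straight)  # right turn at 1
ok = uncovered_nodes([5, 3, 1, 4, 5], [1], left, straight)      # left turn at 1
```

In the first walk the only turn at node 1 is the right turn (3,1)(1,2), so node 1 is reported as uncovered. In the second walk the left turn (3,1)(1,4) covers it.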
In Figure 5.8, we show the RPP-H solution on a small grid-like street network.
The blue lines are the edges and the red lines are the required edges. There are no
arcs in this grid. The green node is the starting and ending point of the route. The
red nodes are the intersections to be inspected. The length of each edge is 1 unit.
The optimal RPP solution (a → b → c → d → e → f) has a route length of six units.
Neither red node 5 nor red node 6 is covered by the optimal RPP solution. Figure
5.8a shows the intermediate RPP-H solution (a → b → b1 → b2 → c → d → e → f)
with a route length of eight units covering node 5. Figure 5.8b shows the final
RPP-H solution (a → b → b1 → b2 → c → c1 → c2 → d → e → f) with a route length
of 10 units covering nodes 5 and 6. In Figure 5.9, we show the IIRPP-F1-H and
IIRPP-F2-H solution on a small grid-like street network. Figures 5.8 and 5.9 use
the same grid. The IIRPP-F1-H and IIRPP-F2-H heuristics produce the same route on
this grid. Figure 5.9a shows the intermediate IIRPP-F1-H and IIRPP-F2-H solution
obtained by optimally solving the IIRPP-F1 and IIRPP-F2 without the respective
disjoint subtour elimination constraints. The intermediate solution has two disjoint
subtours (a → b) and (c → d → e → f) with a total route length of six units. In this
example, the first subtour (a → b) connects the green node and the second subtour
(c → d → e → f) covers all the red edges and red nodes. Figure 5.9b shows the final
IIRPP-F1-H and IIRPP-F2-H solution obtained by optimally solving a generalized
traveling salesman problem connecting the two disjoint subtours. The final
IIRPP-F1-H and IIRPP-F2-H solution (a → b → b1 → c → d → e → f → f1) has a route
length of eight units. The optimal IIRPP solution (1 → 2 → 3 → 6 → 5 → 2 → 1)
on this grid has a route length of six units.

Figure 5.9: (Color online) IIRPP-F1-H and IIRPP-F2-H produce the same route on
a small grid-like street network.
5.3 Computational Experiments
5.3.1 Test Instances
We randomly generate test instances. Each graph (street network) represents
a strongly connected grid-like structure. Each node is located within 10% of its
exact position on the grid in order to produce street segments with slightly different
lengths. The cost assigned to each street segment is proportional to the Euclidean
distance between the nodes defining the street segment, where the average cost for
a street segment is equal to 100. We have four scenarios for the number of nodes
$|V| \in \{5 \times 5, 6 \times 6, 7 \times 7, 8 \times 8\}$. We first fix the position
of the nodes. We then randomly assign a percentage $p_a$ of street segments to be
arcs and a percentage $1 - p_a$ of street segments to be edges, with two scenarios
for $p_a \in \{5\%, 15\%\}$. We randomly remove a percentage $p_d$ of arcs and the
same percentage $p_d$ of edges, with two scenarios for $p_d \in \{5\%, 15\%\}$. We
use only strongly connected graphs. We randomly assign a percentage $p_r$ of the
remaining arcs as required and the same percentage $p_r$ of the remaining edges as
required, with two scenarios for $p_r \in \{20\%, 40\%\}$. Finally, we assign a
percentage $p_i$ of nodes defining the required edges and arcs to be video recorded,
with two scenarios for $p_i \in \{20\%, 40\%\}$. We only use a graph that has at
least one left turn or straight turn possible for each node that needs to be
recorded. Therefore, for each of the four grids, we have 16
($2 \times 2 \times 2 \times 2$) scenarios. We generate 10 graphs for each scenario,
giving a total of 640 ($4 \times 16 \times 10$) test instances. In Figure 5.10, we
show an example of an $8 \times 8$ instance.

Figure 5.10: Example of an $8 \times 8$ instance.
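The generation procedure can be approximated by the Python sketch below. Several details are assumptions rather than facts from the text: node positions are perturbed by up to 10% of the unit grid spacing, arc directions are chosen at random, counts are rounded down, and percentages are passed as integers. The final filter on left or straight turns is omitted because it needs a geometric turn classifier. Rounding down is an assumption, although it is consistent with the averages of 33.00 edges and 4.00 arcs reported for the grids with 25 nodes.

```python
import math
import random


def strongly_connected(nodes, edges, arcs):
    """True if every node can reach every other node in the mixed graph."""
    fwd = {v: set() for v in nodes}
    bwd = {v: set() for v in nodes}
    for (i, j) in arcs:
        fwd[i].add(j)
        bwd[j].add(i)
    for (i, j) in edges:  # an edge can be traversed in both directions
        fwd[i].add(j); fwd[j].add(i)
        bwd[i].add(j); bwd[j].add(i)

    def reaches_all(adj):
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            for w in adj[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == len(nodes)

    return reaches_all(fwd) and reaches_all(bwd)


def generate_instance(side, pa, pd, pr, pi, seed=0, max_tries=1000):
    """Generate one grid instance; pa, pd, pr, pi are integer percentages."""
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(side) for c in range(side)]
    # Assumed reading of "within 10%": jitter of up to 0.1 grid units per axis.
    pos = {v: (v[1] + rng.uniform(-0.1, 0.1), v[0] + rng.uniform(-0.1, 0.1))
           for v in nodes}
    segments = [((r, c), (r, c + 1)) for r in range(side) for c in range(side - 1)]
    segments += [((r, c), (r + 1, c)) for r in range(side - 1) for c in range(side)]
    for _ in range(max_tries):
        order = segments[:]
        rng.shuffle(order)
        n_arcs = pa * len(order) // 100
        arcs, edges = order[:n_arcs], order[n_arcs:]
        # Random orientation of each arc (not specified in the text).
        arcs = [s if rng.random() < 0.5 else (s[1], s[0]) for s in arcs]
        arcs = arcs[pd * len(arcs) // 100:]     # remove a share of the arcs
        edges = edges[pd * len(edges) // 100:]  # remove the same share of edges
        if strongly_connected(nodes, edges, arcs):
            break
    else:
        raise RuntimeError("no strongly connected graph found")
    length = {s: math.dist(pos[s[0]], pos[s[1]]) for s in arcs + edges}
    scale = 100.0 * len(length) / sum(length.values())  # average cost of 100
    cost = {s: scale * d for s, d in length.items()}
    req_arcs = rng.sample(arcs, pr * len(arcs) // 100)
    req_edges = rng.sample(edges, pr * len(edges) // 100)
    candidates = sorted({v for s in req_arcs + req_edges for v in s})
    inspect = rng.sample(candidates, pi * len(candidates) // 100)
    return {"nodes": nodes, "arcs": arcs, "edges": edges, "cost": cost,
            "req_arcs": req_arcs, "req_edges": req_edges, "inspect": inspect}


instance = generate_instance(side=5, pa=5, pd=5, pr=20, pi=20, seed=1)
```

A 5 × 5 grid has 40 street segments, so with 5% arcs and 5% removal this sketch keeps 2 arcs and 37 edges.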
5.3.2 Computational Results
In our computational experiments, we use Gurobi version 8.1, an i7 CPU
with 32GB RAM, and a one-hour time limit. Practically, we do not need to fully
transform the graph G to solve IIRPP-F1 and IIRPP-F2. For G1, we only need
to transform the nodes in VI , and the edges and arcs connected to those nodes.
The remaining nodes, edges, and arcs need not be transformed because that would
unnecessarily increase the problem complexity. For G2, we only need to transform
the edges and arcs defined by the nodes in VI on both ends. The remaining edges
and arcs need not be transformed because that would unnecessarily increase the
problem complexity. Therefore, the transformed graphs G1 and G2 could be mixed
graphs. The RPP is solved optimally for all 640 instances within the time limit.
All three heuristics RPP-H, IIRPP-F1-H, and IIRPP-F2-H are completed on all 640
instances within the time limit. For IIRPP-F1 and IIRPP-F2, we obtain the best
feasible solution and the best lower bound for instances that are not solved optimally
within the time limit. However, for IIRPP-F1 on eight instances and for IIRPP-F2
on one instance, a feasible solution was not found within the time limit.
In Tables 5.1 to 5.4, we compare the transformed graphs G1 and G2 to the
original graph G. The first seven columns give the instance parameters and the
number of nodes, edges, and arcs in G. The graph transformations depend on ER,
AR, and VI . The next five columns give the number of nodes, edges, and arcs in
G1 averaged over 10 instances, the number of instances out of 10 that are optimally
solved (NOS) by IIRPP-F1, and the number of instances out of 10 that are not
optimally solved but for which a feasible solution is obtained (NFS) by IIRPP-F1.
The last five columns give the number of nodes, edges, and arcs in G2 averaged over
10 instances, the number of instances out of 10 that are optimally solved (NOS) by
IIRPP-F2, and the number of instances out of 10 that are not optimally solved but
for which a feasible solution is obtained (NFS) by IIRPP-F2.
In Table 5.1, the 160 instances on the grid with 25 nodes have an average of
33.00 edges and 4.00 arcs. G1 has an average of 45.33 nodes, 22.54 edges, and 63.01
arcs. G2 has an average of 27.37 nodes, 31.87 edges, and 8.63 arcs. Both IIRPP-F1
and IIRPP-F2 produced optimal solutions on all 160 instances. In Table 5.2, the
160 instances on the grid with 36 nodes have an average of 49.25 edges and 5.75
arcs. G1 has an average of 67.63 nodes, 33.33 edges, and 97.89 arcs. G2 has an
average of 40.26 nodes, 47.21 edges, and 14.09 arcs. IIRPP-F1 produced optimal
solutions on 158 instances and feasible solutions on the remaining two instances.
IIRPP-F2 produced optimal solutions on all 160 instances. In Table 5.3, the 160
instances on the grid with 49 nodes have an average of 68.75 edges and 7.75 arcs. G1
has an average of 94.85 nodes, 31.17 edges, and 142.33 arcs. G2 has an average of
55.21 nodes, 65.83 edges, and 19.80 arcs. IIRPP-F1 produced optimal solutions on
150 instances and feasible solutions on nine instances. IIRPP-F1 did not produce
a feasible solution on the remaining one instance. IIRPP-F2 produced optimal
solutions on 159 instances and a feasible solution on the remaining instance. In
Table 5.4, the 160 instances on the grid with 64 nodes have an average of 91.75
edges and 10 arcs. G1 has an average of 126.88 nodes, 35.21 edges, and 194.76 arcs.
G2 has an average of 72.64 nodes, 87.64 edges, and 26.86 arcs. IIRPP-F1 produced
optimal solutions on 112 instances and feasible solutions on 41 instances. IIRPP-F1
did not produce a feasible solution on the remaining seven instances. IIRPP-F2
produced optimal solutions on 142 instances and feasible solutions on 17 instances.
IIRPP-F2 did not produce a feasible solution on the remaining one instance. The
only instance on 64 nodes that did not have a feasible solution by IIRPP-F2 was
also among the seven instances that did not have a feasible solution by IIRPP-F1.
With increasing grid size, the size of the transformed graph increases and it becomes
more difficult to find the optimal solution within the time limit. Moreover, G1
always has more nodes and more arcs on average than G2, which makes IIRPP-F1 harder
to solve.
In Tables 5.5 to 5.8, we show the percentage optimality gap between the best
feasible solution (V) and the best lower bound (B) for the IIRPP formulations.
The percentage gap is $100 \times \frac{V - B}{V}$. In calculating the average
percentage optimality gap, we use only those instances that have feasible solutions
for both IIRPP-F1
and IIRPP-F2. The first four columns give the instance parameters. The next two
columns give the average percentage gap for the IIRPP-F1 for the two pi scenarios.
The last two columns give the average percentage gap for the IIRPP-F2 for the two
pi scenarios.
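The averaging rule described above can be illustrated with a small Python sketch. The function name, the data layout, and all numbers are hypothetical. Each instance is represented by a pair (best feasible solution, best lower bound), or by None when no feasible solution was found within the time limit.

```python
def average_gaps(f1, f2):
    """Average percentage optimality gaps for two formulations.

    f1, f2: per-instance results, each either (V, B) or None.
    Only instances with a feasible solution under both formulations are used.
    """
    both = [(a, b) for a, b in zip(f1, f2) if a is not None and b is not None]

    def gap(v, lb):
        return 100.0 * (v - lb) / v

    avg1 = sum(gap(*a) for a, _ in both) / len(both)
    avg2 = sum(gap(*b) for _, b in both) / len(both)
    return avg1, avg2


# Made-up results for three instances; the second has no feasible F1 solution.
results_f1 = [(2000.0, 1950.0), None, (1000.0, 1000.0)]
results_f2 = [(2000.0, 2000.0), (1500.0, 1500.0), (1000.0, 1000.0)]

avg_f1, avg_f2 = average_gaps(results_f1, results_f2)
```

The second instance is dropped from both averages, so the first formulation averages gaps of 2.5% and 0% while the second averages two gaps of 0%.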
In Table 5.5, all instances are optimal for both IIRPP-F1 and IIRPP-F2.
In Table 5.6, all instances with pi = 0.2 are optimal and the average percentage
gap is 0.26% for instances with pi = 0.4 for IIRPP-F1. All instances are optimal
for IIRPP-F2. In Table 5.7, the average percentage gap is 0.25% for instances
with pi = 0.2 and 0.27% for instances with pi = 0.4 for IIRPP-F1. All instances
are optimal for IIRPP-F2. In Table 5.8, the average percentage gap is 2.45% for
instances with pi = 0.2 and 1.72% for instances with pi = 0.4 for IIRPP-F1. The
average percentage gap is 0.50% for instances with pi = 0.2 and 0.78% for instances
with pi = 0.4 for IIRPP-F2. The larger percentage optimality gaps for IIRPP-F1
compared to IIRPP-F2 indicate that IIRPP-F1 is harder to solve.
In Tables 5.9 to 5.12, we show the running times for the RPP and IIRPP
formulations. We use all instances in calculating the average RPP running times.
In calculating the average IIRPP running times, we use only those instances that
have feasible solutions for both IIRPP-F1 and IIRPP-F2. The first four columns
give the instance parameters. The fifth column gives the average running time for
the RPP, which is equivalent to the IIRPP with pi = 0. The next two columns give
the average running time for the IIRPP-F1 for the two pi scenarios. The last two
columns give the average running time for the IIRPP-F2 for the two pi scenarios.
In Table 5.9, the average running time is 0.02 seconds for RPP. The average
running time is 0.10 seconds for instances with pi = 0.2 and 0.28 seconds for in-
stances with pi = 0.4 for IIRPP-F1. The average running time is 0.06 seconds for
instances with pi = 0.2 and 0.07 seconds for instances with pi = 0.4 for IIRPP-F2.
In Table 5.10, the average running time is 0.04 seconds for RPP. The average run-
ning time is 1.37 seconds for instances with pi = 0.2 and 107.28 seconds for instances
with pi = 0.4 for IIRPP-F1. The average running time is 0.36 seconds for instances
with pi = 0.2 and 2.75 seconds for instances with pi = 0.4 for IIRPP-F2. In Table
5.11, the average running time is 0.05 seconds for RPP. The average running time
is 261.90 seconds for instances with pi = 0.2 and 286.15 seconds for instances with
pi = 0.4 for IIRPP-F1. The average running time is 21.17 seconds for instances
with pi = 0.2 and 56.82 seconds for instances with pi = 0.4 for IIRPP-F2. In Table
5.12, the average running time is 0.09 seconds for RPP. The average running time is
1076.64 seconds for instances with pi = 0.2 and 1338.47 seconds for instances with
pi = 0.4 for IIRPP-F1. The average running time is 320.72 seconds for instances
with pi = 0.2 and 747.15 seconds for instances with pi = 0.4 for IIRPP-F2. The
larger running times for IIRPP-F1 compared to IIRPP-F2 indicate that IIRPP-F1
is harder to solve.
In Tables 5.13 to 5.16, we show the percentage gap between the RPP optimal
solution (RPP OS) and the best feasible solutions from the IIRPP (IIRPP FS)
formulations. The percentage gap is
$100 \times \frac{\text{IIRPP FS} - \text{RPP OS}}{\text{RPP OS}}$. We use all
instances in calculating the average RPP optimal solutions. In calculating the
average percentage
gap, we use only those instances that have feasible solutions for both IIRPP-F1 and
IIRPP-F2. The first four columns give the instance parameters. The fifth column
gives the average optimal solution for the RPP, which is equivalent to the IIRPP
with pi = 0. The next two columns give the average percentage gap for the IIRPP-
F1 for the two pi scenarios. The last two columns give the average percentage gap
for the IIRPP-F2 for the two pi scenarios.
In Table 5.13, the average RPP optimal solution is 2088.01. The average
percentage gap is 3.86% for instances with pi = 0.2 and 7.41% for instances with
pi = 0.4 for both IIRPP-F1 and IIRPP-F2. In Table 5.14, the average RPP optimal
solution is 3110.55. The average percentage gap is 3.73% for instances with pi = 0.2
and 8.56% for instances with pi = 0.4 for IIRPP-F1. The average percentage gap is
3.73% for instances with pi = 0.2 and 8.45% for instances with pi = 0.4 for IIRPP-
F2. In Table 5.15, the average RPP optimal solution is 4245.20. The average
percentage gap is 3.64% for instances with pi = 0.2 and 7.76% for instances with
pi = 0.4 for IIRPP-F1. The average percentage gap is 3.62% for instances with
pi = 0.2 and 7.73% for instances with pi = 0.4 for IIRPP-F2. In Table 5.16, the
average RPP optimal solution is 5695.42. The average percentage gap is 6.04% for
instances with pi = 0.2 and 8.92% for instances with pi = 0.4 for IIRPP-F1. The
average percentage gap is 3.75% for instances with pi = 0.2 and 8.51% for instances
with pi = 0.4 for IIRPP-F2. The larger percentage gaps for IIRPP-F1 compared to
IIRPP-F2 indicate that IIRPP-F1 is harder to solve.
In Tables 5.17 to 5.20, we compare the RPP based and IIRPP based heuristics
with the IIRPP formulations. We show the percentage gap between the heuristic so-
lutions (H FS) and the IIRPP best feasible solution (IIRPP BFS). For each instance,
the IIRPP best feasible solution is the lower of the two best feasible solutions from
IIRPP-F1 and IIRPP-F2. The percentage gap is
$100 \times \frac{\text{H FS} - \text{IIRPP BFS}}{\text{IIRPP BFS}}$. We compare
the running times of the heuristic solutions with the best running time of the IIRPP
best feasible solution. If the best feasible solutions from IIRPP-F1 and IIRPP-F2
are equal for an instance, we select the lower of the two running times. In calcu-
lating the average heuristic solutions and the average running times, we use only
those instances that have at least a feasible solution from IIRPP-F1 or IIRPP-F2.
The first five columns give the instance parameters. The sixth column gives the
average IIRPP best feasible solution and the seventh column gives the average best
running time. The next three columns give the average percentage gap and the
average running time for RPP-H, and the number of instances out of 10 (NBS) for
which the RPP-H solution is equal to or lower than the corresponding IIRPP best
feasible solution. The next three columns give the average percentage gap and the
average running time for IIRPP-F1-H, and the number of instances out of 10 (NBS)
for which the IIRPP-F1-H solution is equal to or lower than the corresponding
IIRPP best feasible solution. The last three columns give the average percentage
gap and the average running time for IIRPP-F2-H, and the number of instances
out of 10 (NBS) for which the IIRPP-F2-H solution is equal to or lower than the
corresponding IIRPP best feasible solution.
In Table 5.17, the average IIRPP best feasible solution is 2205.38 and the
average best running time is 0.06 seconds. The average percentage gap is 8.30% and
the average running time is 0.03 seconds for RPP-H. The RPP-H solutions are at
least as good as the IIRPP best feasible solutions on an average of 2.94 out of 10
instances. The average percentage gap is 16.82% and the average running time is
0.13 seconds for IIRPP-F1-H. The IIRPP-F1-H solutions are at least as good as the
IIRPP best feasible solutions on an average of 2.00 out of 10 instances. The average
percentage gap is 16.42% and the average running time is 0.07 seconds for IIRPP-
F2-H. The IIRPP-F2-H solutions are at least as good as the IIRPP best feasible
solutions on an average of 1.81 out of 10 instances. In Table 5.18, the average
IIRPP best feasible solution is 3299.45 and the average best running time is 1.56
seconds. The average percentage gap is 9.81% and the average running time is 0.06
seconds for RPP-H. The RPP-H solutions are at least as good as the IIRPP best
feasible solutions on an average of 1.44 out of 10 instances. The average percentage
gap is 20.37% and the average running time is 0.40 seconds for IIRPP-F1-H. The
IIRPP-F1-H solutions are at least as good as the IIRPP best feasible solutions on an
average of 0.50 out of 10 instances. The average percentage gap is 20.38% and the
average running time is 0.14 seconds for IIRPP-F2-H. The IIRPP-F2-H solutions are
at least as good as the IIRPP best feasible solutions on an average of 0.50 out of 10
instances. In Table 5.19, the average IIRPP best feasible solution is 4501.58 and the
average best running time is 76.43 seconds. The average percentage gap is 11.14%
and the average running time is 0.10 seconds for RPP-H. The RPP-H solutions are
at least as good as the IIRPP best feasible solutions on an average of 0.75 out of 10
instances. The average percentage gap is 22.21% and the average running time is
1.10 seconds for IIRPP-F1-H. The IIRPP-F1-H solutions are at least as good as the
IIRPP best feasible solutions on an average of 0.31 out of 10 instances. The average
percentage gap is 21.43% and the average running time is 0.31 seconds for IIRPP-
F2-H. The IIRPP-F2-H solutions are at least as good as the IIRPP best feasible
solutions on an average of 0.25 out of 10 instances. In Table 5.20, the average
IIRPP best feasible solution is 6111.23 and the average best running time is 673.01
seconds. The average percentage gap is 10.26% and the average running time is 0.20
seconds for RPP-H. The RPP-H solutions are at least as good as the IIRPP best
feasible solutions on an average of 0.63 out of 10 instances. The average percentage
gap is 20.86% and the average running time is 2.87 seconds for IIRPP-F1-H. The
IIRPP-F1-H solutions are at least as good as the IIRPP best feasible solutions on an
average of 0.31 out of 10 instances. The average percentage gap is 20.90% and the
average running time is 0.67 seconds for IIRPP-F2-H. The IIRPP-F2-H solutions are
at least as good as the IIRPP best feasible solutions on an average of 0.25 out of 10
instances. IIRPP-F1-H and IIRPP-F2-H produce similar average percentage gaps.
However, IIRPP-F2-H is faster than IIRPP-F1-H. RPP-H is both the fastest and the
most accurate of the three heuristics. On instances with up to 64 nodes, RPP-H has
an average percentage gap of about 10% (8.30% to 11.14%) from the IIRPP best
feasible solution and an average running time of at most 0.20 seconds. Its gap is
roughly half of the average percentage gaps for IIRPP-F1-H and IIRPP-F2-H (16.42%
to 22.21%).
5.3.3 Summary of Results
In Table 5.21, we summarize the results shown in Tables 5.1 to 5.4. As the size
of the grid increases from 25 to 64, the average number of arcs increases from 4.00
to 10.00 in G. The average number of nodes increases from 45.33 to 126.88 and the
average number of arcs increases from 63.01 to 194.76 in G1. The average number
of nodes increases from 27.37 to 72.64 and the average number of arcs increases
from 8.63 to 26.86 in G2. This shows that the graph transformation required to
solve IIRPP-F2 is significantly smaller than the graph transformation required to
solve IIRPP-F1. IIRPP-F1 produced optimal solutions on 580 instances, feasible
solutions on 52 instances, and did not generate a feasible solution to the remaining
eight instances. IIRPP-F2 produced optimal solutions on 621 instances, feasible
solutions on 18 instances, and did not generate a feasible solution to the remaining
one instance. This clearly shows the difference in solution quality between IIRPP-F1
and IIRPP-F2. The smaller size of G2 computationally helps IIRPP-F2 to produce
better solutions compared to IIRPP-F1 within a reasonable time limit.
In Table 5.22, we summarize the results shown in Tables 5.5 to 5.8. As the size
of the grid increases from 25 to 64, the average percentage optimality gap increases
from 0.00% to 2.45% for instances with pi = 0.2 and from 0.00% to 1.72% for
instances with pi = 0.4 for IIRPP-F1. All instances both with pi = 0.2 and pi = 0.4
are optimal up to the grid size of 49 for IIRPP-F2. Instances with pi = 0.2 and
pi = 0.4 have an average percentage optimality gap of 0.50% and 0.78%, respectively,
for the grid size of 64 for IIRPP-F2. This shows that the IIRPP-F2 produces better
quality solutions compared to IIRPP-F1.
In Table 5.23, we summarize the results shown in Tables 5.9 to 5.12. As the
size of the grid increases from 25 to 64, the average running time increases from
0.02 seconds to 0.09 seconds for RPP. The average running time increases from 0.10
seconds to 1076.64 seconds for instances with pi = 0.2 and from 0.28 seconds to
1338.47 seconds for instances with pi = 0.4 for IIRPP-F1. The average running
time increases from 0.06 seconds to 320.72 seconds for instances with pi = 0.2 and
from 0.07 seconds to 747.15 seconds for instances with pi = 0.4 for IIRPP-F2. The
average running time for RPP is significantly smaller than those for IIRPP-F1 and
IIRPP-F2, indicating the difference in complexity between RPP and IIRPP. For the grid
size of 64, IIRPP-F2 is faster than IIRPP-F1 by 3.4 times for instances with pi = 0.2
and 1.8 times for instances with pi = 0.4. This shows that IIRPP-F2 can find solutions
faster than IIRPP-F1 because of the smaller size of G2. Instances with higher pi values
take more time to be solved.
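The speedup factors of 3.4 and 1.8 follow from the average running times for the grid size of 64 in Table 5.23. A minimal sketch of that arithmetic is given below; the dictionary and function names are illustrative.

```python
# Average running times in seconds for the grid size of 64, from Table 5.23.
F1_TIME = {0.2: 1076.64, 0.4: 1338.47}  # IIRPP-F1, keyed by pi
F2_TIME = {0.2: 320.72, 0.4: 747.15}    # IIRPP-F2, keyed by pi


def speedup(p_i):
    """Ratio of IIRPP-F1 to IIRPP-F2 average running time for a given pi."""
    return F1_TIME[p_i] / F2_TIME[p_i]


for p_i in (0.2, 0.4):
    print(f"pi = {p_i}: IIRPP-F2 is {speedup(p_i):.1f} times faster")
# pi = 0.2: IIRPP-F2 is 3.4 times faster
# pi = 0.4: IIRPP-F2 is 1.8 times faster
```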
In Table 5.24, we summarize the results shown in Tables 5.13 to 5.16. As the
size of the grid increases from 25 to 64, the average optimal solution increases from
2088.01 to 5695.42 for RPP. The average percentage gap increases from 3.86% to
6.04% for instances with pi = 0.2 and from 7.41% to 8.92% for instances with pi = 0.4
for IIRPP-F1. The average percentage gap remains about 3.80% for instances with
pi = 0.2 and increases from 7.41% to 8.51% for instances with pi = 0.4 for IIRPP-F2.
The average RPP route length increases by between 4% and 9% to cover
the intersections that need to be photographed. When all instances in a group are
optimally solved by both IIRPP-F1 and IIRPP-F2, the average percentage gap is
equal for IIRPP-F1 and IIRPP-F2. Otherwise, the average percentage gap is lower
for IIRPP-F2 than IIRPP-F1 because for most of the instances IIRPP-F2 produces
better quality solutions compared to IIRPP-F1.
In Table 5.25, we summarize the results shown in Tables 5.17 to 5.20. As the
size of the grid increases from 25 to 64, the average IIRPP best feasible solution
increases from 2205.38 to 6111.23 and the average best running time increases from
0.06 seconds to 673.01 seconds. The average percentage gap remains between 8.30%
and 11.14% and the average running time increases from 0.03 seconds to 0.20 seconds
for RPP-H. The number of RPP-H solutions that are at least as good as the IIRPP
best feasible solutions decreases from 2.94 to 0.63 out of 10 instances. The average
percentage gap remains between 16.82% and 22.21% and the average running time
increases from 0.13 seconds to 2.87 seconds for IIRPP-F1-H. The number of IIRPP-F1-H
solutions that are at least as good as the IIRPP best feasible solutions decreases
from 2.00 to 0.31 out of 10 instances. The average percentage gap remains between
16.42% and 21.43% and the average running time increases from 0.07 seconds to 0.67
seconds for IIRPP-F2-H. The number of IIRPP-F2-H solutions that are at least
as good as the IIRPP best feasible solutions decreases from 1.81 to 0.25 out of
10 instances. IIRPP-F1-H and IIRPP-F2-H have similar average percentage gaps
between 16.50% and 22.50%. However, IIRPP-F2-H is about 3.5 times faster on
average than IIRPP-F1-H indicating that the underlying graph transformation for
IIRPP-F2 is computationally easier than the underlying graph transformation for
IIRPP-F1. RPP-H is about 3.3 times faster on average than IIRPP-F2-H and has
half the average percentage gap of IIRPP-F2-H. On average, RPP-H comes within
11% of the IIRPP best feasible solution in 0.20 seconds for instances with grid size
of 64.
5.4 Conclusions and Future Directions
We introduced an important variant of the RPP involving turns (IIRPP), which
is relevant for road inspections. We gave two formulations, IIRPP-F1 and IIRPP-F2,
based on two different graph transformations. Using running times and optimality
gaps, we showed that IIRPP-F2 is a faster and stronger formulation compared to
IIRPP-F1 because of the smaller size of the transformed graph G2 compared to
G1. With increasing size of the instances, the running times and the number of
instances that could not be solved optimally with the formulations increased and
are substantial for IIRPP-F1. Even IIRPP-F2 could not solve 18 out of 160 instances
with a grid size of 64 to optimality within one hour. We also showed that with a larger
number of intersections to be inspected (larger values of pi), the route lengths increase, and
the instances take longer to solve. We developed an RPP-based heuristic and two
IIRPP-based heuristics. RPP-H performed better than IIRPP-F1-H and IIRPP-F2-H,
coming within 11% of the IIRPP solutions in a fraction of a second
for instances with a grid size of 64. In future work, we expect to develop
branch-and-cut algorithms to solve larger instances optimally within a reasonable amount of
time. We also plan to develop smarter heuristics to reduce the gap between heuristic
solutions and optimal IIRPP solutions. We have some ideas on how to accomplish
this, but leave it for further study.
Table 5.1: Comparison of the transformed graphs G1 and G2 with the original graph
G on 25 nodes.
IIRPP-F1 IIRPP-F2
|V| pa pd |E| |A| pr pi |V1| |E1| |A1| NOS NFS |V2| |E2| |A2| NOS NFS
25 0.05 0.05 37 2 0.2 0.2 36.20 30.60 37.30 10 0 25.00 37.00 2.00 10 0
25 0.05 0.05 37 2 0.2 0.4 49.40 24.50 75.20 10 0 28.30 35.40 8.50 10 0
25 0.05 0.05 37 2 0.4 0.2 42.70 27.60 56.00 10 0 26.60 36.20 5.20 10 0
25 0.05 0.05 37 2 0.4 0.4 61.80 18.10 111.40 10 0 30.40 34.40 12.60 10 0
25 0.05 0.15 33 2 0.2 0.2 35.00 27.30 32.40 10 0 25.40 32.80 2.80 10 0
25 0.05 0.15 33 2 0.2 0.4 44.40 21.90 60.90 10 0 26.00 32.50 4.00 10 0
25 0.05 0.15 33 2 0.4 0.2 40.20 24.80 46.30 10 0 26.40 32.30 4.80 10 0
25 0.05 0.15 33 2 0.4 0.4 59.30 16.60 99.80 10 0 32.40 29.40 16.60 10 0
25 0.15 0.05 33 6 0.2 0.2 35.10 27.80 35.60 10 0 25.40 32.80 6.80 10 0
25 0.15 0.05 33 6 0.2 0.4 44.30 23.30 60.10 10 0 26.90 32.10 9.70 10 0
25 0.15 0.05 33 6 0.4 0.2 42.20 23.90 57.10 10 0 26.20 32.40 8.40 10 0
25 0.15 0.05 33 6 0.4 0.4 60.10 15.90 104.80 10 0 30.20 30.60 16.00 10 0
25 0.15 0.15 29 6 0.2 0.2 33.50 24.70 28.80 10 0 25.30 29.00 6.30 10 0
25 0.15 0.15 29 6 0.2 0.4 43.90 19.00 60.40 10 0 26.30 28.40 8.50 10 0
25 0.15 0.15 29 6 0.4 0.2 41.70 20.40 53.80 10 0 26.50 28.40 8.70 10 0
25 0.15 0.15 29 6 0.4 0.4 55.50 14.30 88.30 10 0 30.60 26.20 17.20 10 0
Average 33 4 45.33 22.54 63.01 10.00 0.00 27.37 31.87 8.63 10.00 0.00
Table 5.2: Comparison of the transformed graphs G1 and G2 with the original graph
G on 36 nodes.
IIRPP-F1 IIRPP-F2
|V| pa pd |E| |A| pr pi |V1| |E1| |A1| NOS NFS |V2| |E2| |A2| NOS NFS
36 0.05 0.05 55 3 0.2 0.2 51.90 46.30 51.30 10 0 37.00 54.50 5.00 10 0
36 0.05 0.05 55 3 0.2 0.4 70.80 37.50 107.30 10 0 41.40 52.30 13.80 10 0
36 0.05 0.05 55 3 0.4 0.2 65.30 39.80 90.60 10 0 39.50 53.30 9.90 10 0
36 0.05 0.05 55 3 0.4 0.4 99.80 24.20 192.40 10 0 48.30 49.00 27.30 10 0
36 0.05 0.15 49 3 0.2 0.2 50.60 40.70 47.80 10 0 36.60 48.70 4.20 10 0
36 0.05 0.15 49 3 0.2 0.4 64.90 33.80 86.50 9 1 39.40 47.30 9.80 10 0
36 0.05 0.15 49 3 0.4 0.2 60.60 35.70 76.00 10 0 37.70 48.20 6.30 10 0
36 0.05 0.15 49 3 0.4 0.4 88.20 23.30 154.20 9 1 45.70 44.30 22.10 10 0
36 0.15 0.05 49 9 0.2 0.2 51.70 41.50 53.90 10 0 37.10 48.50 11.10 10 0
36 0.15 0.05 49 9 0.2 0.4 67.00 33.50 99.00 10 0 39.90 47.10 16.70 10 0
36 0.15 0.05 49 9 0.4 0.2 61.60 35.90 82.70 10 0 38.30 47.90 13.50 10 0
36 0.15 0.05 49 9 0.4 0.4 91.90 22.50 169.20 10 0 45.40 44.60 27.20 10 0
36 0.15 0.15 44 8 0.2 0.2 48.80 37.10 45.30 10 0 36.50 43.80 8.90 10 0
36 0.15 0.15 44 8 0.2 0.4 62.50 30.40 83.00 10 0 38.80 42.70 13.40 10 0
36 0.15 0.15 44 8 0.4 0.2 59.20 31.70 74.40 10 0 37.70 43.20 11.30 10 0
36 0.15 0.15 44 8 0.4 0.4 87.20 19.40 152.70 10 0 44.90 40.00 24.90 10 0
Average 49.25 5.75 67.63 33.33 97.89 9.88 0.13 40.26 47.21 14.09 10.00 0.00
Table 5.3: Comparison of the transformed graphs G1 and G2 with the original graph
G on 49 nodes.
IIRPP-F1 IIRPP-F2
|V| pa pd |E| |A| pr pi |V1| |E1| |A1| NOS NFS |V2| |E2| |A2| NOS NFS
49 0.05 0.05 76 4 0.2 0.2 72.50 63.10 77.70 10 0 50.20 75.40 6.40 10 0
49 0.05 0.05 76 4 0.2 0.4 98.70 50.80 156.00 9 1 55.80 72.60 17.60 10 0
49 0.05 0.05 76 4 0.4 0.2 88.90 54.40 126.60 10 0 52.00 74.50 10.00 10 0
49 0.05 0.05 76 4 0.4 0.4 131.30 35.90 247.20 10 0 63.80 69.00 32.80 10 0
49 0.05 0.15 68 4 0.2 0.2 68.70 24.80 62.70 8 2 50.40 67.30 6.80 10 0
49 0.05 0.15 68 4 0.2 0.4 92.44 16.60 134.44 9 0 54.67 65.22 15.22 9 1
49 0.05 0.15 68 4 0.4 0.2 85.10 27.80 110.10 10 0 53.60 65.80 13.00 10 0
49 0.05 0.15 68 4 0.4 0.4 121.40 23.30 214.40 9 1 60.10 62.60 25.90 10 0
49 0.15 0.05 69 12 0.2 0.2 73.40 23.90 83.40 10 0 50.80 68.10 15.60 10 0
49 0.15 0.05 69 12 0.2 0.4 99.90 15.90 159.10 9 1 56.30 65.80 25.70 10 0
49 0.15 0.05 69 12 0.4 0.2 89.70 24.70 133.10 10 0 53.00 67.20 19.60 10 0
49 0.15 0.05 69 12 0.4 0.4 129.50 19.00 239.60 10 0 65.20 61.60 43.00 10 0
49 0.15 0.15 62 11 0.2 0.2 71.20 20.40 76.50 7 3 50.60 61.20 14.20 10 0
49 0.15 0.15 62 11 0.2 0.4 93.10 14.30 139.40 9 1 53.50 60.10 19.30 10 0
49 0.15 0.15 62 11 0.4 0.2 83.10 46.30 110.50 10 0 52.10 60.50 17.10 10 0
49 0.15 0.15 62 11 0.4 0.4 118.70 37.50 206.50 10 0 61.30 56.40 34.50 10 0
Average 68.75 7.75 94.85 31.17 142.33 9.38 0.56 55.21 65.83 19.80 9.94 0.06
Table 5.4: Comparison of the transformed graphs G1 and G2 with the original graph
G on 64 nodes.
IIRPP-F1 IIRPP-F2
|V| pa pd |E| |A| pr pi |V1| |E1| |A1| NOS NFS |V2| |E2| |A2| NOS NFS
64 0.05 0.05 102 5 0.2 0.2 98.78 39.80 114.67 6 3 65.11 101.44 7.22 7 3
64 0.05 0.05 102 5 0.2 0.4 137.30 24.20 229.10 7 3 73.70 97.30 24.10 8 2
64 0.05 0.05 102 5 0.4 0.2 121.20 40.70 183.20 7 3 69.60 99.20 16.20 10 0
64 0.05 0.05 102 5 0.4 0.4 178.10 33.80 347.20 7 3 84.20 92.10 45.00 7 3
64 0.05 0.15 91 5 0.2 0.2 94.20 35.70 97.30 5 5 65.30 90.40 7.50 8 2
64 0.05 0.15 91 5 0.2 0.4 125.88 23.30 189.25 4 4 71.38 87.50 19.38 8 1
64 0.05 0.15 91 5 0.4 0.2 112.38 41.50 149.38 6 2 69.25 88.38 15.50 10 0
64 0.05 0.15 91 5 0.4 0.4 158.30 33.50 277.10 8 2 80.50 82.90 37.70 10 0
64 0.15 0.05 92 16 0.2 0.2 98.90 35.90 119.90 7 3 66.10 91.10 19.90 9 1
64 0.15 0.05 92 16 0.2 0.4 138.20 22.50 235.50 6 4 74.50 87.00 36.50 7 3
64 0.15 0.05 92 16 0.4 0.2 116.90 37.10 168.70 9 1 70.00 89.20 27.60 10 0
64 0.15 0.05 92 16 0.4 0.4 173.20 30.40 330.80 10 0 82.70 83.30 52.10 10 0
64 0.15 0.15 82 14 0.2 0.2 90.00 31.70 88.40 9 1 66.20 81.00 18.20 10 0
64 0.15 0.15 82 14 0.2 0.4 119.90 19.40 173.10 6 4 71.20 78.80 27.60 9 1
64 0.15 0.15 82 14 0.4 0.2 109.33 63.10 140.78 8 1 69.56 79.44 24.67 10 0
64 0.15 0.15 82 14 0.4 0.4 157.44 50.80 271.78 7 2 83.00 73.22 50.56 9 1
Average 91.75 10 126.88 35.21 194.76 7.00 2.56 72.64 87.64 26.86 8.88 1.06
IIRPP-F1 Gap% IIRPP-F2 Gap%
|V | pa pd pr pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
25 0.05 0.05 0.2 0.00 0.00 0.00 0.00
25 0.05 0.05 0.4 0.00 0.00 0.00 0.00
25 0.05 0.15 0.2 0.00 0.00 0.00 0.00
25 0.05 0.15 0.4 0.00 0.00 0.00 0.00
25 0.15 0.05 0.2 0.00 0.00 0.00 0.00
25 0.15 0.05 0.4 0.00 0.00 0.00 0.00
25 0.15 0.15 0.2 0.00 0.00 0.00 0.00
25 0.15 0.15 0.4 0.00 0.00 0.00 0.00
Average 0.00 0.00 0.00 0.00
Table 5.5: Comparison of the percentage optimality gap between the best feasible
solution and the best lower bound for the IIRPP formulations on 25 nodes.
IIRPP-F1 Gap% IIRPP-F2 Gap%
|V | pa pd pr pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
36 0.05 0.05 0.2 0.00 0.00 0.00 0.00
36 0.05 0.05 0.4 0.00 0.00 0.00 0.00
36 0.05 0.15 0.2 0.00 0.75 0.00 0.00
36 0.05 0.15 0.4 0.00 1.31 0.00 0.00
36 0.15 0.05 0.2 0.00 0.00 0.00 0.00
36 0.15 0.05 0.4 0.00 0.00 0.00 0.00
36 0.15 0.15 0.2 0.00 0.00 0.00 0.00
36 0.15 0.15 0.4 0.00 0.00 0.00 0.00
Average 0.00 0.26 0.00 0.00
Table 5.6: Comparison of the percentage optimality gap between the best feasible
solution and the best lower bound for the IIRPP formulations on 36 nodes.
IIRPP-F1 Gap% IIRPP-F2 Gap%
|V | pa pd pr pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
49 0.05 0.05 0.2 0.00 0.44 0.00 0.00
49 0.05 0.05 0.4 0.00 0.00 0.00 0.00
49 0.05 0.15 0.2 0.75 0.00 0.00 0.00
49 0.05 0.15 0.4 0.00 0.37 0.00 0.00
49 0.15 0.05 0.2 0.00 0.89 0.00 0.00
49 0.15 0.05 0.4 0.00 0.00 0.00 0.00
49 0.15 0.15 0.2 1.27 0.42 0.00 0.00
49 0.15 0.15 0.4 0.00 0.00 0.00 0.00
Average 0.25 0.27 0.00 0.00
Table 5.7: Comparison of the percentage optimality gap between the best feasible
solution and the best lower bound for the IIRPP formulations on 49 nodes.
IIRPP-F1 Gap% IIRPP-F2 Gap%
|V | pa pd pr pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
64 0.05 0.05 0.2 7.45 1.12 1.62 2.85
64 0.05 0.05 0.4 1.88 2.37 0.00 1.05
64 0.05 0.15 0.2 5.78 3.57 2.29 0.69
64 0.05 0.15 0.4 1.22 0.35 0.00 0.00
64 0.15 0.05 0.2 2.27 2.93 0.08 1.19
64 0.15 0.05 0.4 0.34 0.00 0.00 0.00
64 0.15 0.15 0.2 0.52 2.11 0.00 0.48
64 0.15 0.15 0.4 0.16 1.33 0.00 0.00
Average 2.45 1.72 0.50 0.78
Table 5.8: Comparison of the percentage optimality gap between the best feasible
solution and the best lower bound for the IIRPP formulations on 64 nodes.
RPP Time IIRPP-F1 Time IIRPP-F2 Time
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
25 0.05 0.05 0.2 0.03 0.10 0.20 0.06 0.07
25 0.05 0.05 0.4 0.02 0.11 0.09 0.08 0.09
25 0.05 0.15 0.2 0.02 0.13 0.26 0.08 0.05
25 0.05 0.15 0.4 0.02 0.06 0.13 0.05 0.07
25 0.15 0.05 0.2 0.02 0.19 0.04 0.05 0.04
25 0.15 0.05 0.4 0.04 0.05 0.08 0.04 0.06
25 0.15 0.15 0.2 0.02 0.07 0.04 0.04 0.04
25 0.15 0.15 0.4 0.02 0.12 1.35 0.05 0.11
Average 0.02 0.10 0.28 0.06 0.07
Table 5.9: Comparison of the running time for the RPP and IIRPP formulations on
25 nodes.
RPP Time IIRPP-F1 Time IIRPP-F2 Time
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
36 0.05 0.05 0.2 0.05 3.05 1.56 0.85 0.82
36 0.05 0.05 0.4 0.04 0.28 2.07 0.15 0.80
36 0.05 0.15 0.2 0.05 2.71 366.90 0.95 16.83
36 0.05 0.15 0.4 0.04 1.28 361.94 0.15 0.79
36 0.15 0.05 0.2 0.03 0.52 4.75 0.19 0.86
36 0.15 0.05 0.4 0.04 1.05 29.55 0.31 1.23
36 0.15 0.15 0.2 0.03 0.20 0.32 0.12 0.12
36 0.15 0.15 0.4 0.03 1.86 91.16 0.11 0.52
Average 0.04 1.37 107.28 0.36 2.75
Table 5.10: Comparison of the running time for the RPP and IIRPP formulations
on 36 nodes.
RPP Time IIRPP-F1 Time IIRPP-F2 Time
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
49 0.05 0.05 0.2 0.08 21.16 453.35 2.21 81.86
49 0.05 0.05 0.4 0.05 5.21 166.64 0.69 2.59
49 0.05 0.15 0.2 0.05 746.04 211.99 91.92 229.79
49 0.05 0.15 0.4 0.06 145.51 714.47 0.90 126.83
49 0.15 0.05 0.2 0.04 86.25 360.86 0.94 4.43
49 0.15 0.05 0.4 0.04 1.55 9.89 0.53 2.28
49 0.15 0.15 0.2 0.06 1086.17 363.09 71.98 6.12
49 0.15 0.15 0.4 0.03 3.33 8.92 0.21 0.67
Average 0.05 261.90 286.15 21.17 56.82
Table 5.11: Comparison of the running time for the RPP and IIRPP formulations
on 49 nodes.
RPP Time IIRPP-F1 Time IIRPP-F2 Time
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
64 0.05 0.05 0.2 0.11 1352.75 1236.98 807.11 1057.13
64 0.05 0.05 0.4 0.13 1145.00 1600.07 55.99 1183.37
64 0.05 0.15 0.2 0.11 2096.28 2510.66 1016.27 1439.14
64 0.05 0.15 0.4 0.08 1479.86 844.08 23.37 331.58
64 0.15 0.05 0.2 0.10 1243.26 1715.96 427.58 1149.53
64 0.15 0.05 0.4 0.07 439.77 264.95 7.26 59.44
64 0.15 0.15 0.2 0.07 403.24 1499.75 164.61 515.47
64 0.15 0.15 0.4 0.06 452.93 1035.31 63.53 241.57
Average 0.09 1076.64 1338.47 320.72 747.15
Table 5.12: Comparison of the running time for the RPP and IIRPP formulations
on 64 nodes.
RPP OS IIRPP-F1 Diff% IIRPP-F2 Diff%
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
25 0.05 0.05 0.2 1706.97 1.21 4.09 1.21 4.09
25 0.05 0.05 0.4 2368.13 4.05 7.48 4.05 7.48
25 0.05 0.15 0.2 1722.68 6.01 8.31 6.01 8.31
25 0.05 0.15 0.4 2395.98 4.68 9.73 4.68 9.73
25 0.15 0.05 0.2 1659.78 1.66 13.08 1.66 13.08
25 0.15 0.05 0.4 2588.80 0.36 5.28 0.36 5.28
25 0.15 0.15 0.2 1732.24 7.92 2.36 7.92 2.36
25 0.15 0.15 0.4 2529.53 5.03 8.96 5.03 8.96
Average 2088.01 3.86 7.41 3.86 7.41
Table 5.13: Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 25 nodes.
RPP OS IIRPP-F1 Diff% IIRPP-F2 Diff%
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
36 0.05 0.05 0.2 2458.47 2.55 8.15 2.55 8.15
36 0.05 0.05 0.4 3774.18 3.15 5.79 3.15 5.79
36 0.05 0.15 0.2 2517.13 2.38 9.71 2.38 9.70
36 0.05 0.15 0.4 3627.48 4.20 8.96 4.20 8.09
36 0.15 0.05 0.2 2378.62 3.44 9.16 3.44 9.16
36 0.15 0.05 0.4 3772.47 2.58 6.83 2.58 6.83
36 0.15 0.15 0.2 2432.52 4.10 10.44 4.10 10.44
36 0.15 0.15 0.4 3923.57 7.44 9.48 7.44 9.48
Average 3110.55 3.73 8.56 3.73 8.45
Table 5.14: Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 36 nodes.
RPP OS IIRPP-F1 Diff% IIRPP-F2 Diff%
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
49 0.05 0.05 0.2 3285.66 1.38 5.56 1.38 5.56
49 0.05 0.05 0.4 5063.34 1.79 6.02 1.79 6.02
49 0.05 0.15 0.2 3281.47 3.89 4.96 3.88 4.96
49 0.05 0.15 0.4 5124.26 2.84 7.83 2.84 7.83
49 0.15 0.05 0.2 3495.18 4.06 8.45 4.06 8.19
49 0.15 0.05 0.4 5039.98 5.19 10.17 5.19 10.17
49 0.15 0.15 0.2 3604.03 3.52 6.35 3.37 6.35
49 0.15 0.15 0.4 5067.68 6.43 12.75 6.43 12.75
Average 4245.20 3.64 7.76 3.62 7.73
Table 5.15: Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 49 nodes.
RPP OS IIRPP-F1 Diff% IIRPP-F2 Diff%
|V | pa pd pr pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
64 0.05 0.05 0.2 4579.26 15.82 8.46 1.77 10.26
64 0.05 0.05 0.4 6820.47 3.90 6.93 2.88 5.67
64 0.05 0.15 0.2 4526.89 5.87 10.04 3.46 7.95
64 0.05 0.15 0.4 6499.43 5.25 9.14 4.85 9.11
64 0.15 0.05 0.2 4773.34 5.08 9.88 4.67 8.67
64 0.15 0.05 0.4 6888.24 4.71 8.18 4.67 8.18
64 0.15 0.15 0.2 4527.43 4.73 12.08 4.71 11.93
64 0.15 0.15 0.4 6948.28 2.97 6.64 2.97 6.28
Average 5695.42 6.04 8.92 3.75 8.51
Table 5.16: Comparison of the percentage gap between the RPP optimal solution
and the best feasible solutions from the IIRPP formulations on 64 nodes.
Table 5.17: Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 25 nodes.
RPP-H IIRPP-F1-H IIRPP-F2-H
|V | pa pd pr pi BFS Time Diff% Time NBS Diff% Time NBS Diff% Time NBS
25 0.05 0.05 0.2 0.2 1727.56 0.06 5.27 0.04 4 30.26 0.10 0 30.25 0.07 0
25 0.05 0.05 0.2 0.4 1776.76 0.07 8.96 0.04 4 21.77 0.14 0 21.96 0.07 0
25 0.05 0.05 0.4 0.2 2463.96 0.07 4.06 0.04 3 14.10 0.12 2 15.70 0.08 2
25 0.05 0.05 0.4 0.4 2545.19 0.08 11.85 0.04 0 14.49 0.24 1 11.55 0.08 1
25 0.05 0.15 0.2 0.2 1826.13 0.08 3.51 0.03 4 20.10 0.08 3 20.04 0.05 3
25 0.05 0.15 0.2 0.4 1865.83 0.05 10.25 0.04 4 27.27 0.12 2 25.25 0.05 2
25 0.05 0.15 0.4 0.2 2508.12 0.05 7.58 0.03 3 16.25 0.10 2 16.14 0.07 2
25 0.05 0.15 0.4 0.4 2629.07 0.06 14.15 0.03 0 20.74 0.24 1 18.16 0.08 1
25 0.15 0.05 0.2 0.2 1687.28 0.05 6.84 0.03 4 15.14 0.08 4 16.70 0.05 3
25 0.15 0.05 0.2 0.4 1876.81 0.04 7.94 0.03 4 17.44 0.11 1 13.07 0.06 0
25 0.15 0.05 0.4 0.2 2598.18 0.04 5.61 0.04 3 6.66 0.09 2 8.64 0.07 2
25 0.15 0.05 0.4 0.4 2725.48 0.05 18.97 0.04 1 10.25 0.20 4 12.11 0.08 4
25 0.15 0.15 0.2 0.2 1869.50 0.04 4.80 0.03 4 15.66 0.07 3 13.70 0.05 3
25 0.15 0.15 0.2 0.4 1773.12 0.04 7.75 0.03 3 13.76 0.09 4 16.06 0.04 3
25 0.15 0.15 0.4 0.2 2656.84 0.05 3.22 0.03 4 15.88 0.13 1 14.81 0.08 1
25 0.15 0.15 0.4 0.4 2756.21 0.11 11.98 0.03 2 9.42 0.21 2 8.63 0.07 2
Average 2205.38 0.06 8.30 0.03 2.94 16.82 0.13 2.00 16.42 0.07 1.81
Table 5.18: Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 36 nodes.
RPP-H IIRPP-F1-H IIRPP-F2-H
|V | pa pd pr pi BFS Time Diff% Time NBS Diff% Time NBS Diff% Time NBS
36 0.05 0.05 0.2 0.2 2521.05 0.85 9.86 0.07 2 26.06 0.17 1 26.92 0.11 1
36 0.05 0.05 0.2 0.4 2658.84 0.78 15.12 0.07 3 25.17 0.33 0 25.65 0.12 0
36 0.05 0.05 0.4 0.2 3893.12 0.16 10.33 0.08 0 12.48 0.35 0 13.00 0.16 0
36 0.05 0.05 0.4 0.4 3992.67 0.80 12.35 0.08 0 9.44 0.98 0 7.95 0.22 1
36 0.05 0.15 0.2 0.2 2576.93 0.95 11.88 0.07 1 42.17 0.23 0 40.69 0.14 0
36 0.05 0.15 0.2 0.4 2761.33 16.83 9.39 0.07 0 23.06 0.31 0 21.93 0.10 0
36 0.05 0.15 0.4 0.2 3779.81 0.15 8.38 0.06 1 13.85 0.29 2 14.96 0.16 1
36 0.05 0.15 0.4 0.4 3920.81 0.79 11.00 0.06 0 18.49 0.83 0 17.93 0.22 0
36 0.15 0.05 0.2 0.2 2460.46 0.19 5.22 0.06 5 23.60 0.17 0 23.59 0.09 0
36 0.15 0.05 0.2 0.4 2596.55 0.86 16.57 0.05 1 17.33 0.27 1 17.95 0.10 1
36 0.15 0.05 0.4 0.2 3869.84 0.51 6.04 0.06 3 15.71 0.30 0 14.68 0.14 1
36 0.15 0.05 0.4 0.4 4030.12 1.26 11.61 0.06 0 15.01 0.75 1 14.03 0.16 1
36 0.15 0.15 0.2 0.2 2532.25 0.12 4.52 0.05 3 21.27 0.15 1 26.09 0.08 0
36 0.15 0.15 0.2 0.4 2686.45 0.12 9.19 0.06 2 33.93 0.29 0 33.91 0.13 0
36 0.15 0.15 0.4 0.2 4215.41 0.11 5.70 0.05 0 14.52 0.32 1 12.48 0.14 1
36 0.15 0.15 0.4 0.4 4295.48 0.51 9.75 0.05 2 13.77 0.67 1 14.35 0.18 1
Average 3299.45 1.56 9.81 0.06 1.44 20.37 0.40 0.50 20.38 0.14 0.50
Table 5.19: Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 49 nodes.
RPP-H IIRPP-F1-H IIRPP-F2-H
|V | pa pd pr pi BFS Time Diff% Time NBS Diff% Time NBS Diff% Time NBS
49 0.05 0.05 0.2 0.2 3331.04 2.49 8.21 0.12 2 34.73 0.60 0 34.60 0.25 0
49 0.05 0.05 0.2 0.4 3468.47 81.83 14.83 0.12 0 27.87 0.85 0 26.78 0.24 0
49 0.05 0.05 0.4 0.2 5154.11 4.44 5.24 0.11 1 19.44 1.08 0 20.91 0.44 0
49 0.05 0.05 0.4 0.4 5368.08 165.09 18.08 0.10 0 18.04 2.51 0 17.16 0.51 0
49 0.05 0.15 0.2 0.2 3408.81 91.92 9.07 0.10 1 32.36 0.35 0 29.79 0.21 0
49 0.05 0.15 0.2 0.4 3560.67 566.82 14.16 0.10 2 34.33 0.77 0 31.51 0.24 0
49 0.05 0.15 0.4 0.2 5269.80 83.14 8.78 0.10 0 20.23 0.96 0 20.97 0.49 0
49 0.05 0.15 0.4 0.4 5525.44 133.39 19.31 0.11 0 19.63 2.24 0 18.95 0.46 0
49 0.15 0.05 0.2 0.2 3637.25 6.88 7.13 0.10 1 32.23 0.51 0 31.75 0.27 0
49 0.15 0.05 0.2 0.4 3781.32 4.42 14.93 0.10 1 25.03 1.02 0 23.56 0.26 1
49 0.15 0.05 0.4 0.2 5301.63 0.52 5.03 0.10 2 9.85 0.74 2 10.57 0.32 1
49 0.15 0.05 0.4 0.4 5552.69 2.08 11.43 0.09 0 10.95 2.14 1 10.55 0.35 1
49 0.15 0.15 0.2 0.2 3725.42 71.98 6.30 0.11 2 22.79 0.39 0 22.35 0.19 0
49 0.15 0.15 0.2 0.4 3833.02 6.95 11.71 0.11 0 21.15 0.80 0 21.02 0.20 0
49 0.15 0.15 0.4 0.2 5393.60 0.21 8.83 0.08 0 11.25 0.67 1 9.40 0.25 1
49 0.15 0.15 0.4 0.4 5713.95 0.69 15.24 0.09 0 15.51 1.93 1 12.97 0.33 0
Average 4501.58 76.43 11.14 0.10 0.75 22.21 1.10 0.31 21.43 0.31 0.25
Table 5.20: Comparison of the RPP based and IIRPP based heuristics with the
IIRPP formulations on 64 nodes.
RPP-H IIRPP-F1-H IIRPP-F2-H
|V | pa pd pr pi BFS Time Diff% Time NBS Diff% Time NBS Diff% Time NBS
64 0.05 0.05 0.2 0.2 4677.95 1086.42 5.48 0.22 1 30.12 1.22 0 30.58 0.57 0
64 0.05 0.05 0.2 0.4 4966.62 1059.66 13.32 0.20 0 23.45 2.61 0 23.54 0.59 0
64 0.05 0.05 0.4 0.2 7017.13 81.13 11.22 0.23 0 18.72 2.46 0 18.40 0.92 0
64 0.05 0.05 0.4 0.4 7206.99 1183.42 16.25 0.25 0 14.11 7.22 0 17.04 1.30 0
64 0.05 0.15 0.2 0.2 4683.73 1016.27 6.16 0.22 3 27.80 0.94 0 27.34 0.41 0
64 0.05 0.15 0.2 0.4 4871.86 1435.14 13.97 0.21 1 20.94 1.85 2 23.73 0.45 1
64 0.05 0.15 0.4 0.2 6798.94 381.67 7.96 0.20 0 23.19 2.11 0 22.63 0.68 0
64 0.05 0.15 0.4 0.4 7091.78 711.80 14.22 0.20 0 15.53 4.87 1 15.10 0.76 0
64 0.15 0.05 0.2 0.2 4996.10 426.70 9.67 0.23 0 20.44 1.08 0 20.85 0.42 0
64 0.15 0.05 0.2 0.4 5187.06 1149.29 12.78 0.22 0 24.60 2.65 0 25.48 0.56 0
64 0.15 0.05 0.4 0.2 7210.03 78.18 7.50 0.19 0 14.30 2.31 1 13.96 0.77 1
64 0.15 0.05 0.4 0.4 7451.56 61.45 15.97 0.19 0 12.86 6.74 0 11.33 0.76 1
64 0.15 0.15 0.2 0.2 4740.75 164.61 4.99 0.17 2 32.17 0.91 0 31.73 0.48 0
64 0.15 0.15 0.2 0.4 5067.47 874.15 11.54 0.18 1 25.28 1.79 0 24.38 0.49 0
64 0.15 0.15 0.4 0.2 7253.00 479.97 7.01 0.16 1 17.87 1.97 0 17.13 0.66 0
64 0.15 0.15 0.4 0.4 8558.74 578.24 6.13 0.17 1 12.33 5.25 1 11.20 0.87 1
Average 6111.23 673.01 10.26 0.20 0.63 20.86 2.87 0.31 20.90 0.67 0.25
Table 5.21: Summary of the comparison of the transformed graphs G1 and G2 with
the original graph G.
IIRPP-F1 IIRPP-F2
|V | |E| |A| |V 1| |E1| |A1| NOS NFS |V 2| |E2| |A2| NOS NFS
25 33 4 45.33 22.54 63.01 10.00 0.00 27.37 31.87 8.63 10.00 0.00
36 49.25 5.75 67.63 33.33 97.89 9.88 0.13 40.26 47.21 14.09 10.00 0.00
49 68.75 7.75 94.85 31.17 142.33 9.38 0.56 55.21 65.83 19.80 9.94 0.06
64 91.75 10 126.88 35.21 194.76 7.00 2.56 72.64 87.64 26.86 8.88 1.06
IIRPP-F1 Gap% IIRPP-F2 Gap%
|V | pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
25 0.00 0.00 0.00 0.00
36 0.00 0.26 0.00 0.00
49 0.25 0.27 0.00 0.00
64 2.45 1.72 0.50 0.78
Table 5.22: Summary of the comparison of the percentage optimality gap between
the best feasible solution and the best lower bound for the IIRPP formulations.
RPP Time IIRPP-F1 Time IIRPP-F2 Time
|V | pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
25 0.02 0.10 0.28 0.06 0.07
36 0.04 1.37 107.28 0.36 2.75
49 0.05 261.90 286.15 21.17 56.82
64 0.09 1076.64 1338.47 320.72 747.15
Table 5.23: Summary of the comparison of the running time for the RPP and IIRPP
formulations.
RPP OS IIRPP-F1 Diff% IIRPP-F2 Diff%
|V | pi = 0 pi = 0.2 pi = 0.4 pi = 0.2 pi = 0.4
25 2088.01 3.86 7.41 3.86 7.41
36 3110.55 3.73 8.56 3.73 8.45
49 4245.20 3.64 7.76 3.62 7.73
64 5695.42 6.04 8.92 3.75 8.51
Table 5.24: Summary of the comparison of the percentage gap between the RPP
optimal solution and the best feasible solutions from the IIRPP formulations.
Table 5.25: Summary of the comparison of the RPP based and IIRPP based
heuristics with the IIRPP formulations.
RPP-H IIRPP-F1-H IIRPP-F2-H
|V | BFS Time Diff% Time NBS Diff% Time NBS Diff% Time NBS
25 2205.38 0.06 8.30 0.03 2.94 16.82 0.13 2.00 16.42 0.07 1.81
36 3299.45 1.56 9.81 0.06 1.44 20.37 0.40 0.50 20.38 0.14 0.50
49 4501.58 76.43 11.14 0.10 0.75 22.21 1.10 0.31 21.43 0.31 0.25
64 6111.23 673.01 10.26 0.20 0.63 20.86 2.87 0.31 20.90 0.67 0.25
Chapter 6: Concluding Remarks
In this dissertation, we made an effort to bridge the gap between academic
research and its practical applicability in the field of logistics. Each chapter
addressed a different real-world logistics problem. We used several decision-making
techniques, including data-driven optimization and statistical modeling, to produce
practical and easy-to-implement solutions for these real-world problems. This should
help industry practitioners improve their decision making.
In Chapter 2, we used an iterative methodology to generate robust vehicle
routes to read uncertain RFID meters from a distance. Using a large, real-world
data set, we demonstrated that it is practically possible to model an inherently
stochastic problem with a continuous source of incoming data using deterministic
optimization techniques while updating the unknown variables before every
decision point. This avoids using an intractable stochastic optimization model and
brings together mathematical programming models and statistical modeling. We
demonstrated that the choice of statistical model to update the meter reading
probabilities is an important consideration. We showed that Bayesian updating works
in a practical setup and that it avoids the drawbacks of regression. Furthermore,
we developed a hierarchical Bayesian updating model specific to the meter reading
problem to take into account the heterogeneity of the signal transmission behavior
of individual meters. We hope to extend this work using Bayesian decision theory
and route optimization to help utility companies maximize their information gain.
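To illustrate what updating a meter reading probability before each decision point can look like, the sketch below uses a textbook conjugate Beta-Bernoulli update for a single meter. This is an assumption made for illustration and not the hierarchical model of Chapter 2: the class name, the uniform prior, and the read history are all hypothetical, and the dependence of the read probability on distance is ignored.

```python
from dataclasses import dataclass


@dataclass
class MeterBelief:
    """Beta(alpha, beta) belief about one meter's read probability (hypothetical)."""
    alpha: float = 1.0  # prior pseudo-count of successful reads
    beta: float = 1.0   # prior pseudo-count of missed reads

    def update(self, was_read: bool) -> None:
        """Conjugate update after one read attempt."""
        if was_read:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        """Posterior mean of the read probability."""
        return self.alpha / (self.alpha + self.beta)


belief = MeterBelief()
for outcome in (True, True, False, True):  # made-up read history
    belief.update(outcome)
print(round(belief.mean, 3))  # 4 / 6 -> 0.667
```

The posterior mean could then feed a deterministic routing model at the next decision point, which is the general pattern the paragraph above describes.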
In Chapter 3, we showed that route balance is very important, and it directly
affects the total operating and delivery costs incurred in the CVRP. Under random
traffic conditions, route times increase from the original solution produced by the
routing algorithm using an objective that minimizes the total route time for a fleet of
vehicles. When starting with routes that are already imbalanced, longer routes will
take even more time to complete in the presence of heavy traffic. Routes that are
not balanced directly and substantially affect the total cost because longer routes
lead to a higher chance of a driver working more than the regular work hours. Thus,
when routes are not balanced, a delivery company may pay more overtime wages
to its drivers. In a practical setting, we can apply a simple randomized routing
algorithm repeatedly to obtain many solutions. We can then select a solution with
a small total route length and a small route length standard deviation to achieve
low operating and delivery costs. It might be helpful for delivery companies to use
quantifiable metrics to determine the types of instances for which reducing route
variability would have a larger impact on total cost.
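The selection step described above can be sketched as follows. The rule used here (keep the solutions whose total length is within a tolerance of the shortest total, then take the one with the smallest route-length standard deviation) is one possible reading and is an assumption, as are the function name, the 2% tolerance, and the made-up route lengths. The candidate solutions stand in for the output of repeated runs of a randomized routing algorithm.

```python
import statistics


def pick_balanced(solutions, tolerance=0.02):
    """Pick the solution with the smallest route-length standard deviation among
    those whose total length is within `tolerance` of the shortest total."""
    best_total = min(sum(routes) for routes in solutions)
    near_best = [r for r in solutions if sum(r) <= (1 + tolerance) * best_total]
    return min(near_best, key=statistics.pstdev)


# Made-up route lengths for three candidate solutions (one list per solution).
candidates = [
    [120.0, 60.0, 50.0],  # total 230, very imbalanced
    [80.0, 78.0, 75.0],   # total 233, balanced
    [90.0, 85.0, 80.0],   # total 255, balanced but long
]
print(pick_balanced(candidates))  # [80.0, 78.0, 75.0]
```

A weighted combination of total length and standard deviation would be another reasonable selection rule; the tolerance filter is used here only because it keeps the two criteria separate.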
In Chapter 4, we showed that route lengths can be estimated using regression
models for a hard-to-solve routing problem such as the CETSP. Route length
estimation is useful when a route length must be estimated quickly before a decision
is made. We estimated the route length for the
Steiner zone variable neighborhood search (SZVNS) heuristic on CETSP instances
with node locations generated randomly and all customers having the same radius
for the service regions. We established the performance of regression models using
different quantitative and qualitative statistical measures. The independent
variables in the regression models captured the geometric properties of an instance, the
spread of the customer service regions, and the number of Steiner zones. The
variable for the number of Steiner zones captured the feature of an instance exploited
by the SZVNS heuristic. Similar regression models could be built for estimating the
route lengths using different heuristics by having variables that capture the specific
feature of a heuristic. It would also be important to have fast route length prediction
models for CETSP instances with node locations that are not generated randomly.
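As a rough illustration of regression-based route length estimation, the sketch below fits a one-variable least-squares line that predicts route length from the number of Steiner zones. It is a deliberate simplification: the models summarized above use several independent variables, and the data points, function names, and the choice of a single predictor here are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(slope, intercept, x_new):
    """Estimated route length for an instance with x_new Steiner zones."""
    return intercept + slope * x_new


# Synthetic training data: number of Steiner zones and heuristic route length.
zones = [5, 10, 15, 20]
lengths = [110.0, 205.0, 290.0, 395.0]

slope, intercept = fit_line(zones, lengths)
print(slope, intercept, predict(slope, intercept, 12))
```

Once fitted, a prediction costs a single multiplication and addition, which is what makes this kind of model attractive when an estimate is needed before the routing heuristic itself can be run.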
In Chapter 5, we introduced the IIRPP, a variant of the RPP involving turns,
to help local governments in road inspections. Along with street segments,
intersections need to be inspected for proper road quality management. It is important
to make at least one straight or left turn at each intersection. We formulated two
integer programs, IIRPP-F1 and IIRPP-F2, based on two different graph
transformations to solve the IIRPP optimally. The computational experiments showed that
IIRPP-F2 was faster and able to produce better quality solutions within a
reasonable amount of time compared to IIRPP-F1 because of the significantly smaller size
of its transformed graph. We developed three heuristics for the IIRPP. RPP-H was
the best performing heuristic compared to IIRPP-F1-H and IIRPP-F2-H. RPP-H
used a simple modification to the optimal RPP routes and produced good quality
solutions within a very short amount of time. In the future, we hope to develop
branch-and-cut algorithms to solve larger instances optimally within a reasonable
amount of time, and smarter heuristics to reduce the gap between heuristic solutions
and optimal IIRPP solutions.
From the real-world logistics problems studied in this dissertation, we learned
that it is very important to look at a problem from the perspective of a decision
maker trying to solve it in practice. The methodologies to solve practical problems
should be designed with applicability in mind. It is also important to use large
data sources to generate novel insights and to build data-driven, decision-making
tools that improve business decision making in real-world settings.
217
Bibliography
Albert JH, Chib S (1993) Bayesian analysis of binary and polychotomous response data.
Journal of the American Statistical Association 88(422):669–679.
Ávila T, Corberán Á, Plana I, Sanchis JM (2016) A new branch-and-cut algorithm for the
generalized directed rural postman problem. Transportation Science 50(2):750–761.
Beardwood J, Halton JH, Hammersley JM (1959) The shortest path through many points.
Mathematical Proceedings of the Cambridge Philosophical Society 55(4):299–327.
Behdani B, Smith JC (2014) An integer-programming-based approach to the close-enough
traveling salesman problem. INFORMS Journal on Computing 26(3):415–432.
Benavent E, Soler D (1999) The directed rural postman problem with turn penalties.
Transportation Science 33(4):408–418.
Bodin L, Levy L (1991) The arc partitioning problem. European Journal of Operational
Research 53:393–401.
Capacitated Vehicle Routing Problem Library (2014).
http://vrp.atd-lab.inf.puc-rio.br/index.php/en/.
Carrabs F, Cerrone C, Cerulli R, Gaudioso M (2017) A novel discretization scheme for the
close-enough traveling salesman problem. Computers & Operations Research 78:163–171.
Cavdar B, Sokol J (2015) A distribution-free TSP tour length estimation model for random
graphs. European Journal of Operational Research 243(2):588–598.
Cerrone C, Dussault B, Wang X, Golden B, Wasil E (2019) A two-stage solution approach
for the directed rural postman problem with turn penalties. European Journal of
Operational Research 272:754–765.
Chien (1992) Operational estimators for the length of a traveling salesman tour. Computers
& Operations Research 19(6):469–478.
Christofides N, Eilon S (1969a) An algorithm for the vehicle-dispatching problem. Journal
of the Operational Research Society 20(3):309–318.
Christofides N, Eilon S (1969b) Expected distances in distribution problems. Journal of
the Operational Research Society 20(4):437–443.
Clarke G, Wright JW (1964) Scheduling of vehicles from a central depot to a number of
delivery points. Operations Research 12(4):568–581.
Clossey J, Laporte G, Soriano P (2001) Solving arc routing problems with turn penalties.
Journal of the Operational Research Society 52(4):433–439.
Colorado Department of Regulatory Agencies (2018) Public Utilities Commission.
Accessed May 4, 2018,
https://www.pueblo.us/DocumentCenter/View/6596/Your-Rights-as-an-Electric-or-Natural-Gas-Utility-Customer.
Corberán Á, Plana I, Sanchis JM (2014) The rural postman problem on directed, mixed,
and windy graphs. Arc Routing: Problems, Methods, and Applications (SIAM) 101–127.
Coutinho WP, Subramanian A, do Nascimento RQ, Pessoa AA (2016) A branch-and-bound
algorithm for the close-enough traveling salesman problem. INFORMS Journal
on Computing 28(4):752–765.
Defryn C, Sörensen K (2017) A fast two-level variable neighborhood search for the
clustered vehicle routing problem. Computers & Operations Research 83:78–94.
Dijkstra EW (1959) A note on two problems in connexion with graphs. Numerische
Mathematik 1(1):269–271.
Domencich T, McFadden DL (1975) Urban Travel Demand: A Behavioral Analysis (Elsevier).
Dong J, Yang N, Chen M (2007) Heuristic approaches for a TSP variant: The automatic
meter reading shortest tour problem. Extending the Horizons: Advances in Computing,
Optimization, and Decision Technologies (Springer Verlag) 145–163.
Dumitrescu A, Mitchell J (2003) Approximation algorithms for TSP with neighborhoods
in the plane. Journal of Algorithms 48:135–159.
Eglese R, Golden B, Wasil E (2014) Route optimization for meter reading and salt
spreading. Arc Routing: Problems, Methods, and Applications (SIAM) 300–320.
Frederickson GN, Hecht MS, Kim CE (1978) Approximation algorithms for some routing
problems. SIAM Journal on Computing 7(2):178–193.
Golden B, Alt F (1979) Interval estimation of a global optimum for large combinatorial
problems. Naval Research Logistics Quarterly 26(1):69–77.
Golden B, Wasil E, Kelly J, Chao IM (1998) The impact of metaheuristics on solving the
vehicle routing problem: Algorithms, problem sets, and computational results. Fleet
Management and Logistics (Springer) 33–56.
Groër C, Golden B, Wasil E (2009) The balanced billing cycle vehicle routing problem.
Networks 54(4):243–254.
Gulczynski D, Heath J, Price C (2006) The close enough traveling salesman problem:
A discussion of several heuristics. Perspectives in Operations Research: Papers in
Honor of Saul Gass' 80th Birthday (Springer Verlag) 271–283.
Hà MH (2012) Modélisation et résolution de problèmes généralisés de tournées de
véhicules, Ph.D. dissertation, Automatique, Ecole des Mines de Nantes, France.
Hà MH, Bostel N, Langevin A, Rousseau L-M (2014) Solving the close-enough arc routing
problem. Networks 63(1):107–118.
Hansen P, Mladenović N, Todosijević R, Hanafi S (2017) Variable neighborhood search:
Basics and variants. EURO Journal on Computational Optimization 5(3):423–454.
Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their
applications. Biometrika 57:97–109.
Hindle A, Worthington D (2004) Models to estimate average route lengths in different
geographical environments. Journal of the Operational Research Society 55(6):662–666.
Illinois Administrative Code (2018) Title 83: Public Utilities, Section 280.90:
Estimated Bills. Accessed May 4, 2018,
ftp://www.ilga.gov/JCAR/AdminCode/083/083002800F00900R.html.
Irving, Texas - Code of Ordinances (2018) Chapter 31: Public Utilities, Article
II: Natural Gas, Section 31-11: Estimated bills prohibited. Accessed May 4, 2018,
https://library.municode.com/tx/Irving/codes/code_of_ordinances?nodeId=PTIITHCO_CH31PUUT_ARTIINAGA_S31-11ESBIPRCOEXTESE.
Kara I, Guden H, Koc ON (2012) New formulations for the generalized traveling salesman
problem. Proceedings of the 6th international conference on Applied Mathematics,
Simulation, Modelling 60–65.
Kenyon AS, Morton DP (2003) Stochastic vehicle routing with random travel times.
Transportation Science 37(1):69–82.
Kwon O, Golden B, Wasil E (1995) Estimating the length of the optimal TSP tour:
An empirical study using regression and neural networks. Computers & Operations
Research 22(10):1039–1046.
Laporte G, Louveaux F, Mercure H (1992) The vehicle routing problem with stochastic
travel times. Transportation Science 26(3):161–170.
Levy L (2018) Personal communication. RouteSmart Technologies, Inc.
Levy L, Sniezek J, Cox B (2002) Utility meter route management and optimization using
GIS, Tech. report, Electric and Gas Utilities User Group, Coeur d'Alene, Idaho.
Louviere JJ, Hensher DA, Swait JD (2000) Stated choice methods: Analysis and
applications (Cambridge University Press).
Lum O, Golden B, Wasil E (2018) An open-source desktop application for generating
arc-routing benchmark instances. INFORMS Journal on Computing 30(2):361–370.
Marin J-M, Robert CP (2014) Bayesian Essentials with R (Springer).
Mennell WK (2009) Heuristics for solving three routing problems: Close-enough
traveling salesman problem, close-enough vehicle routing problem, sequence-dependent
team orienteering problem, Ph.D. dissertation, Decision, Operations & Information
Technologies, University of Maryland, College Park, USA.
Mennell WK, Golden B, Wasil E (2011) A Steiner-zone heuristic for solving the
close-enough traveling salesman problem. Operations Research, Computing, and Homeland
Defense (INFORMS) 162–183.
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of
state calculations by fast computing machines. Journal of Chemical Physics
21:1087–1091.
Michigan Department of Labor and Economic Growth (2018) Public Service
Commission, Consumer Standards and Billing Practices, Part 4. Accessed May 4, 2018,
https://www.consumersenergy.com/~/media/CE/Documents/mpsc-billing-rules.ashx?la=en.
Nicola D, Vetschera R, Dragomir A (2019) Total distance approximations for routing
solutions. Computers & Operations Research 102:67–74.
Press SJ (2003) Subjective and Objective Bayesian Statistics: Principles, Models, and
Applications (John Wiley & Sons).
Renaud A, Absi N, Feillet D (2017) The stochastic close-enough arc routing problem.
Networks 69(2):205–221.
Rossi PE, Allenby GM, McCulloch R (2005) Bayesian Statistics and Marketing (John
Wiley & Sons).
Rostami B, Desaulniers G, Errico F, Lodi A (2017) The vehicle routing problem with
stochastic and correlated travel times. Data Science for Real-Time Decision-Making
(Canada Excellence Research Chair) 1–51.
Shuttleworth R, Golden B, Smith S, Wasil E (2008) Advances in meter reading: Heuristic
solution of the close enough traveling salesman problem over a street network. The
Vehicle Routing Problem: Latest Advances and New Challenges (Springer Verlag)
487–501.
Silberholz J, Golden B (2007) The generalized traveling salesman problem: A new genetic
algorithm approach. Extending the Horizons: Advances in Computing, Optimization,
and Decision Technologies (Springer Verlag) 165–181.
Steiniger S, Hunter AJS (2013) The 2012 free and open source GIS software map - A
guide to facilitate research, development, and adoption. Computers, Environment
and Urban Systems 39:136–150.
Stern D, Dror M (1979) Routing electric meter readers. Computers & Operations Research
6:209–223.
Wang X, Golden B, Wasil E (2019) A Steiner zone variable neighborhood search heuristic
for the close-enough traveling salesman problem. Computers & Operations Research
101:200–219.
Wassan N, Wassan N, Nagy G, Salhi S (2017) The multiple trip vehicle routing problem
with backhauls: Formulation and a two-level variable neighbourhood search.
Computers & Operations Research 78:454–467.
Yang Z, Xiao M-Q, Ge Y-W, Feng D-L, Zhang L, Song H-F, Tang X-L (2018) A double-loop
hybrid algorithm for the traveling salesman problem with arbitrary neighbourhoods.
European Journal of Operational Research 265(1):65–80.
Yuan B, Orlowska M, Sadiq S (2007) On the optimal robot routing problem in wireless
sensor networks. IEEE Transactions on Knowledge and Data Engineering
19(9):1252–1261.