ABSTRACT

Title of dissertation: COLUMN GENERATION IN INFEASIBLE PREDICTOR-CORRECTOR METHODS FOR SOLVING LINEAR PROGRAMS

Stacey O. Nicholls, Doctor of Philosophy, 2009

Dissertation directed by: Professor Dianne P. O'Leary, Department of Computer Science, Institute for Advanced Computer Studies

Primal-dual interior-point methods (IPMs) are distinguished for their exceptional theoretical properties and computational behavior in solving linear programming (LP) problems. Consider solving the primal-dual LP pair using an IPM such as a primal-dual affine-scaling method, Mehrotra's Predictor-Corrector method (the most commonly used IPM to date), or Potra's Predictor-Corrector method. The bulk of the computation in the process stems from the formation of the normal equation matrix $AD^2A^T$, where $A \in \mathbb{R}^{m \times n}$ and $D^2 = S^{-1}X$ is a diagonal matrix. In cases when $n \gg m$, we propose to reduce this cost by incorporating a column generation scheme into existing infeasible IPMs for solving LPs. In particular, we solve an LP problem by an iterative approach in which we select a "small" subset of the constraints at each iteration, with the aim of achieving both feasibility and optimality. Rather than $n$ constraints, we work with $k = |Q| \in [m,n]$ constraints at each iteration, where $Q$ is an index set consisting of the $k$ most nearly active constraints at the current iterate. The cost of forming the matrix $A_Q D_Q^2 A_Q^T$ is thereby reduced from $\Theta(m^2 n)$ to $\Theta(m^2 k)$ operations, where $k$ is relatively small compared to $n$. Although numerical results show an occasional increase in the number of iterations, the total operation count and the time to solve the LP using our algorithms are, in most cases, smaller than for other "reduced" LP algorithms.

COLUMN GENERATION IN INFEASIBLE PREDICTOR-CORRECTOR METHODS FOR SOLVING LINEAR PROGRAMS

by Stacey O. Nicholls

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2009

Advisory Committee:
Professor Dianne P. O'Leary, Chair/Advisor
Professor André L. Tits
Professor Eric V. Slud
Professor Konstantina Trivisa
Professor Michael Laskowski

DEDICATION

This dissertation is dedicated to my sons: Tyler Adedapo Akanji Jagun and Sean Adekunle Ayinde Jagun

ACKNOWLEDGMENTS

My journey to completing this dissertation came with many obstacles and challenges. However, there were a great number of people who supported my efforts to help make this thesis possible. I am truly blessed to have had the opportunity to learn from, work with, and be mentored by those who made my graduate experience one that I will always cherish.

First and foremost, I would like to thank my advisor, Dr. Dianne P. O'Leary. Words cannot express how grateful I am to have worked with such an extraordinarily brilliant woman. She gave me very challenging, yet interesting projects to work on over these past few years, and her advice was invaluable. She was always available for me when I needed help and/or advice on issues related to and beyond my research. She has given me enthusiasm for the area of Numerical Optimization, an eye for detail in working mathematical proofs and describing computational results, and an appreciation for all of the time and energy she devotes to her students. She is an exceptional professor and mentor, and I aspire to be just as influential to my students and their careers as she has been to me.

I would also like to thank Dr. André L. Tits.
This thesis would not have been possible without his remarkable theoretical ideas and expertise. He has always given me helpful suggestions and advice for my research. He was a valuable asset to me in this process, and I thank him for all he has offered.

Thank you to Dr. André L. Tits, Dr. Eric V. Slud, Dr. Konstantina Trivisa, and Dr. Michael Laskowski for agreeing to serve on my thesis committee. I greatly appreciate the time and energy you dedicated to reviewing the thesis and providing suggestions for revisions.

During my years at UMCP, I was fortunate to receive funding from a number of sources. UMCP offered not only an opportunity for me to gain valuable information from professors, but also an opportunity to teach undergraduate mathematics courses. I taught lecture and recitation classes as a teaching assistant for four years. I must thank Dr. William Schildknecht for giving me the opportunity to teach a variety of mathematics courses at UMCP. These opportunities opened doors for teaching jobs at other universities, and I am grateful for this teaching support. I would also like to thank Dr. Johnetta Davis for her assistance throughout my graduate years. She was pivotal in helping me obtain a GAANN fellowship, for which I am truly thankful. Dr. Davis also served as one of the PIs for the Promise Program, a program in which I gained valuable insight about the process to the doctorate. I would like to thank Dr. Michael Boyle for the opportunity to work under the VIGRE grant. This support provided me with an opportunity to work with Dr. Jeffrey Cooper and Dr. Tobias von Petersdorff in designing MATLAB projects for undergraduate students in computational mathematics using concepts in numerical analysis, differential equations, and optimization. I was also funded under the Department of Energy grant DEFG0204ER25655, "Interior Point Algorithms for Optimization Problems," and I would like to thank Dr. Dianne P. O'Leary and Dr. André L. Tits for this support. Lastly, I would like to thank Dr. Daniel Symancyk and the rest of the hiring committee from the Mathematics Department at Anne Arundel Community College for believing in my ability to work as a full-time mathematics professor while finishing my graduate studies.

The staff at UMCP has been extremely helpful throughout my graduate years. I would like to especially thank Ms. Alverda McCoy for all of her help, support, and advice with the AMSC program.

Many of my most memorable graduate school moments are due to a group of friends that: (1) helped make my transition from undergraduate to graduate school a smooth one, (2) offered exceptional advice to help me reach certain milestones in my program, and (3) provided a friendship that will last a lifetime. I particularly would like to give a big thanks to my friend and former roommate, Ms. Joycelyn Wilson, for her immeasurable support throughout the years. I would also like to thank my graduate mentor Dr. Tasha Inniss, for her guidance in my early graduate school years, and Dr. Monica Jackson, for her help as a fellow teaching assistant. I greatly appreciate the help and support of all of my friends. Thank you!

My decision to pursue graduate study was largely due to the exceptional faculty in the Mathematics Department at Spelman College. I would like to thank Dr. Sylvia Bozeman for encouraging me to apply to graduate school. I would also like to thank Dr. Nagambal Shah and Dr. Yewande Olubummo for their constant support during my years of study as a math major at Spelman.
I could not have completed my graduate program without the invaluable support of my family. To my fiancé, Adebola "Taheel" Jagun, thank you for your continual encouragement over the years. Our sons, Tyler and Sean Jagun, are a blessing from God, and I am so thankful to have them. They made my dissertation progress even more of a challenge (many sleepless nights), and yet they were my motivation for finishing the degree. I am also truly grateful for the endless love and support of my parents, Herbert and Jacqueline Nicholls. Thank you for always believing in me! To my sister, Cynthia Nicholls, you are a special person in my life, and I want to thank you for just being you. Thank you to all of my family members for their encouraging words and support. It is truly impossible to remember all, and I apologize to those I've inadvertently left out.

Lastly, I faced obstacles and challenges at just about every stage of my graduate program. God made a way for me to overcome these obstacles with the support of the individuals mentioned here. Thank you God for ALL you have provided me!

TABLE OF CONTENTS

List of Tables
List of Figures
1 Introduction
2 Background
  2.1 Optimality Conditions for the Primal-Dual LP
  2.2 The Column Generation Heuristic
    2.2.1 Overview
    2.2.2 Initial Point
    2.2.3 Selection of Q
    2.2.4 Update Strategy
  2.3 Summary
3 A Reduced Primal-Dual Affine-Scaling (redPDAS) Algorithm
  3.1 Affine-Scaling Methods
    3.1.1 Overview
    3.1.2 The Primal and Dual Affine-Scaling Methods
    3.1.3 The Primal-Dual Affine-Scaling Method
    3.1.4 Algorithms
  3.2 A Primal-Dual Affine-Scaling (PDAS) Algorithm
  3.3 The redPDAS Algorithm
  3.4 Convergence Analysis of redPDAS
  3.5 Summary
4 A Reduced Mehrotra Predictor-Corrector (redMPC) Algorithm
  4.1 MPC Method
    4.1.1 Overview
    4.1.2 Centering-Corrector Direction
  4.2 A MPC Algorithm
  4.3 The redMPC Algorithm
  4.4 Convergence Analysis of redMPC
  4.5 Summary
5 Numerical Experiments
  5.1 Overview
    5.1.1 Test Problems
  5.2 redPDAS Experiments
  5.3 redMPC Experiments
  5.4 Discussion
  5.5 Summary
6 Conclusions and Further Study
A Topics in Numerical Optimization
  A.1 Steepest-Descent Method
  A.2 Orthogonal Projection Matrix
B The redPC Algorithm
  B.1 Background
  B.2 Potra's Predictor-Corrector Algorithm
  B.3 The redPC Algorithm
Bibliography

LIST OF TABLES

3.1 A summary of the differences between the redPDAS algorithm and a modified version of the algorithm adapted from Tits et al. [24] that proves to be locally and quadratically convergent.
4.1 A summary of the differences between the redMPC algorithm and a modified version of the algorithm adapted from Winternitz et al. [28] that proves to be locally and quadratically convergent.
5.1 Random problems (TAW1-TAW5) from Tits et al. [24]. Problems with constraint matrix A* generate each column of A with randn(m,1) and then normalize to make each column have norm one.
5.2 Random test problems (RAND1-RAND5) with specified m and n and known optimal solution.
5.3 Netlib problems from [2] with specified m and n and known optimal solution.

LIST OF FIGURES

5.1 Performance of the redPDAS algorithm against the PDAS on test problem TAW4 with $m = 50$, $n = 20000$, and $|Q| \in [3m, ubnd]$.
5.2 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.3 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW2 (random [normal] constraints) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.4 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.5 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW4 (RandomLP) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.6 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW5 (SIPND) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.7 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND1 with $m = 5$, $n = 100$, and $ubnd = \bar{k}$ where $15 \le \bar{k} \le 100$.
5.8 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND2 with $m = 12$, $n = 500$, and $ubnd = \bar{k}$ where $36 \le \bar{k} \le 500$.
5.9 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND3 with $m = 24$, $n = 780$, and $ubnd = \bar{k}$ where $72 \le \bar{k} \le 780$.
5.10 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND4 with $m = 58$, $n = 1004$, and $ubnd = \bar{k}$ where $174 \le \bar{k} \le 1004$.
5.11 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND5 with $m = 75$, $n = 2016$, and $ubnd = \bar{k}$ where $225 \le \bar{k} \le 2016$.
5.12 Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD1 with $m = 77$, $n = 760$, and $ubnd = \bar{k}$ where $231 \le \bar{k} \le 760$.
5.13 Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD6 with $m = 147$, $n = 1350$, and $ubnd = \bar{k}$ where $441 \le \bar{k} \le 1350$.
5.14 Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD8 with $m = 397$, $n = 2750$, and $ubnd = \bar{k}$ where $1191 \le \bar{k} \le 2750$.
5.15 Performance of the redMPC algorithm against the MPC using test problem TAW4 with $m = 50$, $n = 20000$, and $|Q| \in [3m, ubnd]$ where $ubnd$ is fixed to $n$. The average time (in seconds) to solve 50 randomly generated problems from TAW4 is shown over varying values of C ranging from $10^{-16}$ to $10^{-1}$.
5.16 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.17 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW2 (random [normal] constraints) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.18 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.19 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW4 (RandomLP) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.20 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW5 (SIPND) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.21 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND1 with $m = 5$, $n = 100$, and $ubnd = \bar{k}$ where $15 \le \bar{k} \le 100$.
5.22 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND2 with $m = 12$, $n = 500$, and $ubnd = \bar{k}$ where $36 \le \bar{k} \le 500$.
5.23 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND3 with $m = 24$, $n = 780$, and $ubnd = \bar{k}$ where $72 \le \bar{k} \le 780$.
5.24 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND4 with $m = 58$, $n = 1004$, and $ubnd = \bar{k}$ where $174 \le \bar{k} \le 1004$.
5.25 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND5 with $m = 75$, $n = 2016$, and $ubnd = \bar{k}$ where $225 \le \bar{k} \le 2016$.
5.26 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD1 with $m = 77$, $n = 760$, and $ubnd = \bar{k}$ where $231 \le \bar{k} \le 760$.
5.27 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD6 with $m = 147$, $n = 1350$, and $ubnd = \bar{k}$ where $441 \le \bar{k} \le 1350$.
5.28 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD8 with $m = 397$, $n = 2750$, and $ubnd = \bar{k}$ where $1197 \le \bar{k} \le 2750$.
A.1 An n-dimensional vector x is projected onto the null space of A [denoted N(A)] by the orthogonal projection matrix $P = I - A^T(AA^T)^{-1}A$.

Chapter 1
Introduction

For nearly 40 years, the simplex method was deemed the most efficient algorithm for solving linear programming (LP) problems, a class of problems involving the minimization or maximization of a linear function subject to linear constraints. Although the simplex algorithm can be shown to exhibit an exponential number of iterations in the worst case, on practical problems the number of iterations is usually linear in the size of the problem. Furthermore, the operation count per iteration is also rather low; an $m \times m$ linear system is solved at each iteration, where m is the number of constraints in the primal LP. (The primal LP is composed of a system of m equality constraints in n variables. The dual LP is the "companion" to the primal LP, in which the roles of constraints and variables are reversed; therefore, the dual LP is composed of a system of n constraints in m variables.)

The work of Narendra Karmarkar [15], however, spawned a new class of methods, interior-point methods (IPMs), capable of outperforming the simplex method on large-scale LPs and (in general) producing polynomial complexity results. Background information on simplex and interior-point methods can be found in [5], [20], and [29]. Unlike the simplex method, IPMs solve a $(2n+m) \times (2n+m)$ linear system at each iteration, where most of the computation stems from solving an $m \times m$ system with normal equation matrix $AD^2A^T$ (A is an $m \times n$ constraint matrix and D is a positive diagonal matrix). If $n \gg m$, the operation count per iteration can be quite large. Nonetheless, their low iteration count and speed make IPMs the method of choice on large-scale LPs. Today, much attention is focused on primal-dual IPMs [29]; they stand out for their exceptional theoretical properties and computational behavior. Most primal-dual IPM codes today are based on Mehrotra's Predictor-Corrector method (MPC) [17]. MPC combines the essence of the primal-dual framework with ingenious heuristics.

In this thesis, we devise several "reduced" variants of existing infeasible interior-point algorithms for solving primal-dual LP pairs. (An "infeasible algorithm" refers to an algorithm that accepts an initial solution that does not satisfy all of the constraints of an LP.) Specifically, we examine the incorporation of a column generation scheme into algorithms which, through a sequence of iterations, aim to achieve both feasibility and optimality.
In our context, a column generation approach refers to considering only a subset of the columns of A, or equivalently, a subset of the dual constraints, at each iteration. This reduces the operation count per iteration and, if the subsets are chosen well, we generally observe no increase in the total number of iterations. Our algorithms can outperform other column generation algorithms in total operation count and CPU time.

A vast number of papers have been written on the convergence of infeasible interior-point algorithms. Kojima et al. [16], Potra [22], and Achache et al. [1] devise infeasible algorithms which improve feasibility and find optimal solutions for the primal-dual pair. The algorithms also demonstrate both global convergence and polynomial complexity.

Column generation algorithms within interior-point methods have been analyzed as well. Algorithms for "building up" the number of constraints at each iteration have been developed by Dantzig et al. [6], Ye [31], Hertog et al. [11], and Goffin et al. [9]. Starting with a small subset of dual (working) constraints, an iterative search for the optimal solution is conducted until there is a constraint blocking the progression to this solution. Depending on the algorithm, the violated constraint or a constraint generated by the algorithm is added to the set, and the process repeats until the optimal solution has been found. Ye [32] considers adding more than one constraint to the working set at each iteration. In all of these algorithms, constraints are never removed from the working set in the current iteration. Ye [30], on the other hand, developed a "build-down" scheme in which a constraint is discarded from the full constraint set if it is detected to be nonbinding in the optimal set. Convergence properties for the aforementioned algorithms are provided, along with polynomial complexity analyses for some. Both the "build-up" and "build-down" approaches were combined and analyzed in Hertog et al. [12], where the algorithm was shown to terminate in polynomial time. More recently, a constraint reduction approach was developed in a primal-dual framework by Tits-Absil-Woessner [24]. (Here we work with the primal and dual LPs simultaneously, as opposed to working with them individually; the primal-dual framework involves solving the optimality conditions to satisfy both the primal and dual LPs.) The authors consider replacing the normal equation matrix for the primal-dual LP with a "reduced" matrix $A_Q D_Q^2 A_Q^T$ by solving a subset of $m \le |Q| \le n$ dual constraints at each iteration, where Q is a fixed index set associated with the most "promising" dual constraints at the current iteration. Global convergence and local quadratic convergence are proved for their "reduced" affine-scaling algorithm.

The two algorithms we develop in this thesis are redPDAS and redMPC. The redPDAS algorithm is an extension of the Tits-Absil-Woessner algorithm [24] to handle dual infeasible iterates, and the redMPC algorithm is a "reduced" version of Mehrotra's Predictor-Corrector method. We allow both primal and dual infeasibility, as opposed to Tits-Absil-Woessner, who allow only primal infeasibility in the PDAS case.

The remainder of the thesis focuses on the two algorithms, their convergence, and their performance. The column generation scheme used within our algorithms is presented in Chapter 2.
In the beginning of Chapter 3, a general discussion of affine-scaling methods is given. The remainder of that chapter includes a description of the reduced primal-dual affine-scaling (redPDAS) algorithm, along with some convergence results. We also show that, with certain modifications, the algorithm is globally and locally convergent. A general discussion of Mehrotra's Predictor-Corrector method is given at the beginning of Chapter 4. The reduced Mehrotra Predictor-Corrector (redMPC) algorithm is presented afterward, along with some convergence results. With modifications, the redMPC algorithm is also shown to be globally and locally convergent. Numerical experiments are presented for the redPDAS and redMPC algorithms in Chapter 5, as well as a discussion of their results. Chapter 6 provides the conclusion to the dissertation and directions for future work.

Chapter 2
Background

Interior-point methods (IPMs) for a linear program (LP) generate points which lie in the interior of the region defined by the constraints of the problem. A special class of IPMs, primal-dual methods, exhibit exceptional theoretical properties and computational performance. This dissertation examines the effects of incorporating a column generation scheme into primal-dual IPM algorithms. To understand the algorithm framework in the following chapters, we present some background information in this chapter.

The chapter is divided into two sections: (1) the optimality conditions for a primal-dual LP and (2) column generation. In Section 2.1, we begin with the standard notation used to express the primal and dual LPs and their optimality conditions. Next, we show how a primal-dual IPM algorithm finds a solution to these optimality conditions. In Section 2.2, column generation is discussed in the context of our algorithm framework. We introduce the notation associated with our algorithms and explain how the index set Q associated with the reduced (dual) constraint set is formulated. Finally, we explain how to preserve a strictly interior point (or solution) within our algorithms.

2.1 Optimality Conditions for the Primal-Dual LP

Consider the primal LP in standard form,

$$\min_x\ \{c^Tx : Ax = b,\ x \ge 0\}, \qquad (2.1)$$

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$ are the known coefficient matrix and vectors, respectively, and $x \in \mathbb{R}^n$ is a vector of unknown variables. All vectors in this thesis are column vectors. It is assumed that A has full row rank, $b, c \neq 0$, and $n \gg m$. Solving the primal LP is equivalent to solving the dual LP,

$$\max_y\ \{b^Ty : A^Ty \le c\}.$$

An alternative expression of the dual problem results by incorporating a nonnegative slack vector s into the dual constraints:

$$\max_{y,s}\ \{b^Ty : A^Ty + s = c,\ s \ge 0\}. \qquad (2.2)$$

This last expression is often preferred due to its ease of implementation in IPMs. Together, problems (2.1) and (2.2) are referred to as the primal-dual pair.

The Karush-Kuhn-Tucker (KKT) [3] or optimality conditions for the primal-dual pair (2.1) and (2.2) can be viewed as a mapping F from $\mathbb{R}^{2n+m}$ to $\mathbb{R}^{2n+m}$ such that

$$F(x,y,s) = \begin{bmatrix} Ax - b \\ A^Ty + s - c \\ XSe \end{bmatrix} = 0, \qquad (x,s) \ge 0, \qquad (2.3)$$

where $X, S \in \mathbb{R}^{n \times n}$ are diagonal matrices constructed from the vectors x and s, respectively, and $e \in \mathbb{R}^n$ is the vector of all ones [29]. Any vector x that satisfies the conditions $Ax - b = 0$ and $x \ge 0$ is a feasible point for the primal LP. A dual feasible point is any $(y,s)$ satisfying the conditions $A^Ty + s - c = 0$ and $s \ge 0$. Thus the KKT conditions say that any point that is both primal feasible and dual feasible and satisfies $XSe = 0$ is optimal for both problems.
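As a concrete illustration of these conditions, the following is a minimal NumPy sketch (ours, not part of the original development; the helper name kkt_residual is hypothetical) that evaluates the residual map F of (2.3) at a given point:

--------------------------------------------------------------
import numpy as np

def kkt_residual(A, b, c, x, y, s):
    """Return F(x,y,s) = (Ax - b, A^T y + s - c, XSe) stacked as one vector."""
    r_p = A @ x - b          # primal residual
    r_d = A.T @ y + s - c    # dual residual
    comp = x * s             # XSe: componentwise complementarity products
    return np.concatenate([r_p, r_d, comp])

# (x, y, s) is optimal when this residual is zero and (x, s) >= 0.
--------------------------------------------------------------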
According to duality theory (the theory that explains the relationship between the primal and dual LPs, discussed, for example, in [20], [4]), if x and $(y,s)$ are feasible for their respective problems, then $b^Ty \le c^Tx$. That is, the dual objective function value is a lower bound for the primal objective function value. The difference between the primal and dual objective functions, $|c^Tx - b^Ty|$, is known as the duality gap. At optimality, $x^*$ solves (2.1), $(y^*,s^*)$ solves (2.2), and the duality gap is zero (i.e., $b^Ty^* = c^Tx^*$). Furthermore, the condition $X^*S^*e = 0$ implies that at least one of the pair $x^*_j$ or $s^*_j$ must be zero for all $j = 1,2,\ldots,n$. Since the nonzero components of x and s occur in complementary positions, the condition $XSe = 0$ in (2.3) is referred to as the complementary slackness condition.

In this dissertation, we focus on a class of primal-dual interior-point methods which find solutions to (2.1) and (2.2) by applying variants of Newton's method to (2.3). Let $z = (x,y,s)$ represent the current approximation to the solution and assume that it satisfies the nonnegativity conditions $x > 0$ and $s > 0$. If z is not optimal, Newton's method forms a linear model tangent to F at z and finds a search direction $\Delta z = (\Delta x, \Delta y, \Delta s) \in \mathbb{R}^{2n+m}$ by solving

$$J(A,x,s)\,\Delta z = -F(x,y,s),$$

where J is the Jacobian of F. This gives the linear system

$$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta s \end{bmatrix} = -\begin{bmatrix} r_p \\ r_d \\ XSe \end{bmatrix}, \qquad (2.4)$$

where $r_p = Ax - b$ and $r_d = A^Ty + s - c$ are the primal and dual residuals, respectively. By eliminating $\Delta s$ from the linear system, we obtain the following equivalent system (also known as the "augmented system"):

$$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = -\begin{bmatrix} r_p \\ -Xr_d + XSe \end{bmatrix},$$

with $\Delta s = -r_d - A^T\Delta y$. By further eliminating $\Delta x$, we have the set of equations that form a major component of primal-dual IPM algorithms:

$$AD^2A^T\Delta y = -r_p - AD^2r_d + Ax,$$
$$\Delta s = -r_d - A^T\Delta y, \qquad (2.5)$$
$$\Delta x = -x - D^2\Delta s,$$

where $D^2 = S^{-1}X$. The equation $AD^2A^T\Delta y = -r_p - AD^2r_d + Ax$ can be equivalently written as $AD^2A^T\Delta y = b - AD^2r_d$, since $r_p = Ax - b$. This system of equations is known as the "normal equations" because they are the normal equations for a linear least squares problem with coefficient matrix $DA^T$. Since it is assumed that A has full row rank and D is a positive diagonal matrix, the coefficient matrix $AD^2A^T$ is symmetric and positive definite. As a result, the coefficient matrix can be factored using the Cholesky factorization as $AD^2A^T = LL^T$, where L is lower triangular [20], and this makes computing solutions of the normal equations easy.

Once the search direction $\Delta z$ is computed, the updated solution $z^+$ is given by $z^+ = z + \alpha\Delta z$, where $\alpha \in (0,1]$ is a parameter designed to prevent $z^+$ from violating the condition $(x^+,s^+) > 0$. The Newton iteration continues until the duality gap falls below a small defined tolerance. This is the basis for primal-dual IPMs, but there are many variants, two of which we discuss in Chapters 3 and 4.

Another way to interpret finding a solution of (2.4) is that we calculate $\Delta z$ to satisfy the primal constraints,

$$A(x + \Delta x) = b, \quad\text{i.e.,}\quad A\Delta x = b - Ax, \qquad (2.6)$$

the dual constraints,

$$A^T(y + \Delta y) + (s + \Delta s) = c, \quad\text{i.e.,}\quad A^T\Delta y + \Delta s = c - A^Ty - s, \qquad (2.7)$$

and the complementary slackness condition,

$$(x_j + \Delta x_j)(s_j + \Delta s_j) = 0,\ \forall j, \quad\text{i.e.,}\quad x_js_j + x_j\Delta s_j + s_j\Delta x_j + \Delta x_j\Delta s_j = 0,\ \forall j,$$
$$\text{or}\quad X\Delta s + S\Delta x = -XSe - \Delta X\Delta Se. \qquad (2.8)$$

Equations (2.6) and (2.7) are precisely the first two blocks of equations in the linear system (2.4). The equations in (2.8) differ from the last block of equations in (2.4) by a nonlinear term, $-\Delta X\Delta Se$. To obtain an approximate solution for $\Delta z$, we ignore the nonlinear term; this is permitted since $\Delta X$ and $\Delta S$ are small in the limit and their product is even smaller. The result is applying Newton's method to (2.3) and solving the linear system (2.4).
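A minimal sketch of the normal-equations solve (2.5), assuming dense data and that SciPy is available (the routine name newton_step is ours), follows; note that the Cholesky factorization of $AD^2A^T$ does exactly the work described above:

--------------------------------------------------------------
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def newton_step(A, b, c, x, y, s):
    r_d = A.T @ y + s - c           # dual residual
    d2 = x / s                      # diagonal of D^2 = S^{-1}X
    M = (A * d2) @ A.T              # normal matrix AD^2A^T: Theta(m^2 n) to form
    dy = cho_solve(cho_factor(M), b - A @ (d2 * r_d))  # Cholesky solve
    ds = -r_d - A.T @ dy
    dx = -x - d2 * ds
    return dx, dy, ds
--------------------------------------------------------------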
2.2 The Column Generation Heuristic

2.2.1 Overview

The bulk of the computation in solving for the search direction $\Delta z$ stems from the $m \times m$ matrix $AD^2A^T$. For dense matrices, it takes $\Theta(m^2n)$ operations to form the matrix and another $\Theta(m^3/3)$ to factor it. On large-scale LPs where $n \gg m$, the incorporation of a column generation scheme into the primal-dual framework can reduce this computational effort.

Originally devised by Gilmore et al. [8] in the early 1960's, column generation is a technique used to solve an LP by generating the columns of A as needed. This is a beneficial tool because most of the columns of A have no effect on the optimal solution. From duality theory, assuming nondegeneracy, there are exactly m zero components of s corresponding to exactly m positive components of x at the optimal solution (see [4], [5], or [20], for example). If $x^*$ solves the primal LP and $(y^*,s^*)$ solves the dual LP, we define $Q^*$ to be the set of indices associated with those m positive components of $x^*$ (or m zero components of $s^*$). Since the remaining $n-m$ components of $x^*$ are zero, the "reduced" primal LP formed by just the columns associated with the index set $Q^*$ has the same solution as the standard primal LP. Furthermore, since the columns of A are simply the rows of $A^T$, we can incorporate a column generation scheme into our primal-dual framework by working with just a subset of the dual constraints at each iteration. We try to choose this subset Q so that $Q^* \subseteq Q$.

Our strategy is as follows: given any point $(x,s) > 0$ and any integer k with $m \le k \le n$, we define the set $Q \subseteq N = \{1,2,\ldots,n\}$ to be the indices j of the dual constraints $a_j^Ty \le c_j$ associated with the k largest $x_j/s_j$ ratios. We define $A_Q \in \mathbb{R}^{m \times k}$ to be the submatrix of A comprised of the columns $a_j$ with $j \in Q$. The vectors $(s_Q, x_Q) \in \mathbb{R}^k$ are comprised of the components $s_j$ and $x_j$, respectively, with $j \in Q$. If k is relatively small compared to n, it takes only $\Theta(m^2k)$ operations to form the matrix $A_QS_Q^{-1}X_QA_Q^T$. If we can use this matrix in place of $AS^{-1}XA^T$, our algorithm will significantly reduce the amount of work per iteration and possibly take only a fraction of the time to solve a particular LP problem.

We let $\hat{\Delta}z = (\hat{\Delta}x, \hat{\Delta}y, \hat{\Delta}s) \in \mathbb{R}^{2n+m}$ be the approximate Newton direction at z computed by using $A_QS_Q^{-1}X_QA_Q^T$ in place of $AS^{-1}XA^T$ in (2.5). The components of the direction associated with the neglected constraints are set to zero. The directions $\Delta z_Q = (\Delta x_Q, \Delta y, \Delta s_Q)$ and $\hat{\Delta}z_Q = (\hat{\Delta}x_Q, \hat{\Delta}y, \hat{\Delta}s_Q)$ have dimension $2k+m$; these vectors are obtained by deleting the components $\Delta x_j$, $\Delta s_j$, $\hat{\Delta}x_j$, and $\hat{\Delta}s_j$ with $j \notin Q$.

Our goal is to prove that applying a column generation heuristic to primal-dual IPM algorithms for solving LPs provides excellent theoretical properties and computational results. In the following three subsections, we discuss the main ingredients of the reduced algorithms: the initial point, the choice of Q, and the update strategy.
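Before turning to those ingredients, the following minimal sketch (helper name ours) makes the cost reduction concrete: given the k largest $x_j/s_j$ ratios, forming $A_QD_Q^2A_Q^T$ touches only k columns of A, hence the $\Theta(m^2k)$ count claimed above:

--------------------------------------------------------------
import numpy as np

def reduced_normal_matrix(A, x, s, k):
    ratios = x / s
    Q = np.argsort(ratios)[::-1][:k]   # indices of the k largest x_j/s_j ratios
    AQ = A[:, Q]                       # m-by-k submatrix A_Q
    MQ = (AQ * ratios[Q]) @ AQ.T       # A_Q D_Q^2 A_Q^T: Theta(m^2 k) to form
    return MQ, Q
--------------------------------------------------------------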
2.2.2 Initial Point

There are a number of techniques for generating initial points (see [11], [14], or [17], for example). Our column generation approach requires $(x,s) > 0$ at each iteration. To achieve this, we use the strategy of Mehrotra [17] and describe his technique here. Assume the columns of A are linearly independent and $b, c \neq 0$, and set

$$\hat{x} = A^T(AA^T)^{-1}b, \qquad \hat{y} = (AA^T)^{-1}Ac, \qquad \hat{s} = c - A^T\hat{y}.$$

The point $(\hat{x}, \hat{y}, \hat{s})$ satisfies the primal and dual equality constraints. However, in order to satisfy the condition $(x,s) > 0$, Mehrotra defines $\delta_x = \max(-1.5\min_j\{\hat{x}_j\}, 0)$ and $\delta_s = \max(-1.5\min_j\{\hat{s}_j\}, 0)$ and then defines

$$\hat{\delta}_x = \delta_x + 0.5\left[\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{s}_j + \delta_s)}\right], \qquad \hat{\delta}_s = \delta_s + 0.5\left[\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{x}_j + \delta_x)}\right],$$

where e is the vector of ones. Then, if $x^0 = \hat{x} + \hat{\delta}_xe$, $y^0 = \hat{y}$, and $s^0 = \hat{s} + \hat{\delta}_se$, we have an initial point

$$z^0 = (x^0, y^0, s^0) \qquad (2.9)$$

such that $(x^0, s^0) > 0$. The equality constraints are not necessarily satisfied, so $z^0$ might be infeasible.

2.2.3 Selection of Q

The choice of Q is based on the $k \in [m,n]$ largest $x_j/s_j$ ratios at each iteration. Assuming nondegeneracy, there will be exactly m components of s converging to zero as the optimal solution is approached. Since the corresponding m components of x are not converging to zero, there will be exactly m $x_j/s_j$ ratios tending to infinity as we tend to the optimal solution. Meanwhile, the remaining $n-m$ components of x converge to zero while the corresponding $n-m$ components of s do not; as a result, the remaining $n-m$ ratios $x_j/s_j$ tend to zero as the optimal solution is approached. Since Q is the index set associated with the $k \in [m,n]$ largest $x_j/s_j$ ratios at each iteration, when we are close enough to the solution we are guaranteed to find an optimal solution to the "reduced" dual LP based solely on the rows of $A^T$ associated with Q. This implies that we can solve the normal equations with coefficient matrix $A_QD_Q^2A_Q^T$. We describe our selection of Q below. Let C satisfy $0 < C < 1$, and let lbnd and ubnd be lower and upper bounds on the size of Q.

--------------------------------------------------------------
Selection of Q
1. Sort the ratios $x_j/s_j$ from largest to smallest, and let $j_\ell$ denote the index of the $\ell$th largest ratio;
2. $Q = \{j_\ell : x_{j_\ell}/s_{j_\ell} \ge C \cdot (x_{j_1}/s_{j_1}),\ \ell = 1,2,\ldots,n\}$; $k = |Q|$;
3. Modify Q if necessary so that $lbnd \le k \le ubnd$:
     if $k \ge ubnd$
        $Q = \{j_\ell : \ell = 1,2,\ldots,ubnd\}$;
     elseif $k \le lbnd$
        $Q = \{j_\ell : \ell = 1,2,\ldots,lbnd\}$;
     end(if)
     $k = |Q|$;
--------------------------------------------------------------

In steps 1 and 2 of our algorithm, the ratios $x_j/s_j$ are ordered from largest to smallest and selected if they exceed the product of C and the value of the largest ratio at the current iterate. The indices associated with the selected ratios make up the set Q. The quantity C is an experimental constant chosen based on algorithm performance (see Sections 5.2 and 5.3 for details). Step 3 guarantees that Q contains at least 3m indices at each iteration (lbnd = 3m); this is a critical component of the convergence analysis in the following chapters. Computationally, by ensuring Q is of an "appropriate" size relative to m and n, we can reduce the risk of instabilities in the performance of the algorithm. Since $n \gg m$, a value of k close to m is likely to cause the algorithm to cycle through an extremely large number of constraint choices before reaching the optimal solution. A value of k close to n in early iterations is acceptable, since steps 2 and 3 of our subroutine will significantly reduce k as the optimal solution is approached. This poses the question, "For which range of ratios k/n, where $k \in (m,n)$, are the 'reduced' algorithms most efficient for solving LPs?" This and other topics will be discussed in Chapter 5.
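A minimal NumPy sketch of this three-step subroutine (the function name select_Q is ours; as noted above, lbnd = 3m in our algorithms, and we assume $lbnd \le ubnd \le n$):

--------------------------------------------------------------
import numpy as np

def select_Q(x, s, C, lbnd, ubnd):
    ratios = x / s
    order = np.argsort(ratios)[::-1]                 # step 1: j_1, j_2, ... by decreasing ratio
    k = int(np.sum(ratios >= C * ratios[order[0]]))  # step 2: ratios above C times the largest
    k = min(max(k, lbnd), ubnd)                      # step 3: enforce lbnd <= k <= ubnd
    return order[:k]                                 # the index set Q, with |Q| = k
--------------------------------------------------------------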
algorithms for the primal and dual variables as well as the dual residuals. In a general primal-dual IPM algorithm, the primal and dual variables are updated by the equations: x+ = x+?p?x, y+ = y+?d?y, (2.10) s+ = s+?d?s where ?p and ?d are the step lengths between 0 and 1 along the primal and dual 15 search directions, respectively. The dual residuals are updated as follows: r+d = ATy+ +s+ ?c = AT parenleftbigy+?d?yparenrightbig+parenleftbigs+?d?sparenrightbig?c = parenleftbigATy+s?cparenrightbig+?dparenleftbigAT?y+ ?sparenrightbig = rd +?dparenleftbigAT?y?rd ?AT?yparenrightbig = (1??d)rd. We used the relations rd = ATy+s?c and ?s= ?rd?AT?y in the fourth line. In the ?reduced? algorithms, we set ?sj = 0, ?j /?Q. By setting r+d =ATy+ +s+?c, we have (rd)+j = ? ??? ??? (1??d)(rd)j if j ?Q (rd)j +?daTj ?y if j /?Q , where aj is the jth column of A. Since (x0,s0) > 0, it can be shown that the components of the initial dual residual r0d are strictly positive and have the same value. Therefore, 0 < (1??d)(rd)0j < (rd)0j, ?j ?Q, since ?d ? (0,1). If an index j remains in Q for every iteration, then we have 0 < (1??d)(rd)ij < (rd)ij, for j ?Q, ?i. (2.11) In this case, the components of the dual residual in Q at every iteration remain strictly positive and decrease to zero as the solution is approached. If at any iter- ation, the jth component of residual is not associated with Q, it will be updated 16 as (rd)+j = (rd)j +?daTj ?y. Here the jth component of the dual residual does not monotonically decrease to zero. This could potentially cause the ?reduced? algo- rithms to fail to satisfy dual feasibility. We resolve this issue by incorporating a new update strategy into our algorithms. In the case where |Q| = n, we follow the update strategy for a general primal-dual IPM algorithm. Otherwise, the update strategy is described below: We define x+ = x +?p?x and (y+ = y +?d?y,s+ = s+?d?s) to be the updates on the primal and dual variables, respectively. Since one of our goals is to satisfy dual feasibility (ATy+ ?c ? 0), we define ? = ATy+ ?c and examine its maximum component, denoted ?lscript, to determine if we are making progress to this goal. In addition, we define a set ?Q to represent the index set of dual constraints that are added to Q when it is determined we are not making progress towards satisfying dual feasibility. The set ?Q is initially empty with ?Q?Q = ?. After each iteration, we have the following possible outcomes: Case 1: Suppose lscript?Q. The quantity ?lscript +s+lscript is exactly the value of the lscriptth component of the updated dual residual, (rd)+lscript . Sincethis component of the residual is givenby (rd)+lscript = (1??d)(rd)lscript, it remainsstrictly positive and is smaller invalue than (rd)lscript. Therefore, we can make progress toward satisfying dual feasibility by setting each component of the dual residual to this value [i.e. r+d = (?lscript+s+lscript )ewhere e?Rfracturn is the vector of all ones]. To ensure every component of the slack vector remains strictly positive (as a result of this change), we define a new update on the slack vector as s++ = r+d ??. Although the primal vector x does not affect the progress toward achieving dual feasibility, 17 it was observed through numerical testing that s++ converged to its optimal value much faster than x+. Since the stopping criterion to our algorithms is based on the duality gap falling below a small tolerance, our algorithms failed to converge in reasonable time. 
2.3 Summary

In this chapter, we stated the optimality conditions for a primal-dual LP, explained how to solve these conditions, and presented the column generation heuristic used in our "reduced" algorithms. In the next two chapters, we will focus on two "reduced" algorithms, redPDAS and redMPC, and state the properties that guarantee their convergence.

Chapter 3
A Reduced Primal-Dual Affine-Scaling (redPDAS) Algorithm

3.1 Affine-Scaling Methods

3.1.1 Overview

Primal-dual affine-scaling methods transform a linear program to an equivalent problem via an affine scaling transformation. The transformation repositions the current point to one well inside the boundary of the feasible region determined by the constraints $x_js_j > 0$, to prevent the progression to the optimal solution from becoming arbitrarily slow. Once the current point is transformed to one close to the "center" of the feasible region, significant progress can be made toward the optimal solution by moving along the search direction (see [19], [20], and [26]).

The primal, dual, and primal-dual affine-scaling methods are explained in this section. In all three cases, the problem is scaled so that the current point (or approximation to the solution) becomes the point e (the vector of all ones), since e is equidistant from the bounds of the region defined by the nonnegativity constraints. A projected steepest-descent (ascent) direction (see Appendix A) is computed to simultaneously decrease (increase) the objective function value and satisfy feasibility. The updated approximation is formed by taking a step from e in the search direction. This point is then transformed back into its original coordinates.

In this chapter, we examine primal-dual affine-scaling algorithms. First we discuss primal and dual variants. Then a general primal-dual affine-scaling (PDAS) algorithm is given in Section 3.2, and a "reduced" version called redPDAS is presented in Section 3.3.
Preliminary results for the convergence of the redPDAS algorithm are provided in Section 3.4. Global and local convergence results follow with specific modifications to the algorithm. Finally, the chapter is summarized in Section 3.5.

3.1.2 The Primal and Dual Affine-Scaling Methods

Consider the linear scaling transformation $\phi_D : \mathbb{R}^n \to \mathbb{R}^n$, where $\phi_D(x) = D^{-1}x$ with positive diagonal scaling matrix D. Here the primal variables x are transformed to $\xi = D^{-1}x$; thus x can be expressed as $D\xi$. The transformation leads to a new primal problem:

$$\min_x\ c^Tx \ \text{ s.t. } Ax = b,\ x \ge 0 \qquad\longmapsto\qquad \min_\xi\ (Dc)^T\xi \ \text{ s.t. } AD\xi = b,\ \xi \ge 0.$$

Assuming primal feasibility ($Ax = b$, $x \ge 0$), we compute a search direction $\Delta\xi$ in the transformed space: a projected steepest-descent direction that maintains feasibility of the primal equality constraints. The steepest-descent direction is given by the negative gradient of the objective function, $-Dc$. Suppose $\xi^+ = \xi + \Delta\xi$, where the superscript + denotes the next iterate. To maintain feasibility, we must satisfy $AD(\xi + \Delta\xi) = b$, or equivalently $AD\Delta\xi = 0$; that is, $\Delta\xi$ must lie in the null space of AD. We define

$$P_{AD} = I - DA^T(AD^2A^T)^{-1}AD \qquad (3.1)$$

to be an orthogonal projection matrix (see Appendix A) for AD. Then we want $\Delta\xi$ to be the projection of the steepest-descent direction onto the null space of AD. The expression for $\Delta\xi$ is given by

$$\Delta\xi = -P_{AD}Dc. \qquad (3.2)$$

Transforming $\xi^+$ back into the original coordinate system, we have

$$x^+ = D\xi^+ = D(\xi + \Delta\xi) = x + D\Delta\xi.$$

The difference between the new iterate and the current iterate defines the search direction in the original coordinate system; we will denote this direction $\Delta x$. We can write $\Delta x = \phi_D^{-1}(\Delta\xi)$ as

$$\Delta x = D\Delta\xi = -DP_{AD}Dc = [-D^2 + D^2A^T(AD^2A^T)^{-1}AD^2]\,c. \qquad (3.3)$$

The second and third equalities follow from (3.2) and (3.1), respectively. To make $\xi = e$, we set $D \equiv X$. This gives the direction $\Delta x$ generated by the primal affine-scaling algorithm.
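A minimal NumPy sketch of this direction, i.e., (3.3) with $D = X$ (the function name is ours):

--------------------------------------------------------------
import numpy as np

def primal_affine_direction(A, c, x):
    d2 = x * x                                          # diagonal of D^2 with D = X
    w = np.linalg.solve((A * d2) @ A.T, A @ (d2 * c))   # (AD^2A^T)^{-1} AD^2 c
    return -d2 * c + d2 * (A.T @ w)                     # [-D^2 + D^2 A^T (AD^2A^T)^{-1} A D^2] c
--------------------------------------------------------------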
There is a similar affine-scaling algorithm for the dual problem. Suppose we have a dual feasible point $(y,s)$: $A^Ty + s = c$, $s \ge 0$. Consider the linear scaling transformation that maps the dual variables s to $\omega = Ds$, where D is positive and diagonal. The variables y are not transformed, since they are unrestricted in value. Then s can be written as $D^{-1}\omega$. Assuming A has full row rank, we can write the transformed problem exclusively in terms of the new variables $\omega$ by solving the dual equality constraints $A^Ty + s = c$ for y and substituting this expression into the problem. The expression for y is obtained by first premultiplying the dual equality constraints by A:

$$A^Ty + s = c, \qquad AA^Ty + As = Ac, \qquad y = (AA^T)^{-1}(Ac - As).$$

The last step follows since A has full row rank. Substituting the expression for y into the objective function of the dual problem gives

$$b^Ty = b^T[(AA^T)^{-1}(Ac - As)] = b^T[(AA^T)^{-1}A(c - D^{-1}\omega)] = b^T(AA^T)^{-1}Ac - b^T(AA^T)^{-1}AD^{-1}\omega,$$

where the second equality uses the relation $s = D^{-1}\omega$. Since we are interested in maximizing the objective function $b^Ty$ of the dual problem over y, this is the same as minimizing $b^T(AA^T)^{-1}AD^{-1}\omega$ with respect to $\omega$; the term $b^T(AA^T)^{-1}Ac$ is constant, so it can be ignored. The dual equality constraints can be expressed as

$$c = A^Ty + s = A^T(AA^T)^{-1}(Ac - As) + s = A^T(AA^T)^{-1}(Ac - AD^{-1}\omega) + D^{-1}\omega.$$

Collecting terms, we have

$$(I - A^T(AA^T)^{-1}A)D^{-1}\omega = (I - A^T(AA^T)^{-1}A)c, \quad\text{or}\quad P_AD^{-1}\omega = P_Ac,$$

where

$$P_A = I - A^T(AA^T)^{-1}A \qquad (3.4)$$

is the orthogonal projection matrix for A. The transformed dual problem becomes:

$$\max_y\ b^Ty \ \text{ s.t. } A^Ty + s = c,\ s \ge 0 \qquad\longmapsto\qquad \min_\omega\ (D^{-1}A^T(AA^T)^{-1}b)^T\omega \ \text{ s.t. } (P_AD^{-1})\omega = P_Ac,\ \omega \ge 0.$$

The steepest-descent direction in the transformed space is given by

$$-D^{-1}A^T(AA^T)^{-1}b. \qquad (3.5)$$

However, to maintain feasibility we must satisfy $P_AD^{-1}(\omega + \Delta\omega) = P_Ac$, or equivalently $P_AD^{-1}\Delta\omega = 0$. By premultiplying the steepest-descent direction in (3.5) by the orthogonal projection matrix $I - P_{AD} = DA^T(AD^2A^T)^{-1}AD$, where $P_{AD}$ is defined in (3.1), we have

$$\Delta\omega = -(I - P_{AD})D^{-1}A^T(AA^T)^{-1}b = -DA^T(AD^2A^T)^{-1}ADD^{-1}A^T(AA^T)^{-1}b = -DA^T(AD^2A^T)^{-1}b.$$

It follows that

$$P_AD^{-1}\Delta\omega = (I - A^T(AA^T)^{-1}A)D^{-1}(-DA^T(AD^2A^T)^{-1}b) = -A^T(AD^2A^T)^{-1}b + A^T(AA^T)^{-1}AA^T(AD^2A^T)^{-1}b = 0.$$

If the next iterate is $\omega^+ = \omega + \Delta\omega$, we have $s^+ = D^{-1}\omega^+ = s + D^{-1}\Delta\omega$ and

$$\Delta s = -A^T(AD^2A^T)^{-1}b. \qquad (3.6)$$

We determine $\Delta y$ to maintain dual feasibility. That is, we must satisfy

$$A^Ty^+ + s^+ = c, \qquad A^T(y^+ - y) + (s^+ - s) = 0, \qquad A^T\Delta y + \Delta s = 0, \qquad (3.7)$$

where the last equation follows from the fact that $A^Ty + s = c$. Premultiplying (3.7) by A and solving for $\Delta y$, we find

$$\Delta y = -(AA^T)^{-1}A\Delta s = (AD^2A^T)^{-1}b. \qquad (3.8)$$

When $D \equiv S^{-1}$, $(\Delta y, \Delta s)$ is the direction generated by the dual affine-scaling algorithm.

3.1.3 The Primal-Dual Affine-Scaling Method

In a primal-dual setting, the scaling matrix is $D \equiv (S^{-1}X)^{1/2}$. The point $XSe$ is repositioned with respect to both nonnegativity constraints, $x > 0$ and $s > 0$. Assuming primal and dual feasibility, $\Delta x$ from (3.3) and $(\Delta y, \Delta s)$ from (3.6) and (3.8) with $D \equiv (S^{-1}X)^{1/2}$ are given by

$$\Delta x = [-S^{-1}X + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}X]\,c$$
$$= [-S^{-1}X + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}X](A^Ty + s)$$
$$= -S^{-1}XA^Ty - S^{-1}Xs + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}XA^Ty + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}Xs$$
$$= -S^{-1}XA^Ty - x + S^{-1}XA^Ty + S^{-1}XA^T(AS^{-1}XA^T)^{-1}Ax$$
$$= -x + S^{-1}XA^T(AS^{-1}XA^T)^{-1}Ax$$
$$= -x + S^{-1}XA^T(AS^{-1}XA^T)^{-1}b,$$

and

$$\Delta y = (AS^{-1}XA^T)^{-1}b, \qquad \Delta s = -A^T(AS^{-1}XA^T)^{-1}b.$$

Simplifying the expressions for the directions, we have

$$\Delta y = (AS^{-1}XA^T)^{-1}b, \qquad \Delta s = -A^T\Delta y, \qquad \Delta x = -x - S^{-1}X\Delta s.$$

The primal-dual affine-scaling direction is exactly the standard Newton step for solving the optimality conditions for the primal or dual problem:
Step lengths ??p and ??d are chosen as the largest 29 value (computed by a ratio test) to satisfy x + ??p?xa ? 0 and s + ??d?sa ? 0. To ensure that the step length does not exceed 1, we set ??p = min{??p,1} and ??d = minbraceleftbig??d,1bracerightbig. To ensurex+ > 0 ands+ > 0, the steplengths are then multiplied by a fixed positive number ? where 0 < ? < 1. Therefore, x and s are updated by computing x+ = x + ?p?xa > 0 and s+ = s + ?d?sa > 0 where parenleftbig?p,?dparenrightbig = ?parenleftbigmin{??p,1},minbraceleftbig??d,1bracerightbigparenrightbig. The dual variables y are unrestricted in value and updated by y+ =y+?d?ya. Since the expressions for the affine-scaling direction depend on the dual resid- uals, the dual residuals must be updated within the algorithms as well. The update for the dual residuals can be written as r+d = ATy+ +s+ ?c = AT(y+?d?ya) + (s+?d?sa)?c = (ATy+s?c) +?d(AT?ya + ?sa) = (1??d)rd. Observe that the expressions for the affine-scaling direction do not depend on the primal residuals. As a result, the primal residuals are not incorporated into the algorithms. The algorithms terminate once x and y are feasible for their respective prob- lems and the duality gap |cTx?bTy| falls below a small tolerance, ?. 30 3.2 A Primal-Dual Affine-Scaling (PDAS) Algorithm In this section, we summarize a general PDAS algorithm (see Monteiro et al. [19]) ???????????????- A General PDAS Algorithm Input: (x,y,s) with x>0 and s>0, 0 ? Compute the affine-scaling direction: ?ya = parenleftbigAS?1XATparenrightbig?1bracketleftbigb?AS?1Xrdbracketrightbig, ?sa = ?rd ?AT?ya, ?xa = ?x?S?1X?sa. Compute the affine step: ??p = ? ??? ??? 1 if ?xaj ? 0, ?j min?xaj<0bracketleftbig?xj/?xajbracketrightbig otherwise , ??d = ?? ?? ??? 1 if ?saj ? 0, ?j min?saj <0bracketleftbig?sj/?sajbracketrightbig otherwise , ?p = ? min(??p,1); ?d =? minparenleftbig??d,1parenrightbig. 31 Update the solution: x+ = x+?p?xa, y+ = y+?d?ya, s+ = s+?d?sa. Update the residuals: r+d = parenleftbig1??dparenrightbigrd. end(while) ???????????????- 3.3 The redPDAS Algorithm The general PDAS algorithm in section 3.2 considers the entire LP data set (A,b,c) at every iteration. In this section, we present a reduced PDAS algorithm, redPDAS. The vectors ??xa ? Rfracturn?1 and (??ya,??sa) ? Rfractur(m+n)?1 define the affine- scaling direction in the redPDAS algorithm. We refer to ??za = (??xa,??ya,??sa) as the approximate affine-scaling direction. The vectors ??xaQ ? Rfracturk?1 and ??saQ ? Rfracturk?1 are composed of the components ??xaj and ??saj, respectively, with j ? Q. Let ?Q represent the index set of dual constraints that are added to Q when it is determined the algorithm is not making progress towards satisfying dual feasibility. To differentiate the scalar quantities ?p and ?d from the general PDAS algorithm, we use the subscript Q to emphasize their computation within this algorithm. ???????????????- 32 The redPDAS Algorithm Input: (x,y,s) with x> 0 and s> 0, ubnd ? 3m, 0 ? Select the most promising dual constraints 3 Q = Q? ?Q, k = |Q|. Compute the affine-scaling direction: ??ya = parenleftbigAQS?1Q XQATQparenrightbig?1bracketleftbigb?AQS?1Q XQ(rd)Qbracketrightbig, (3.11) ??saQ = ?(rd)Q ?ATQ??ya, (3.12) ??xaQ = ?xQ ?S?1Q XQ??saQ. Compute the affine step: ??pQ = ? ?? ??? 1 if (??xaQ)j ? 0, ?j min(??xaQ)j<0bracketleftbig?(xQ)j/(??xaQ)jbracketrightbig otherwise ,(3.13) ??dQ = ? ??? ??? 1 if (??saQ)j ? 0, ?j min(??saQ)j<0bracketleftbig?(sQ)j/(??saQ)jbracketrightbig otherwise ,(3.14) ?pQ =? 
3.3 The redPDAS Algorithm

The general PDAS algorithm in Section 3.2 considers the entire LP data set $(A,b,c)$ at every iteration. In this section, we present a reduced PDAS algorithm, redPDAS. The vectors $\hat{\Delta}x^a \in \mathbb{R}^n$ and $(\hat{\Delta}y^a, \hat{\Delta}s^a) \in \mathbb{R}^{m+n}$ define the affine-scaling direction in the redPDAS algorithm. We refer to $\hat{\Delta}z^a = (\hat{\Delta}x^a, \hat{\Delta}y^a, \hat{\Delta}s^a)$ as the approximate affine-scaling direction. The vectors $\hat{\Delta}x^a_Q \in \mathbb{R}^k$ and $\hat{\Delta}s^a_Q \in \mathbb{R}^k$ are composed of the components $\hat{\Delta}x^a_j$ and $\hat{\Delta}s^a_j$, respectively, with $j \in Q$. Let $\Delta Q$ represent the index set of dual constraints that are added to Q when it is determined the algorithm is not making progress toward satisfying dual feasibility. To differentiate the scalar quantities $\alpha_p$ and $\alpha_d$ from the general PDAS algorithm, we use the subscript Q to emphasize their computation within this algorithm.

--------------------------------------------------------------
The redPDAS Algorithm

Input: $(x,y,s)$ with $x > 0$ and $s > 0$; $ubnd \ge 3m$; $0 < \theta < 1$; tolerance $\epsilon > 0$.

while the stopping criterion is not satisfied

  Select the most promising dual constraints (Q is determined by the subroutine in Section 2.2.3):
    $Q = Q \cup \Delta Q$, $k = |Q|$.

  Compute the affine-scaling direction:
    $\hat{\Delta}y^a = (A_QS_Q^{-1}X_QA_Q^T)^{-1}[b - A_QS_Q^{-1}X_Q(r_d)_Q]$,  (3.11)
    $\hat{\Delta}s^a_Q = -(r_d)_Q - A_Q^T\hat{\Delta}y^a$,  (3.12)
    $\hat{\Delta}x^a_Q = -x_Q - S_Q^{-1}X_Q\hat{\Delta}s^a_Q$.

  Compute the affine step:
    $\bar{\alpha}_{pQ} = 1$ if $(\hat{\Delta}x^a_Q)_j \ge 0\ \forall j$; $\bar{\alpha}_{pQ} = \min_{(\hat{\Delta}x^a_Q)_j < 0}[-(x_Q)_j/(\hat{\Delta}x^a_Q)_j]$ otherwise;  (3.13)
    $\bar{\alpha}_{dQ} = 1$ if $(\hat{\Delta}s^a_Q)_j \ge 0\ \forall j$; $\bar{\alpha}_{dQ} = \min_{(\hat{\Delta}s^a_Q)_j < 0}[-(s_Q)_j/(\hat{\Delta}s^a_Q)_j]$ otherwise;  (3.14)
    $\alpha_{pQ} = \theta\min(\bar{\alpha}_{pQ}, 1)$, $\alpha_{dQ} = \theta\min(\bar{\alpha}_{dQ}, 1)$.  (3.15)

  Create full-length vectors $\hat{\Delta}x^a$ and $\hat{\Delta}s^a$:
    $\hat{\Delta}x^a_j = (\hat{\Delta}x^a_Q)_j$ if $j \in Q$, and 0 otherwise;  (3.16)
    $\hat{\Delta}s^a_j = (\hat{\Delta}s^a_Q)_j$ if $j \in Q$, and 0 otherwise.  (3.17)

  Update the solution and dual residuals:
    $x^+ = x + \alpha_{pQ}\hat{\Delta}x^a$,  (3.18)
    $y^+ = y + \alpha_{dQ}\hat{\Delta}y^a$,
    $s^+ = s + \alpha_{dQ}\hat{\Delta}s^a$;
    if $k < n$, apply the update strategy of Section 2.2.4; otherwise, update the dual residuals as in the general PDAS algorithm.

end(while)
--------------------------------------------------------------

3.4 Convergence Analysis of redPDAS

The analysis in this section relies on assumptions (A1)-(A4); in particular, $A_Q$ has full row rank for every Q considered (A2), and each column of A satisfies $\|a_j\| \le 1$ (A4). Assuming nondegeneracy, at the optimal solution $z^* = (x^*, y^*, s^*)$ there are exactly m indices in $\{j \in N : x^*_j > 0,\ s^*_j = 0\}$ and $n - m$ indices in $\{j \in N : x^*_j = 0,\ s^*_j > 0\}$.

The first two lemmas show that the components of the dual residual, as well as the vectors x and s, remain strictly positive.

Lemma 3.1: Suppose the initial point $z^0$ is defined as in Section 2.2.2, with $(x^0, s^0) > 0$. Then if $z^0$ is not the optimal solution, $r_d^0 > 0$ and $0 < (r_d)^+_j < (r_d)_j$ for all j.

Proof: Since $s^0 = \hat{s} + \hat{\delta}_se$ and $A^T\hat{y} + \hat{s} - c = 0$, we have $r_d^0 = \hat{\delta}_se$; thus to show $r_d^0 > 0$, we must prove $\hat{\delta}_s > 0$. This can be done by showing (1) $\hat{x} + \delta_xe \ge 0$, (2) $\hat{s} + \delta_se \ge 0$, and (3) $\sum_{j=1}^n (\hat{x}_j + \delta_x) > 0$.

(1) If $\min\{\hat{x}_j\} < 0$, then $\delta_x = \max(-1.5\min\{\hat{x}_j\}, 0) = -1.5\min\{\hat{x}_j\} > 0$. Therefore,

$$\hat{x}_j + \delta_x = \hat{x}_j - 1.5\min\{\hat{x}_j\} \ge \min\{\hat{x}_j\} - 1.5\min\{\hat{x}_j\} = -0.5\min\{\hat{x}_j\} > 0.$$

If $\min\{\hat{x}_j\} \ge 0$, then $\delta_x = \max(-1.5\min\{\hat{x}_j\}, 0) = 0$. Therefore,

$$\hat{x}_j + \delta_x = \hat{x}_j \ge \min\{\hat{x}_j\} \ge 0.$$

This proves $\hat{x} + \delta_xe \ge 0$.

(2) A similar proof shows $\hat{s} + \delta_se \ge 0$.

(3) Since $\hat{x} \neq 0$, there exists at least one component of $\hat{x}$ such that $\hat{x}_j \neq 0$. Therefore, $\sum_{j=1}^n (\hat{x}_j + \delta_x) > 0$.

The previous three results imply that

$$\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{x}_j + \delta_x)} \ge 0.$$

If $\min\{\hat{s}_j\} < 0$, then $\delta_s > 0$ and

$$(r_d)^0_j = \hat{\delta}_s = \delta_s + 0.5\left[\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{x}_j + \delta_x)}\right] > 0, \quad \forall j.$$

If $\min\{\hat{s}_j\} \ge 0$, then $\delta_s = 0$ and

$$(r_d)^0_j = \hat{\delta}_s = 0.5\left[\frac{(\hat{x} + \delta_xe)^T\hat{s}}{\sum_{j=1}^n (\hat{x}_j + \delta_x)}\right] > 0, \quad \forall j,$$

since $(\hat{x}, \hat{s}) \neq 0$. This proves $r_d^0 = \hat{\delta}_se > 0$.

For the second part of this proof, let $\gamma = A^Ty^+ - c$. It should be noted that the dual residuals are only updated in the case where the index $\ell$ associated with $\gamma_\ell = \max_j(\gamma_j)$ is in Q. The updated dual residuals are given by

$$r_d^+ = (\gamma_\ell + s^+_\ell)e = [(1 - \alpha_{dQ})(r_d)_\ell]\,e.$$

We know $(r_d)^0_j = \hat{\delta}_s > 0$ for all j. Since the step length $\alpha_{dQ} \in (0,1)$, we have

$$0 < (1 - \alpha_{dQ})(r_d)_j < (r_d)_j, \quad \forall j. \qquad \square$$

Lemma 3.2: Suppose $(x,s) > 0$ and $r_d > 0$. Then (i) $(x^+, s^+) > 0$; (ii) $(x^{++}, s^{++}) > 0$.

Proof: Suppose $x, s > 0$. Then for (i) we have $x^+ = x + \alpha_{pQ}\hat{\Delta}x^a$. Since $\hat{\Delta}x^a_{\bar{Q}} = 0$, $x^+_j = x_j > 0$ for all $j \in \bar{Q}$. We now examine $x^+_j$ with $j \in Q$. By definition,

$$\alpha_{pQ} = \begin{cases} \theta & \text{if } (\hat{\Delta}x^a_Q)_j \ge 0\ \forall j, \\ \theta\min\left(\min_j\left\{\dfrac{-(x_Q)_j}{(\hat{\Delta}x^a_Q)_j} : (\hat{\Delta}x^a_Q)_j < 0\right\},\ 1\right) & \text{otherwise}, \end{cases}$$

where $0 < \theta < 1$. If $(\hat{\Delta}x^a_Q)_j \ge 0$ for all j, then clearly $x^+_Q = x_Q + \alpha_{pQ}\hat{\Delta}x^a_Q > 0$. Otherwise, let

$$t_k = \frac{-(x_Q)_k}{(\hat{\Delta}x^a_Q)_k} = \min_j\left\{\frac{-(x_Q)_j}{(\hat{\Delta}x^a_Q)_j} : (\hat{\Delta}x^a_Q)_j < 0\right\},$$

where k is the index of the minimum ratio. Suppose $(x^+_Q)_j = (x_Q)_j + \alpha_{pQ}(\hat{\Delta}x^a_Q)_j \le 0$ for some j. If $\alpha_{pQ} = \theta t_k$, then $(x^+_Q)_j = (x_Q)_j + \theta t_k(\hat{\Delta}x^a_Q)_j \le 0$. Solving for $t_k$, we have

$$\theta t_k(\hat{\Delta}x^a_Q)_j \le -(x_Q)_j, \qquad t_k \ge \frac{-(x_Q)_j}{\theta(\hat{\Delta}x^a_Q)_j} > \frac{-(x_Q)_j}{(\hat{\Delta}x^a_Q)_j}.$$

This is a contradiction, since $t_k$ is the minimum ratio; so in this case $x^+_Q > 0$. If $\alpha_{pQ} = \theta$, then $\hat{\Delta}x^a_Q \ge 0$ by (3.13) and $x_Q > 0$ by assumption; therefore, $x^+_Q = x_Q + \alpha_{pQ}\hat{\Delta}x^a_Q > 0$. A similar analysis holds for $s^+$. This proves (i).

For (ii), let $\gamma_\ell = \max_j(\gamma_j)$, where $\gamma = A^Ty^+ - c$. If $\ell \in Q$, the updated slack vector $s^{++}$ is given by $s^{++} = r_d^+ - \gamma$, where $r_d^+ = (\gamma_\ell + s^+_\ell)e$. Continuing, we have

$$s^{++}_j = \gamma_\ell + s^+_\ell - \gamma_j \ge s^+_\ell > 0, \quad \forall j.$$

Since

$$x^{++}_j = \begin{cases} x^+_j & j \in Q, \\ \min(s^{++}_j, x^+_j) & j \notin Q, \end{cases}$$

it follows that $x^{++} > 0$. If $\ell \notin Q$,
3.4 Convergence

At the optimal solution $z^* = (x^*, y^*, s^*)$, the index set $N$ partitions into $m$ indices in $\{j : x^*_j > 0,\ s^*_j = 0\}$ and $n - m$ indices in $\{j \in N : x^*_j = 0,\ s^*_j > 0\}$. The first two lemmas show that the components of the dual residual, as well as the vectors $x$ and $s$, remain strictly positive.

Lemma 3.1: Suppose the initial point $z^0$ is defined as in Section 2.2.2, with $(x^0, s^0) > 0$. Then if $z^0$ is not the optimal solution, $r_d^0 > 0$ and $0 < r_d^+ < r_d$ at every subsequent iteration.

Proof: Since $(x^0, s^0) > 0$, we must prove $\hat\delta_s > 0$. This can be done by showing (1) $\tilde x + \delta_x e \ge 0$, (2) $\tilde s + \delta_s e \ge 0$, and (3) $\sum_{j=1}^{n} \tilde x_j + \delta_x > 0$.

(1) If $\min\{\tilde x_j\} < 0$, then $\delta_x = \max(-1.5\min\{\tilde x_j\},\ 0) = -1.5\min\{\tilde x_j\} > 0$. Therefore,
$\tilde x_j + \delta_x = \tilde x_j - 1.5\min\{\tilde x_j\} \ge \min\{\tilde x_j\} - 1.5\min\{\tilde x_j\} = -.5\min\{\tilde x_j\} > 0.$
If $\min\{\tilde x_j\} \ge 0$, then $\delta_x = \max(-1.5\min\{\tilde x_j\},\ 0) = 0$, and $\tilde x_j + \delta_x = \tilde x_j \ge \min\{\tilde x_j\} \ge 0$. This proves $\tilde x + \delta_x e \ge 0$.

(2) A similar proof shows $\tilde s + \delta_s e \ge 0$.

(3) Since $\tilde x \neq 0$, there exists at least one component of $\tilde x$ such that $\tilde x_j \neq 0$. Therefore $\sum_{j=1}^{n} \tilde x_j + \delta_x > 0$.

The previous three results imply that
$\dfrac{(\tilde x + \delta_x e)^T(\tilde s + \delta_s e)}{\sum_{j=1}^{n} \tilde x_j + \delta_x} \ge 0.$
If $\min\{\tilde s_j\} < 0$, then $\delta_s > 0$ and
$(r_d)^0_j = \hat\delta_s = \delta_s + .5\left[\dfrac{(\tilde x + \delta_x e)^T(\tilde s + \delta_s e)}{\sum_{j=1}^{n} \tilde x_j + \delta_x}\right] > 0,\ \forall j.$
If $\min\{\tilde s_j\} \ge 0$, then $\delta_s = 0$ and
$(r_d)^0_j = \hat\delta_s = .5\left[\dfrac{(\tilde x + \delta_x e)^T\tilde s}{\sum_{j=1}^{n} \tilde x_j + \delta_x}\right] > 0,\ \forall j,$
since $(\tilde x, \tilde s) \neq 0$. This proves $r_d^0 = \hat\delta_s e > 0$.

For the second part of this proof, let $\tau = A^T y^+ - c$. It should be noted that the dual residuals are only updated in the case where the index $\ell$ associated with $\tau_\ell = \max_j(\tau_j)$ is in $Q$. The updated dual residuals are given by
$r_d^+ = (\tau_\ell + s^+_\ell)\,e = [(1 - \alpha^d_Q)(r_d)_\ell]\,e.$
We know $(r_d)^0_j = \hat\delta_s > 0$ for all $j$. Since the step length $\alpha^d_Q \in (0,1)$, we have $0 < (1 - \alpha^d_Q)(r_d)_j < (r_d)_j$, $\forall j$. □

Lemma 3.2: Suppose $(x, s) > 0$ and $r_d > 0$. Then (i) $(x^+, s^+) > 0$; (ii) $(x^{++}, s^{++}) > 0$.

Proof: Suppose $x, s > 0$. Then for (i) we have $x^+ = x + \alpha^p_Q \widetilde{\Delta x}^a$. Since $\widetilde{\Delta x}^a_{\bar Q} = 0$, $x^+_j = x_j > 0$ for all $j \in \bar Q$. We now examine $x^+_j$ with $j \in Q$. By definition,
$\alpha^p_Q = \gamma$ if $(\widetilde{\Delta x}^a_Q)_j \ge 0\ \forall j$, and $\alpha^p_Q = \gamma\,\min\Big(\min_j\Big\{\dfrac{-(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j} : (\widetilde{\Delta x}^a_Q)_j < 0\Big\},\ 1\Big)$ otherwise,
where $0 < \gamma < 1$. In the first case, $x^+_Q = x_Q + \gamma\,\widetilde{\Delta x}^a_Q > 0$. Otherwise, let
$t_k = \dfrac{-(x_Q)_k}{(\widetilde{\Delta x}^a_Q)_k} = \min_j\Big\{\dfrac{-(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j} : (\widetilde{\Delta x}^a_Q)_j < 0\Big\},$
where $k$ is the index of the minimum ratio. Suppose $(x^+_Q)_j = (x_Q)_j + \alpha^p_Q(\widetilde{\Delta x}^a_Q)_j \le 0$ for some $j$. If $\alpha^p_Q = \gamma t_k$, then $(x^+_Q)_j = (x_Q)_j + \gamma t_k(\widetilde{\Delta x}^a_Q)_j \le 0$. Solving for $t_k$, we have
$\gamma t_k(\widetilde{\Delta x}^a_Q)_j \le -(x_Q)_j, \quad t_k \ge \dfrac{-(x_Q)_j}{\gamma(\widetilde{\Delta x}^a_Q)_j} > \dfrac{-(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j}.$
This is a contradiction, since $t_k$ is the minimum ratio; so in this case $x^+_Q > 0$. If $\alpha^p_Q = \gamma$, then $\widetilde{\Delta x}^a_Q \ge 0$ by (3.13) and $x_Q > 0$ by assumption. Therefore $x^+_Q = x_Q + \alpha^p_Q\widetilde{\Delta x}^a_Q > 0$. A similar analysis holds for $s^+$. This proves (i).

For (ii), let $\tau_\ell = \max_j(\tau_j)$ where $\tau = A^T y^+ - c$. If $\ell \in Q$, the updated slack vector $s^{++}$ is given by $s^{++} = r_d^+ - \tau$, where $r_d^+ = \tau_\ell + s^+_\ell$. Continuing, we have
$s^{++} = r_d^+ - \tau = (\tau_\ell + s^+_\ell)\,e - \tau.$
So, for all $j$, $s^{++}_j = \tau_\ell + s^+_\ell - \tau_j \ge \tau_j + s^+_\ell - \tau_j = s^+_\ell > 0$. Since
$x^{++}_j = x^+_j$ for $j \in Q$, and $x^{++}_j = \min(s^{++}, x^+)_j$ for $j \notin Q$,
it follows that $x^{++} > 0$. If $\ell \in \bar Q$, we return to the previous iterate and recompute the solution. In this case $(x^{++}, s^{++}) = (x, s) > 0$. □

The next two lemmas show that the reduced form of the normal-equation matrix and the augmented Jacobian of $F$ are nonsingular.

Lemma 3.3: Suppose (A2) holds and $X_Q, S_Q > 0$. Then $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite.

Proof: Suppose $v$ is any nonzero vector in $\mathbb{R}^m$. Then $v^T A_Q S_Q^{-1} X_Q A_Q^T v = \|D_Q A_Q^T v\|^2$, where $D_Q^2 = S_Q^{-1} X_Q$. Our assumption implies $D_Q$ is positive definite. Since $A_Q$ has full row rank and $v \neq 0$, $A_Q^T v \neq 0$. Therefore $\|D_Q A_Q^T v\|^2 > 0$ and $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite. □

In Chapter 2, we defined the vector
$F(x, y, s) = \begin{bmatrix} r_p \\ r_d \\ XSe \end{bmatrix},$
where $r_p = Ax - b$ and $r_d = A^T y + s - c$. The Jacobian of $F$, denoted $J(A, x, s)$, is given by
$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}.$
By eliminating $\Delta s$ in the linear system $J(A, x, s)\,\Delta z = -F(x, y, s)$, we have
$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = -\begin{bmatrix} r_p \\ X(s - r_d) \end{bmatrix}.$
The matrix
$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}$
is referred to as the augmented Jacobian of $F$, denoted $J_a(A, x, s)$. The next lemma shows $J_a(A_Q, x_Q, s_Q)$ is nonsingular.

Lemma 3.4: Suppose (A2) holds and $x_Q, s_Q > 0$. Then the augmented Jacobian of $F(x_Q, y, s_Q)$,
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix},$
is nonsingular.

Proof: Assume the system given by
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
has a nonzero solution. This gives the equations
$A_Q \psi_2 = 0,$  (3.19)
$X_Q A_Q^T \psi_1 - S_Q \psi_2 = 0,$  (3.20)
for some vectors $\psi_1$ and $\psi_2$. Solving equation (3.20) for $\psi_2$ and substituting this expression into (3.19) gives $A_Q S_Q^{-1} X_Q A_Q^T \psi_1 = 0$. By Lemma (3.3), $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite and therefore nonsingular, so $\psi_1 = 0$. Equation (3.20) then reduces to $S_Q \psi_2 = 0$, which implies $\psi_2 = 0$ since $s_Q > 0$. This contradicts our assumption that $[\psi_1, \psi_2]^T$ is nonzero, and the result follows. □
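Because Lemma 3.3 guarantees that $A_Q S_Q^{-1} X_Q A_Q^T$ is symmetric positive definite, the reduced normal equations can be solved with a Cholesky factorization. A minimal sketch (SciPy assumed; the function name and interface are ours, for illustration):

--------------------------------
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_reduced_normal_eq(AQ, xQ, sQ, rhs):
    """Solve (A_Q S_Q^{-1} X_Q A_Q^T) dy = rhs via Cholesky; valid because
    the matrix is positive definite when x_Q, s_Q > 0 and A_Q has full
    row rank (Lemma 3.3)."""
    M = (AQ * (xQ / sQ)) @ AQ.T
    return cho_solve(cho_factor(M), rhs)
--------------------------------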
We now show that the norm of the difference between the standard direction computed in PDAS and the direction computed in redPDAS is bounded. The dual residual vector associated with the redPDAS algorithm is denoted by $\tilde r_d$. All other notation is standard.

Theorem 3.1: Suppose assumptions (A1)–(A3) hold. Given $\rho > 0$, let $\Omega_\rho = \{z : \|z - z^*\| < \rho,\ x > 0,\ s > 0\}$. Given $z \in \Omega_\rho$, let $\Delta z^a$ be the Newton direction at $z$. Then there exists $\beta > 0$ such that
$\|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\|,$
for all $z \in \Omega_\rho$.

Proof: We use a strategy similar to the one in Tits et al. [24]. Let $z \in \Omega_\rho$. The direction $\widetilde{\Delta z}^a$ is given by
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \widetilde{\Delta y}^a \\ \widetilde{\Delta x}^a_Q \end{bmatrix} = \begin{bmatrix} b - A_Q x_Q \\ X_Q[(\tilde r_d)_Q - s_Q] \end{bmatrix}.$  (3.21)
The PDAS Newton direction is given by
$\begin{bmatrix} 0 & A \\ -XA^T & S \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a \end{bmatrix} = \begin{bmatrix} b - Ax \\ X[r_d - s] \end{bmatrix}.$
Partitioning the previous expression in terms of $Q$ and $\bar Q$ gives
$\begin{bmatrix} 0 & A_Q & A_{\bar Q} \\ -X_Q A_Q^T & S_Q & 0 \\ -X_{\bar Q} A_{\bar Q}^T & 0 & S_{\bar Q} \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \\ \Delta x^a_{\bar Q} \end{bmatrix} = \begin{bmatrix} b - A_Q x_Q - A_{\bar Q} x_{\bar Q} \\ X_Q[(r_d)_Q - s_Q] \\ X_{\bar Q}[(r_d)_{\bar Q} - s_{\bar Q}] \end{bmatrix}.$
Premultiplying the third row by $-A_{\bar Q} S_{\bar Q}^{-1}$ and adding to the first row, we can eliminate $\Delta x^a_{\bar Q}$ to obtain the following system of equations:
$\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} = \begin{bmatrix} b - A_Q x_Q \\ X_Q[(\tilde r_d)_Q - s_Q] \end{bmatrix} - \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} (r_d)_{\bar Q} \\ X_Q[(\tilde r_d)_Q - (r_d)_Q] \end{bmatrix}.$  (3.22)
The first term on the right-hand side of (3.22) is exactly the right-hand side of (3.21). The last $k$ components of the second vector are zero, since $(\tilde r_d)_Q = (r_d)_Q$. Combining (3.21) and (3.22), we have
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \widetilde{\Delta y}^a \\ \widetilde{\Delta x}^a_Q \end{bmatrix} = \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} + \Psi,
\quad\text{where}\quad
\Psi = \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} (r_d)_{\bar Q} \\ 0 \end{bmatrix}.$
Since
$\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix} = \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix},$
and since $J_a(A_Q, x_Q, s_Q)$ is nonsingular by Lemma 3.4, we can express the difference between the two directions by
$\begin{bmatrix} \widetilde{\Delta y}^a \\ \widetilde{\Delta x}^a_Q \end{bmatrix} - \begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} = J_a(A_Q, x_Q, s_Q)^{-1}\left(\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} + \Psi\right).$  (3.23)
The norm of the left-hand side of (3.23) is bounded by the sum of the norms of the individual terms on the right-hand side. Since $J_a(A_Q, x_Q, s_Q)$ is nonsingular for all $Q$ and there are only a finite number of choices of $Q$,
$\|J_a(A_Q, x_Q, s_Q)^{-1}\| \le \varphi_0$
for some $\varphi_0 > 0$. The norm of the block matrix satisfies
$\left\|\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\| = \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T\| = \max_{\|v\|=1}\|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T v\|$, where $v \in \mathbb{R}^m$,
$= \max_{\|v\|=1}\Big\|\sum_{j\in\bar Q}\frac{x_j}{s_j}\,a_j a_j^T v\Big\|
\le \max_{\|v\|=1}\sum_{j\in\bar Q}\Big|\frac{x_j}{s_j}\Big|\,|a_j^T v|\,\|a_j\|$, by the triangle inequality,
$\le \sum_{j\in\bar Q}\Big|\frac{x_j}{s_j}\Big|$, by Assumption (A4),
$= \|S_{\bar Q}^{-1} X_{\bar Q}\|_1 \le \sqrt{n-k}\,\|S_{\bar Q}^{-1}\|\,\|X_{\bar Q}\|$, since $\|h\|_1 \le \sqrt n\,\|h\|_2$ for all $h \in \mathbb{R}^n$,
$= \sqrt{n-k}\,\|S_{\bar Q}^{-1}\|\,\|X_{\bar Q} - X^*_{\bar Q}\|$, since $X^*_{\bar Q} = 0$,
$\le \sqrt{n-k}\,\|S_{\bar Q}^{-1}\|\,\|z - z^*\|.$
The relation in the third-to-last step can be found, for example, in [10] or [27]. Let
$\|S_{\bar Q}^{-1}\| = \max_{j\in\bar Q}(1/s_j) = \varphi_1.$  (3.24)
Then,
$\left\|J_a(A_Q, x_Q, s_Q)^{-1}\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix}\right\|
\le \|J_a(A_Q, x_Q, s_Q)^{-1}\|\,\left\|\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\,\left\|\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix}\right\|
\le \varphi_0\,\sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\| = \varphi_2\,\|z - z^*\|\,\|\Delta z^a\|,$  (3.25)
where $\varphi_2 = \varphi_0\,\varphi_1\,\sqrt{n-k}$. This gives a bound for the norm of the first term in (3.23).

Examining the norm of the second term in (3.23), we find
$\|\Psi\| = \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} (r_d)_{\bar Q}\| = \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q}(-A_{\bar Q}^T\Delta y^a - \Delta s^a_{\bar Q})\|
\le \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T\|\,\|\Delta y^a\| + \|A_{\bar Q}\|\,\|S_{\bar Q}^{-1}\|\,\|X_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\|
\le \sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\| + \sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\|,$
since $\|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T\| \le \sqrt{n-k}\,\varphi_1\,\|z - z^*\|$, $\|S_{\bar Q}^{-1}\| = \varphi_1$, and $\|X_{\bar Q}\| = \|X_{\bar Q} - X^*_{\bar Q}\| \le \|z - z^*\|$. The substitution for $(r_d)_{\bar Q}$ follows since $r_d = -A^T\Delta y^a - \Delta s^a$ in the PDAS algorithm.

The norm of $A_{\bar Q}$ is bounded by
$\|A_{\bar Q}\| = \max_{v \neq 0}\frac{\|A_{\bar Q} v\|}{\|v\|}$, where $v \in \mathbb{R}^{n-k}$,
$= \max_{v \neq 0}\frac{\|\sum_{j\in\bar Q} a_j v_j\|}{\|v\|} \le \max_{v \neq 0}\frac{\sum_{j\in\bar Q}\|a_j\|\,|v_j|}{\|v\|} \le \max_{v \neq 0}\frac{\|v\|_1}{\|v\|} \le \max_{v \neq 0}\frac{\sqrt{n-k}\,\|v\|}{\|v\|} = \sqrt{n-k},$
where the second inequality uses $\|a_j\| \le 1$ (Assumption (A4)). Therefore,
$\|J_a(A_Q, x_Q, s_Q)^{-1}\Psi\| \le \|J_a(A_Q, x_Q, s_Q)^{-1}\|\,\|\Psi\| \le \varphi_0\cdot 2\sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\| = 2\varphi_2\,\|z - z^*\|\,\|\Delta z^a\|.$  (3.26)
Combining (3.23), (3.25), and (3.26), we have
$\|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| \le 3\varphi_2\,\|z - z^*\|\,\|\Delta z^a\|.$
Setting $\beta = 3\varphi_2$ gives the desired result. □

Global and local quadratic convergence proofs can be given, provided specific modifications are made to the redPDAS algorithm. These modifications follow from Tits et al. [24], where a complete convergence analysis can be found.
Before we explain these changes, we summarize the differences between the redPDAS algorithm and its modified version in Table 3.1:

Table 3.1
                    redPDAS algorithm                              Modified redPDAS algorithm
Feasibility:        Accepts a dual-infeasible initial point;       Requires a dual-feasible point
                    strives to achieve dual feasibility            at each iteration
Terms with Q:       $A_Q D_Q^2 A_Q^T$, $\widetilde{\Delta x}^a_Q$, $\widetilde{\Delta s}^a_Q$,   $A_Q D_Q^2 A_Q^T$, $\widetilde{\Delta x}^a_Q$
                    $x_Q$, $s_Q$, $(r_d)_Q$
Dual affine step:   $\alpha^d_Q = \gamma\min(\hat\alpha^d_Q, 1)$   $\alpha^d_Q = \min(\max\{\gamma\hat\alpha^d_Q,\ \hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|\},\ 1)$
where $0 < \gamma < 1$.

The first critical change is that the modified algorithm requires a dual-feasible point at each iteration; that is, $s^i = c - A^T y^i > 0$ and $r_d^i = 0$ for all $i$. Consequently, the algorithm for updating $(x^+, s^+)$ and $r_d$ has no effect on this dual-feasible algorithm and can be discarded. Furthermore, equations (3.12) and (3.14) must be computed with the full data to satisfy the above requirement on the slack variables. As a result, equation (3.17) must be eliminated from the algorithm.

The second critical change is the update scheme for the primal variables. The $\widetilde{\Delta x}^a_Q$ component of the normal equations remains the same; however, the full-length vector $\widetilde{\Delta x}^a$ in (3.16) is determined before the "Compute the affine step" section, and the primal affine step (3.13) and the updated primal solution (3.18) can be replaced with
$\check x = x + \widetilde{\Delta x}^a$, $[\check x^-]_j = \min\{\check x_j, 0\}$, and
$x^+_j = \min\big\{\max\big(\min\{\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2,\ x_j\},\ \check x_j\big),\ x_{\max}\big\}\quad \forall j \in N,$  (3.29)
respectively, with $x > 0$ and $x_j \le x_{\max}$ for all $j \in N$. The key to the global convergence analysis is the availability of a dual-feasible point at every iteration, along with the condition, $\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2$, on the primal updates. A local quadratic convergence analysis follows provided the above conditions hold and the bound $\hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|$ is imposed on the dual affine step in equation (3.15). The modified redPDAS algorithm is presented below:

--------------------------------
The Dual-Feasible redPDAS Algorithm

Input: $(x, y, s)$ with $x > 0$ and $s = c - A^T y > 0$; $x_{\max} > 0$ such that $x_j \le x_{\max}$ $\forall j \in N$; $u_{\mathrm{bnd}} \ge 3m$; $\gamma \in (0,1)$; tolerance $\epsilon > 0$.
while (the termination criterion is not satisfied)
  Select the most promising dual constraints (the set $Q$ is determined by the algorithm in Section 2.2.3: Selection of $Q$):
    $Q = Q \cup \delta Q$.
  Compute the affine-scaling direction:
    $\widetilde{\Delta y}^a = (A_Q S_Q^{-1} X_Q A_Q^T)^{-1}\,b$,
    $\widetilde{\Delta s}^a = -A^T \widetilde{\Delta y}^a$,
    $\widetilde{\Delta x}^a_Q = -x_Q - S_Q^{-1} X_Q \widetilde{\Delta s}^a_Q$.
  Create the full vector $\widetilde{\Delta x}^a$:
    $\widetilde{\Delta x}^a_j = (\widetilde{\Delta x}^a_Q)_j$ if $j \in Q$, and $0$ otherwise.
  Set $\check x = x + \widetilde{\Delta x}^a$ and, for $j \in N$, define $[\check x^-]_j = \min\{\check x_j, 0\}$.
  Compute the affine step:
    $\hat\alpha^d_Q = 1$ if $\widetilde{\Delta s}^a_j \ge 0\ \forall j$, else $\min_{\widetilde{\Delta s}^a_j < 0}\,[-s_j/\widetilde{\Delta s}^a_j]$;
    $\alpha^d_Q = \min(\max\{\gamma\hat\alpha^d_Q,\ \hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|\},\ 1)$.
  Update the solution:
    $x^+_j = \min\{\max(\min\{\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2,\ x_j\},\ \check x_j),\ x_{\max}\}$ $\forall j \in N$,
    $y^+ = y + \alpha^d_Q \widetilde{\Delta y}^a$,
    $s^+ = s + \alpha^d_Q \widetilde{\Delta s}^a$.
end (while)
--------------------------------
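The safeguarded primal update can be sketched directly from (3.29). Because the source form of (3.29) is partially reconstructed, the sketch below should be read as illustrative only, not as the definitive update rule:

--------------------------------
import numpy as np

def dual_feasible_primal_update(x, dx_tilde, dy, x_max):
    """Safeguarded primal update following the form of (3.29) given above
    (a reconstruction): trial-point components that fall too low are floored
    by a proximity measure, and all components are capped at x_max."""
    xc = x + dx_tilde                       # trial point, \check{x}
    xneg = np.minimum(xc, 0.0)              # \check{x}^-
    prox = dy @ dy + xneg @ xneg            # ||dy||^2 + ||\check{x}^-||^2
    lo = np.minimum(prox, x)                # elementwise min with previous x_j
    return np.minimum(np.maximum(lo, xc), x_max)
--------------------------------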
Under Assumptions (A1), (A4), and (A5) and the aforementioned modifications to the redPDAS algorithm, we can produce a proof of global convergence that follows from Tits et al. [24]. We give a brief outline here.

Lemmas (3.2) and (3.3) can be extended to the modified redPDAS algorithm to show the algorithm is well-defined. As a result, the $\widetilde{\Delta y}^a$-component of the normal equations can justifiably be expressed as $\widetilde{\Delta y}^a = (A_Q S_Q^{-1} X_Q A_Q^T)^{-1}\,b$. Since it is assumed $b \neq 0$, the sequence of dual objective function values $\{b^T y\}$ is nondecreasing. Let $K$ be an infinite index set. If we assume $\{y\}$ converges to $y^*$ and $\{\widetilde{\Delta y}^a\}$ converges to 0 on $K$, then it can be shown that $y^*$ is stationary (i.e., $Ax^* = b$, $X^* s^* = 0$) and $\{\check x\}$ converges to $x^*$, where $x^*$ is the unique Lagrange multiplier associated with $y^*$.

Proving $\{y\}$ converges to $F^*$ (global convergence) requires two main steps. The first step is to prove $\{y\}$ converges to the set of stationary points $y^*$ of the dual LP problem. This is achieved by a contradiction argument: if $\{y\}$ converges to a nonstationary point on $K$, then $\{\widetilde{\Delta y}^a\}$ must converge to 0 on $K$. The proof is largely dependent on the bound, $\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2$, in the primal updates. The second step is to prove that the multiplier vectors (the set of $x \in \mathbb{R}^n$ such that $Ax = b$ and $X(c - A^T y) = 0$) associated with all limit points of $\{y\}$ are the same. Using the proofs from steps 1 and 2, along with the fact that $\{b^T y\}$ is nondecreasing, it can be shown that $\{y\}$ converges to $F^*$.

With the additional assumption (A6), local q-quadratic convergence of the pair $(x, y)$ can be shown. (The sequence $\{(x^i, y^i)\}$ converges q-quadratically to $(x^*, y^*)$ if it converges to $(x^*, y^*)$ and there exists a constant $\theta$ such that $\|(x^{i+1}, y^{i+1}) - (x^*, y^*)\| \le \theta\,\|(x^i, y^i) - (x^*, y^*)\|^2$.) Assumption (A6) implies that strict complementarity holds, i.e., $x^*_j > 0$ and $s^*_j = c_j - a_j^T y^* = 0$ for all $j \in I(y^*)$. Furthermore, $\mathrm{span}(\{a_j : j \in I(y^*)\}) = \mathbb{R}^m$. We can extend the above results to Lemma 3.4 to show $J(A_Q, x^*_Q, s^*_Q)$ is nonsingular. This, in addition to Lemmas 1 and 14 and Proposition 15, all in Tits et al. [24], along with Lemma (3.1) with $r_d = 0$ and the bound on the dual affine step, $\hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|$, provides the necessary tools to complete the q-quadratic convergence analysis.

3.5 Summary

In this chapter, we introduced the redPDAS algorithm and explained how specific modifications to the algorithm provide global and local q-quadratic convergence results. With further research, we hope to prove similar results for the unmodified redPDAS algorithm. Numerical results from redPDAS are given in Chapter 5. Before presenting the results, we present the redMPC algorithm.

Chapter 4
A Reduced Mehrotra Predictor-Corrector (redMPC) Algorithm

4.1 MPC Method

4.1.1 Overview

Mehrotra's Predictor-Corrector algorithm generates a sequence of approximate solutions, $z(\sigma) = (x(\sigma), y(\sigma), s(\sigma))$, to the perturbed KKT conditions:
$Ax(\sigma) - b = 0,$  (4.1)
$A^T y(\sigma) + s(\sigma) - c = 0,$  (4.2)
$X(\sigma)S(\sigma)e = \sigma\mu e,$  (4.3)
$(x(\sigma), s(\sigma)) > 0,$  (4.4)
where $0 \le \sigma \le 1$ and $\mu = x^T s/n > 0$. The conditions here differ from the KKT conditions in (2.3) in that the solution $z(\sigma)$ is uniquely defined for each $\sigma > 0$ and the pairwise products satisfy $x_j(\sigma)s_j(\sigma) = \sigma\mu$ for each $j$. We define $\mathcal{C} = \{z(\sigma) \mid \sigma > 0\}$ as the central path. The central path defines a trajectory of feasible solutions that steer clear of the boundary of the primal-dual feasible region. As $\sigma\mu$ decreases to zero, $\mathcal{C}$ converges to a primal-dual solution of the linear program (or of the KKT conditions in (2.3)).
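The perturbed conditions (4.1)–(4.3) are straightforward to monitor numerically. The following sketch evaluates their residuals at a given iterate (the function name and interface are ours, for illustration):

--------------------------------
import numpy as np

def perturbed_kkt_residuals(A, b, c, x, y, s, sigma):
    """Residuals of (4.1)-(4.3); all three vanish on the central path
    for the given sigma."""
    mu = x @ s / len(x)          # duality measure mu = x^T s / n
    rp = A @ x - b               # (4.1)
    rd = A.T @ y + s - c         # (4.2)
    rc = x * s - sigma * mu      # (4.3): XSe - sigma*mu*e
    return rp, rd, rc, mu
--------------------------------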
Mehrotra's method differs from other interior-point methods in that two linear systems are solved at each iteration: one for the affine-scaling (or predictor) direction, $\Delta z^a$, and one for the centering-corrector direction. The predictor direction is obtained by solving (3.10). The centering-corrector direction, $\Delta z^{cc} = (\Delta x^{cc}, \Delta y^{cc}, \Delta s^{cc})$, is calculated based on the amount of progress the predictor direction has made in reducing $\mu$ and on the error (or nonlinear term) in (2.8). This direction is found by solving a slightly different system:
$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}\begin{bmatrix} \Delta x^{cc} \\ \Delta y^{cc} \\ \Delta s^{cc} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \sigma\mu e - \Delta X^a \Delta S^a e \end{bmatrix},$  (4.5)
where we assume for convenience that $(x, y, s)$ is primal and dual feasible. By premultiplying the second block of equations by $-X$ and adding this to the third block of equations, we can eliminate $\Delta s^{cc}$ from the linear system. The result is the following equivalent system:
$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}\begin{bmatrix} \Delta x^{cc} \\ \Delta y^{cc} \end{bmatrix} = \begin{bmatrix} 0 \\ \sigma\mu e - \Delta X^a \Delta S^a e \end{bmatrix},$
with $\Delta s^{cc} = -A^T\Delta y^{cc}$. By further eliminating $\Delta x^{cc}$ (premultiplying the second block of equations by $-AS^{-1}$ and adding this to the first block of equations), we have
$\Delta y^{cc} = -(AS^{-1}XA^T)^{-1} A S^{-1}(\sigma\mu e - \Delta X^a\Delta S^a e),$
$\Delta s^{cc} = -A^T\Delta y^{cc},$
$\Delta x^{cc} = S^{-1}(\sigma\mu e - \Delta X^a\Delta S^a e) - S^{-1}X\Delta s^{cc}.$

The advantage of computing the centering-corrector direction is that we are able to improve our linear, first-order model of $F$ in (2.3) to a quadratic, second-order model. In addition, Mehrotra's method uses the same coefficient matrix to solve for the two directions, so the extra cost of forming this second direction is that of performing a single back-substitution. This additional cost is small. The sum of the predictor and centering-corrector directions is the search direction used to derive an improved solution to the perturbed KKT conditions.

For the remainder of the chapter, we focus on variants of Mehrotra's Predictor-Corrector (MPC) algorithm. A general MPC algorithm is given in Section 4.2, and a "reduced" version called redMPC is presented in Section 4.3. Preliminary results for the convergence of the redMPC algorithm are provided in Section 4.4. Global and local convergence results follow with certain modifications to the algorithm. Finally, the chapter is summarized in Section 4.5.

4.1.2 Centering-Corrector Direction

The components of the centering-corrector direction are based on a centering parameter, $\sigma$, and a corrector direction derived from the nonlinear term $\Delta X^a\Delta S^a e$. We explain the significance of these terms here.

The affine-scaling direction aims to satisfy the KKT conditions by moving toward the boundary of the feasible region defined by the nonnegative pairwise products, $x_j s_j > 0$. If this direction makes significant progress in reducing the duality measure, $\mu$, while satisfying the conditions $(x, s) > 0$, we will be in good proximity of the central path and closer to satisfying the KKT conditions (since $\mu \approx 0$). Therefore, our solution will require little centering (or movement toward the central path). If the affine-scaling direction makes little progress in reducing $\mu$, then we will need a significant amount of centering so that the algorithm can make better progress in reducing $\mu$ on the next iteration.

To measure the efficiency of the affine-scaling direction, we compare the hypothetical value of $\mu$ based on this direction to the previous value of $\mu$. We define
$\mu_a = \frac{(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a)}{n}$
to be the hypothetical value of $\mu$ based on the affine-scaling direction, and the centering parameter
$\sigma = \left[\frac{\mu_a}{\mu}\right]^3 = \left[\frac{(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a)}{n\mu}\right]^3$
to be the cube of the ratio of $\mu_a$ and $\mu$. (Although it is not a requirement to cube the ratio, Mehrotra [17] found this heuristic for determining $\sigma$ most efficient through exhaustive computational testing.) If $\mu_a \ll \mu$, the affine-scaling direction makes good progress in reducing $\mu$ and little centering is needed; $\sigma$ is then insignificant and is set close to 0. If $\mu_a$ is only a little smaller than $\mu$, $\sigma$ is set close to 1 so that the algorithm can be in a better position to reduce $\mu$ on the next iteration.
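Mehrotra's heuristic for $\sigma$ is a one-line computation once the trial affine step lengths are known. A minimal sketch (illustrative):

--------------------------------
import numpy as np

def centering_parameter(x, s, dx_a, ds_a, ap_a, ad_a):
    """sigma = (mu_a / mu)^3, where mu_a is the duality measure after a
    trial step along the affine-scaling (predictor) direction."""
    n = len(x)
    mu = x @ s / n
    mu_a = (x + ap_a * dx_a) @ (s + ad_a * ds_a) / n
    return (mu_a / mu) ** 3
--------------------------------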
The centering-corrector direction also uses the affine-scaling direction to compensate for the error in the linear system, (2.4). We can interpret finding a solution $z^+ = z + \Delta z$ to the KKT conditions as calculating $\Delta z$ to satisfy the primal constraints, the dual constraints, and the complementary slackness condition [see Chapter 2, equations (2.6)–(2.8)], given by
$(x_j + \Delta x_j)(s_j + \Delta s_j) = 0,\ \forall j,$
or equivalently
$x_j s_j + x_j\Delta s_j + s_j\Delta x_j + \Delta x_j\Delta s_j = 0,\ \forall j,$
$X\Delta s + S\Delta x = -XSe - \Delta X\Delta S e.$  (4.6)
In computing the affine-scaling direction, the last term in (4.6) is ignored; that is, $X\Delta s^a + S\Delta x^a = -XSe$. Since applying a full step along the affine-scaling direction to (4.6) gives
$X\Delta s^a + S\Delta x^a + XSe + \Delta X^a\Delta S^a e = \Delta X^a\Delta S^a e,$
we can define a corrector direction, $\Delta z^{cor} = (\Delta x^{cor}, \Delta y^{cor}, \Delta s^{cor})$, to compensate for this error term:
$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}\begin{bmatrix} \Delta x^{cor} \\ \Delta y^{cor} \\ \Delta s^{cor} \end{bmatrix} = -\begin{bmatrix} 0 \\ 0 \\ \Delta X^a\Delta S^a e \end{bmatrix}.$
The centering parameter and corrector direction must be computed after the affine-scaling direction, due to the fact that they are both dependent on it. In addition, the centering and corrector components can be combined into one step, since they are obtained by solving linear systems with the same coefficient matrix. The result is the linear system in (4.5) for calculating the centering-corrector direction.

4.2 A MPC Algorithm

In this section, we present a general MPC algorithm similar to the algorithm in [29].

--------------------------------
A General MPC Algorithm

Input: $(x, y, s)$ with $x > 0$ and $s > 0$; $\gamma \in (0,1)$; tolerance $\epsilon > 0$.
while (the termination criterion is not satisfied)
  Compute the affine-scaling direction:
    $\Delta y^a = (AS^{-1}XA^T)^{-1}\,[\,b - AS^{-1}X r_d\,]$,
    $\Delta s^a = -r_d - A^T\Delta y^a$,
    $\Delta x^a = -x - S^{-1}X\Delta s^a$.
  Compute the affine step:
    $\hat\alpha^p_a = 1$ if $\Delta x^a_j \ge 0\ \forall j$, else $\min_{\Delta x^a_j < 0}\,[-x_j/\Delta x^a_j]$;
    $\hat\alpha^d_a = 1$ if $\Delta s^a_j \ge 0\ \forall j$, else $\min_{\Delta s^a_j < 0}\,[-s_j/\Delta s^a_j]$;
    $\alpha^p_a = \min(\hat\alpha^p_a, 1)$, $\alpha^d_a = \min(\hat\alpha^d_a, 1)$.
  Compute the centering parameter:
    $\sigma = \left[(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a)/(n\mu)\right]^3$.
  Compute the centering-corrector direction:
    $w = \sigma\mu S^{-1}e - S^{-1}\Delta X^a\Delta S^a e$,
    $\Delta y^{cc} = -(AS^{-1}XA^T)^{-1}Aw$,
    $\Delta s^{cc} = -A^T\Delta y^{cc}$,
    $\Delta x^{cc} = w - S^{-1}X\Delta s^{cc}$.
  Compute the predictor-corrector direction:
    $\Delta x = \Delta x^a + \Delta x^{cc}$, $\Delta y = \Delta y^a + \Delta y^{cc}$, $\Delta s = \Delta s^a + \Delta s^{cc}$.
  Compute the predictor-corrector step:
    $\hat\alpha_p = 1$ if $\Delta x_j \ge 0\ \forall j$, else $\min_{\Delta x_j < 0}\,[-x_j/\Delta x_j]$;
    $\hat\alpha_d = 1$ if $\Delta s_j \ge 0\ \forall j$, else $\min_{\Delta s_j < 0}\,[-s_j/\Delta s_j]$;
    $\alpha_p = \gamma\,\min(\hat\alpha_p, 1)$, $\alpha_d = \gamma\,\min(\hat\alpha_d, 1)$.
  Update the variables:
    $x^+ = x + \alpha_p\Delta x$, $y^+ = y + \alpha_d\Delta y$, $s^+ = s + \alpha_d\Delta s$.
  Update quantities:
    $r_d^+ = (1 - \alpha_d)\,r_d$, $\mu^+ = (x^+)^T(s^+)/n$.
end (while)
--------------------------------
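Because the corrector solve reuses the coefficient matrix of the predictor solve, its marginal cost is a single back-substitution. A sketch of this reuse (SciPy assumed; the auxiliary vector $w$ follows the formulas above):

--------------------------------
import numpy as np
from scipy.linalg import cho_solve

def mpc_corrector(A, x, s, dx_a, ds_a, sigma, mu, M_factor):
    """Centering-corrector direction; M_factor is the Cholesky factorization
    of A S^{-1} X A^T already computed for the predictor direction."""
    w = sigma * mu / s - (dx_a * ds_a) / s   # sigma*mu*S^{-1}e - S^{-1} DX^a DS^a e
    dy_cc = -cho_solve(M_factor, A @ w)      # one extra back-substitution
    ds_cc = -A.T @ dy_cc
    dx_cc = w - (x / s) * ds_cc
    return dx_cc, dy_cc, ds_cc
--------------------------------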
4.3 The redMPC Algorithm

In this section, we present a reduced Mehrotra Predictor-Corrector algorithm, redMPC. The vectors $\widetilde{\Delta x}^a \in \mathbb{R}^n$ and $(\widetilde{\Delta y}^a, \widetilde{\Delta s}^a) \in \mathbb{R}^{m+n}$ define the affine-scaling direction. Similarly, the vectors $\widetilde{\Delta x}^{cc} \in \mathbb{R}^n$ and $(\widetilde{\Delta y}^{cc}, \widetilde{\Delta s}^{cc}) \in \mathbb{R}^{m+n}$ define the centering-corrector direction. We refer to $\widetilde{\Delta z} = (\widetilde{\Delta x}, \widetilde{\Delta y}, \widetilde{\Delta s}) = (\widetilde{\Delta x}^a + \widetilde{\Delta x}^{cc},\ \widetilde{\Delta y}^a + \widetilde{\Delta y}^{cc},\ \widetilde{\Delta s}^a + \widetilde{\Delta s}^{cc})$ as the approximate MPC direction. In accordance with the notation, the vectors $\widetilde{\Delta x}_Q \in \mathbb{R}^k$ and $\widetilde{\Delta s}_Q \in \mathbb{R}^k$ are composed of the components $\widetilde{\Delta x}_j$ and $\widetilde{\Delta s}_j$, respectively, with $j \in Q$. Let $k = |Q|$, and let $\delta Q$ represent the index set of dual constraints that are added to $Q$ when it is determined the algorithm is not making progress toward satisfying dual feasibility. To differentiate the scalar quantities $\alpha^p_a$, $\alpha^d_a$, $\alpha_p$, $\alpha_d$, and $\sigma$ from the general MPC algorithm, we use the subscript $Q$ to emphasize their computation within this algorithm.

--------------------------------
The redMPC Algorithm

Input: $(x, y, s)$ with $x > 0$ and $s > 0$; $u_{\mathrm{bnd}} \ge 3m$; $\gamma \in (0,1)$; tolerance $\epsilon > 0$.
while (the termination criterion is not satisfied)
  Select the most promising dual constraints (the set $Q$ is determined by the algorithm in Section 2.2.3: Selection of $Q$):
    $Q = Q \cup \delta Q$, $k = |Q|$.
  Compute the affine-scaling direction:
    $\widetilde{\Delta y}^a = (A_Q S_Q^{-1} X_Q A_Q^T)^{-1}\,[\,b - A_Q S_Q^{-1} X_Q (r_d)_Q\,]$,  (4.7)
    $\widetilde{\Delta s}^a_Q = -(r_d)_Q - A_Q^T\widetilde{\Delta y}^a$,  (4.8)
    $\widetilde{\Delta x}^a_Q = -x_Q - S_Q^{-1}X_Q\widetilde{\Delta s}^a_Q$.
  Compute the affine step:
    $\hat\alpha^p_{Q,a} = 1$ if $(\widetilde{\Delta x}^a_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta x}^a_Q)_j < 0}\,[-(x_Q)_j/(\widetilde{\Delta x}^a_Q)_j]$;
    $\hat\alpha^d_{Q,a} = 1$ if $(\widetilde{\Delta s}^a_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta s}^a_Q)_j < 0}\,[-(s_Q)_j/(\widetilde{\Delta s}^a_Q)_j]$;  (4.9)
    $\alpha^p_{Q,a} = \min(\hat\alpha^p_{Q,a}, 1)$, $\alpha^d_{Q,a} = \min(\hat\alpha^d_{Q,a}, 1)$.  (4.10)
  Compute the centering parameter:
    $\sigma_Q = \left[(x_Q + \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q)^T(s_Q + \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q)/(n\mu_Q)\right]^3$.  (4.11)
  Compute the centering-corrector direction:
    $w_Q = \sigma_Q\mu_Q S_Q^{-1}e_Q - S_Q^{-1}\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q e_Q$,
    $\widetilde{\Delta y}^{cc} = -(A_Q S_Q^{-1} X_Q A_Q^T)^{-1}A_Q w_Q$,
    $\widetilde{\Delta s}^{cc}_Q = -A_Q^T\widetilde{\Delta y}^{cc}$,  (4.12)
    $\widetilde{\Delta x}^{cc}_Q = w_Q - S_Q^{-1}X_Q\widetilde{\Delta s}^{cc}_Q$.
  Compute the predictor-corrector direction:
    $\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q + \widetilde{\Delta x}^{cc}_Q$,  (4.13)
    $\widetilde{\Delta y} = \widetilde{\Delta y}^a + \widetilde{\Delta y}^{cc}$,  (4.14)
    $\widetilde{\Delta s}_Q = \widetilde{\Delta s}^a_Q + \widetilde{\Delta s}^{cc}_Q$.  (4.15)
  Compute the predictor-corrector step:
    $\hat\alpha^p_Q = 1$ if $(\widetilde{\Delta x}_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta x}_Q)_j < 0}\,[-(x_Q)_j/(\widetilde{\Delta x}_Q)_j]$;
    $\hat\alpha^d_Q = 1$ if $(\widetilde{\Delta s}_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta s}_Q)_j < 0}\,[-(s_Q)_j/(\widetilde{\Delta s}_Q)_j]$;  (4.16)
    $\alpha^p_Q = \gamma\,\min(\hat\alpha^p_Q, 1)$, $\alpha^d_Q = \gamma\,\min(\hat\alpha^d_Q, 1)$.  (4.17)
  Create full-length vectors $\widetilde{\Delta x}$ and $\widetilde{\Delta s}$:
    $\widetilde{\Delta x}_j = (\widetilde{\Delta x}_Q)_j$ if $j \in Q$, and $0$ otherwise,  (4.18)
    $\widetilde{\Delta s}_j = (\widetilde{\Delta s}_Q)_j$ if $j \in Q$, and $0$ otherwise.  (4.19)
  Update the solution, the duality measure, and the dual residuals:
    $x^+ = x + \alpha^p_Q\widetilde{\Delta x}$,  (4.20)
    $y^+ = y + \alpha^d_Q\widetilde{\Delta y}$,  (4.21)
    $s^+ = s + \alpha^d_Q\widetilde{\Delta s}$.  (4.22)
    if $k < u_{\mathrm{bnd}}$ …
end (while)
--------------------------------
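Taken together, the redMPC pieces above amount to: reduced predictor, reduced corrector, sum, ratio test, scatter. A compact illustrative sketch of the assembly and step computation (4.13)–(4.19); the function names are ours:

--------------------------------
import numpy as np

def redmpc_assemble(dxQ_a, dy_a, dsQ_a, dxQ_cc, dy_cc, dsQ_cc, Q, n):
    """Sum predictor and corrector components (4.13)-(4.15) and embed the
    Q-components into full-length vectors (4.18)-(4.19)."""
    dx, ds = np.zeros(n), np.zeros(n)
    dx[Q] = dxQ_a + dxQ_cc
    ds[Q] = dsQ_a + dsQ_cc
    return dx, dy_a + dy_cc, ds

def redmpc_steps(xQ, sQ, dxQ, dsQ, gamma):
    """Ratio tests (4.16) and damped step lengths (4.17) on Q-components."""
    ap = min(1.0, np.min(-xQ[dxQ < 0] / dxQ[dxQ < 0])) if np.any(dxQ < 0) else 1.0
    ad = min(1.0, np.min(-sQ[dsQ < 0] / dsQ[dsQ < 0])) if np.any(dsQ < 0) else 1.0
    return gamma * ap, gamma * ad
--------------------------------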
4.4 Convergence

The approximate and standard directions satisfy
$\|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\|,$  (4.23)
$\|\widetilde{\Delta z}^{cc}_Q - \Delta z^{cc}_Q\| \le \kappa\,\xi(z),$  (4.24)
where $\beta, \kappa > 0$ and $\xi(z)$ is a term that tends to zero as we approach the optimal solution. Combining (4.23) and (4.24), we have the norm of the difference between the standard direction computed in MPC and that computed in redMPC:
$\|\widetilde{\Delta z}_Q - \Delta z_Q\| = \|(\widetilde{\Delta z}^a_Q + \widetilde{\Delta z}^{cc}_Q) - (\Delta z^a_Q + \Delta z^{cc}_Q)\| \le \|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| + \|\widetilde{\Delta z}^{cc}_Q - \Delta z^{cc}_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\| + \kappa\,\xi(z).$

Let $(x^i, y^i, s^i)$ be the solution at iteration $i$ of MPC and redMPC. Assumptions (A1)–(A6) from Chapter 3 will be needed throughout the analysis, along with the following assumptions:
(A7) $\|x_Q\| \le C_1$ and $\|s_Q\| \le C_2$, where $0 < C_1, C_2 < \infty$;
(A8) $\sigma_Q\,\mu \le \omega\,\mu\,\|\Delta z^a\|$ for some $\omega > 0$.

The redMPC algorithm is well-defined, since the iterates for the solution and the dual residual remain strictly positive, the matrix $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite, and $J_a(A_Q, x_Q, s_Q)$ is nonsingular. The proofs follow from Lemmas (3.1)–(3.4) in Chapter 3.

The lemmas presented next are required to show (4.24). The first two lemmas give bounds for terms involving the centering parameter, $\sigma$.

Lemma 4.1: Suppose assumption (A7) and (4.23) hold, and suppose $\mu = \mu_Q$. Then
$|(\sigma - \sigma_Q)\mu| \le 25\mu + V_3\,\|z - z^*\|\,\|\Delta z^a\| + \Lambda\cdot\mathcal{O}(\|\Delta z^a\|^2),$
where $\Lambda = \max\{\|z - z^*\|, \mu\}$ and $V_3 = \frac{1}{n}\left[25(V_1 + c_o V_2\|z - z^*\|) + 10(V_1 + 2V_2)\right]$, with $V_1 = (C_1 + C_2)(c_o + \beta) + 3 + \varphi_1\|\Delta z^a\|$ and $V_2 = (C_1 + C_2)\beta$ for some $c_o, \beta, \varphi_1 > 0$.

Proof: The quantity
$(\sigma - \sigma_Q)\mu = \left[(\mu_a/\mu)^3 - (\mu_a^Q/\mu_Q)^3\right]\mu = (1/\mu^2)\left[(\mu_a)^3 - (\mu_a^Q)^3\right] = (1/\mu^2)\,(\mu_a - \mu_a^Q)\left[(\mu_a)^2 + \mu_a\mu_a^Q + (\mu_a^Q)^2\right],$  (4.25)
since $\mu = \mu_Q$. We begin by bounding the term $\mu_a - \mu_a^Q$. We can express this term by:
$\mu_a - \mu_a^Q = \left[(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a) - (x + \alpha^p_{Q,a}\widetilde{\Delta x}^a)^T(s + \alpha^d_{Q,a}\widetilde{\Delta s}^a)\right]/n$
$= \left[\alpha^d_a x^T\Delta s^a + \alpha^p_a s^T\Delta x^a + \alpha^p_a\alpha^d_a(\Delta x^a)^T\Delta s^a - \alpha^d_{Q,a}x^T\widetilde{\Delta s}^a - \alpha^p_{Q,a}s^T\widetilde{\Delta x}^a - \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a)^T\widetilde{\Delta s}^a\right]/n.$
By partitioning the right-hand side of the last expression into $Q$ and $\bar Q$, we can eliminate the terms involving $\widetilde{\Delta x}^a_{\bar Q}$ and $\widetilde{\Delta s}^a_{\bar Q}$, since $\widetilde{\Delta x}^a_{\bar Q} = \widetilde{\Delta s}^a_{\bar Q} = 0$. This gives
$\mu_a - \mu_a^Q = \big[\alpha^d_a x_Q^T\Delta s^a_Q + \alpha^p_a s_Q^T\Delta x^a_Q + \alpha^p_a\alpha^d_a(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^d_a x_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_a s_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q} - \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q - \alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q - \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]/n.$
By adding in, subtracting out, and grouping certain terms, we have
$\mu_a - \mu_a^Q = \big[\alpha^d_a x_Q^T(\Delta s^a_Q - \widetilde{\Delta s}^a_Q) + (\alpha^d_a - \alpha^d_{Q,a})\,x_Q^T\widetilde{\Delta s}^a_Q + \alpha^p_a s_Q^T(\Delta x^a_Q - \widetilde{\Delta x}^a_Q) + (\alpha^p_a - \alpha^p_{Q,a})\,s_Q^T\widetilde{\Delta x}^a_Q + (\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a})(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big] + \alpha^d_a x_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_a s_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big]/n.$
Therefore,
$|\mu_a - \mu_a^Q| \le \frac{1}{n}\big[\,|\alpha^d_a|\,\|x_Q\|\,\|\Delta s^a_Q - \widetilde{\Delta s}^a_Q\| + |\alpha^d_a - \alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q\| + |\alpha^p_a|\,\|s_Q\|\,\|\Delta x^a_Q - \widetilde{\Delta x}^a_Q\| + |\alpha^p_a - \alpha^p_{Q,a}|\,\|s_Q\|\,\|\widetilde{\Delta x}^a_Q\| + |\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\| + \big|\alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]\big| + |\alpha^d_a|\,\|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| + |\alpha^p_a|\,|s_{\bar Q}^T\Delta x^a_{\bar Q}| + |\alpha^p_a\alpha^d_a|\,\big|(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big|\,\big].$
We can bound $|\mu_a - \mu_a^Q|$ by applying the result in (4.23) and proving that $|\alpha^p_a - \alpha^p_{Q,a}|$ and $|\alpha^d_a - \alpha^d_{Q,a}|$ are small close to the optimal solution. We will use the following results (see [13]). Assume $v, w \in \mathbb{R}^n$; then
$\Big|\max_k\Big(\frac{v}{w}\Big)_k\Big| \le \frac{\|v\|_1}{\|w\|_\infty},$  (4.26)
$\|v\|_1 \le \sqrt n\,\|v\|,$  (4.27)
$\frac{1}{\|w\|_\infty} \le \frac{\sqrt n}{\|w\|}.$  (4.28)
Let
$\bar\Theta = \max\left\{\frac{\|\Delta z^a_Q\|}{\|\Delta x^a_Q\|},\ \frac{\|\Delta z^a_Q\|}{\|\Delta s^a_Q\|}\right\}.$  (4.29)
We examine the components of the current solution for which $(\widetilde{\Delta x}^a_Q)_j < 0$ and $(\widetilde{\Delta s}^a_Q)_r < 0$, with $\alpha^p_{Q,a}, \alpha^d_{Q,a} \le 1$, and
$(x_Q)_j + \alpha^p_a(\Delta x^a_Q)_j = 0,\quad (s_Q)_r + \alpha^d_a(\Delta s^a_Q)_r = 0,$
$(x_Q)_j + \alpha^p_{Q,a}(\widetilde{\Delta x}^a_Q)_j = 0,\quad (s_Q)_r + \alpha^d_{Q,a}(\widetilde{\Delta s}^a_Q)_r = 0.$
Here, $\alpha^p_a = -(x_Q)_j/(\Delta x^a_Q)_j$, $\alpha^p_{Q,a} = -(x_Q)_j/(\widetilde{\Delta x}^a_Q)_j$, $\alpha^d_a = -(s_Q)_r/(\Delta s^a_Q)_r$, and $\alpha^d_{Q,a} = -(s_Q)_r/(\widetilde{\Delta s}^a_Q)_r$. The expression
$|\alpha^p_a - \alpha^p_{Q,a}| = \left|\frac{-(x_Q)_j}{(\Delta x^a_Q)_j} + \frac{(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j}\right| = \left|\frac{(x_Q)_j\big[(\widetilde{\Delta x}^a_Q)_j - (\Delta x^a_Q)_j\big]}{(\Delta x^a_Q)_j\,(\widetilde{\Delta x}^a_Q)_j}\right| \le \left|\frac{(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j}\right|\cdot\left|\frac{(\widetilde{\Delta x}^a_Q)_j - (\Delta x^a_Q)_j}{(\Delta x^a_Q)_j}\right|,\ \forall j,$
$\le |\alpha^p_{Q,a}|\cdot\left|\frac{(\widetilde{\Delta x}^a_Q - \Delta x^a_Q)_l}{(\Delta x^a_Q)_l}\right|,$
where $l$ is the index of the maximum ratio.
Continuing, we have
$|\alpha^p_a - \alpha^p_{Q,a}| \le |\alpha^p_{Q,a}|\cdot\frac{\|\widetilde{\Delta x}^a_Q - \Delta x^a_Q\|_1}{\|\Delta x^a_Q\|_\infty}$, by (4.26),
$\le \sqrt k\cdot\sqrt k\left(\frac{\|\widetilde{\Delta x}^a_Q - \Delta x^a_Q\|}{\|\Delta x^a_Q\|}\right)$, by (4.27), (4.28), and $|\alpha^p_{Q,a}| \le 1$,
$\le k\,\beta\,\|z - z^*\|\,\frac{\|\Delta z^a_Q\|}{\|\Delta x^a_Q\|}$, by (4.23),
$\le k\,\beta\,\|z - z^*\|\,\bar\Theta$, by (4.29),
$= c_o\,\|z - z^*\|,$  (4.30)
where $c_o = k\,\beta\,\bar\Theta$. Similarly,
$|\alpha^d_a - \alpha^d_{Q,a}| \le |\alpha^d_{Q,a}|\cdot\left|\frac{(\widetilde{\Delta s}^a_Q - \Delta s^a_Q)_p}{(\Delta s^a_Q)_p}\right|,$
where $p$ is the index of the maximum ratio, and the same chain of inequalities, with $s$ in place of $x$ and using (4.26)–(4.29) and (4.23), gives
$|\alpha^d_a - \alpha^d_{Q,a}| \le c_o\,\|z - z^*\|.$  (4.31)
Recall that the absolute value of the difference between $\mu_a$ and $\mu_a^Q$ is bounded by the nine-term inequality above. Using the results from (4.23), (4.30), and (4.31), we can bound the nine terms on its right-hand side:

1. $|\alpha^d_a|\,\|x_Q\|\,\|\Delta s^a_Q - \widetilde{\Delta s}^a_Q\| \le C_1\,\beta\,\|z - z^*\|\,\|\Delta z^a\|$, by assumption (A7), (4.23), and the fact that $|\alpha^d_a| \le 1$.

2. To bound the second term, $|\alpha^d_a - \alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q\|$, we must first bound the quantity $\|\widetilde{\Delta s}^a_Q\|$:
$\|\widetilde{\Delta s}^a_Q\| = \|\widetilde{\Delta s}^a_Q - \Delta s^a_Q + \Delta s^a_Q\| \le \|\widetilde{\Delta s}^a_Q - \Delta s^a_Q\| + \|\Delta s^a_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\| + \|\Delta z^a\| = (1 + \beta\,\|z - z^*\|)\,\|\Delta z^a\|,$  (4.32)
where the second inequality follows from (4.23). Therefore,
$|\alpha^d_a - \alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q\| \le c_o\,\|z - z^*\|\cdot C_1(1 + \beta\,\|z - z^*\|)\,\|\Delta z^a\| = c_o C_1(1 + \beta\,\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\|.$
This result follows from assumption (A7), (4.31), and (4.32).

3. $|\alpha^p_a|\,\|s_Q\|\,\|\Delta x^a_Q - \widetilde{\Delta x}^a_Q\| \le C_2\,\beta\,\|z - z^*\|\,\|\Delta z^a\|$ (similar to the bound on term 1).

4. $|\alpha^p_a - \alpha^p_{Q,a}|\,\|s_Q\|\,\|\widetilde{\Delta x}^a_Q\| \le c_o C_2(1 + \beta\,\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\|$ (similar to the bound on term 2).
5. $|\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
= |\alpha^p_a\alpha^d_a - \alpha^p_a\alpha^d_{Q,a} + \alpha^p_a\alpha^d_{Q,a} - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
= |\alpha^p_a(\alpha^d_a - \alpha^d_{Q,a}) + \alpha^d_{Q,a}(\alpha^p_a - \alpha^p_{Q,a})|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
\le (|\alpha^p_a|\,|\alpha^d_a - \alpha^d_{Q,a}| + |\alpha^d_{Q,a}|\,|\alpha^p_a - \alpha^p_{Q,a}|)\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
\le (c_o\|z - z^*\| + c_o\|z - z^*\|)\,\|\Delta z^a\|\,\|\Delta z^a\| = 2c_o\,\|z - z^*\|\,\|\Delta z^a\|^2.$  (4.33)
The fourth line follows from (4.30), (4.31), and the facts that $|\alpha^p_a| \le 1$ and $|\alpha^d_{Q,a}| \le 1$.

6. $\big|\alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]\big|
= \big|\alpha^p_{Q,a}\alpha^d_{Q,a}(\Delta x^a_Q)^T\Delta s^a_Q - \alpha^p_a\alpha^d_{Q,a}(\Delta x^a_Q)^T\widetilde{\Delta s}^a_Q + \alpha^p_a\alpha^d_{Q,a}(\Delta x^a_Q)^T\widetilde{\Delta s}^a_Q - \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big|
= \big|\alpha^d_{Q,a}(\Delta x^a_Q)^T(\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q) + \alpha^d_{Q,a}(\widetilde{\Delta s}^a_Q)^T(\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q)\big|
\le |\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\| + |\alpha^d_{Q,a}|\,\|\widetilde{\Delta s}^a_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\|.$
It suffices to bound $\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\|$ and $\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\|$. The bound for $\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\|$ is given by:
$\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\| = \|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\Delta s^a_Q + \alpha^p_a\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\|
\le \|(\alpha^p_{Q,a} - \alpha^p_a)\Delta s^a_Q\| + \|\alpha^p_a(\Delta s^a_Q - \widetilde{\Delta s}^a_Q)\|
\le |\alpha^p_{Q,a} - \alpha^p_a|\,\|\Delta s^a_Q\| + |\alpha^p_a|\,\|\Delta s^a_Q - \widetilde{\Delta s}^a_Q\|
\le c_o\,\|z - z^*\|\,\|\Delta z^a\| + \beta\,\|z - z^*\|\,\|\Delta z^a\| = (c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|.$  (4.34)
The fourth line follows from (4.30), (4.23), and the fact that $|\alpha^p_a| \le 1$. A similar analysis shows that
$\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\| \le (c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|.$  (4.35)
Therefore,
$\big|\alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]\big| \le |\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\| + |\alpha^d_{Q,a}|\,\|\widetilde{\Delta s}^a_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\|
\le \|\Delta z^a\|\,(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\| + (1 + \beta\,\|z - z^*\|)\,\|\Delta z^a\|\,(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|
= (2 + \beta\,\|z - z^*\|)(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|^2.$

7. $|\alpha^d_a|\,\|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| = |\alpha^d_a|\,\|x_{\bar Q} - x^*_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| \le \|z - z^*\|\,\|\Delta z^a\|,$  (4.36)
since $|\alpha^d_a| \le 1$ and $x^*_{\bar Q} = 0$.

8. Since $\Delta x^a_{\bar Q} = -x_{\bar Q} - S_{\bar Q}^{-1}X_{\bar Q}\Delta s^a_{\bar Q}$, we have
$|\alpha^p_a|\,|s_{\bar Q}^T\Delta x^a_{\bar Q}| = |\alpha^p_a|\,\big|s_{\bar Q}^T\big(-x_{\bar Q} - S_{\bar Q}^{-1}X_{\bar Q}\Delta s^a_{\bar Q}\big)\big| \le \big|{-x_{\bar Q}^T s_{\bar Q} - x_{\bar Q}^T\Delta s^a_{\bar Q}}\big| \le |x_{\bar Q}^T s_{\bar Q}| + |x_{\bar Q}^T\Delta s^a_{\bar Q}| \le n\mu + \|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| \le n\mu + \|z - z^*\|\,\|\Delta z^a\|,$  (4.37)
with $|\alpha^p_a| \le 1$.

9. $|\alpha^p_a\alpha^d_a|\,\big|(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big| = |\alpha^p_a\alpha^d_a|\,\big|\big(-x_{\bar Q} - S_{\bar Q}^{-1}X_{\bar Q}\Delta s^a_{\bar Q}\big)^T\Delta s^a_{\bar Q}\big| = |\alpha^p_a\alpha^d_a|\,\big|{-x_{\bar Q}^T\Delta s^a_{\bar Q}} - (\Delta s^a_{\bar Q})^T X_{\bar Q}S_{\bar Q}^{-1}\Delta s^a_{\bar Q}\big|
\le |x_{\bar Q}^T\Delta s^a_{\bar Q}| + \big|(\Delta s^a_{\bar Q})^T X_{\bar Q}S_{\bar Q}^{-1}\Delta s^a_{\bar Q}\big|
\le \|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| + \|\Delta s^a_{\bar Q}\|\,\|X_{\bar Q}\|\,\|S_{\bar Q}^{-1}\|\,\|\Delta s^a_{\bar Q}\|
\le \|z - z^*\|\,\|\Delta z^a\| + \|\Delta z^a\|\,\|z - z^*\|\,\varphi_1\,\|\Delta z^a\| = (1 + \varphi_1\,\|\Delta z^a\|)\,\|z - z^*\|\,\|\Delta z^a\|,$  (4.38)
where $\|S_{\bar Q}^{-1}\| \le \varphi_1$ follows from (3.24).
Combining the bounds in 1–9, we have
$|\mu_a - \mu_a^Q| \le \frac{1}{n}\big[\,C_1\beta\,\|z - z^*\|\,\|\Delta z^a\| + c_oC_1(1 + \beta\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\| + C_2\beta\,\|z - z^*\|\,\|\Delta z^a\| + c_oC_2(1 + \beta\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\| + 2c_o\,\|z - z^*\|\,\|\Delta z^a\|^2 + (2 + \beta\|z - z^*\|)(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|^2 + \|z - z^*\|\,\|\Delta z^a\| + n\mu + \|z - z^*\|\,\|\Delta z^a\| + (1 + \varphi_1\|\Delta z^a\|)\,\|z - z^*\|\,\|\Delta z^a\|\,\big].$
We can simplify the last expression by factoring out the quantity $\|z - z^*\|\,\|\Delta z^a\|$ from some terms and grouping other terms. This gives
$|\mu_a - \mu_a^Q| \le \mu + \frac{1}{n}\big[C_1\beta + c_oC_1(1 + \beta\|z - z^*\|) + C_2\beta + c_oC_2(1 + \beta\|z - z^*\|) + 2 + (1 + \varphi_1\|\Delta z^a\|)\big]\,\|z - z^*\|\,\|\Delta z^a\| + \frac{1}{n}\big[2c_o + (2 + \beta\|z - z^*\|)(c_o + \beta)\big]\,\|z - z^*\|\,\|\Delta z^a\|^2$
$= \mu + \frac{1}{n}\big[(C_1 + C_2)\beta + (C_1 + C_2)c_o(1 + \beta\|z - z^*\|) + 3 + \varphi_1\|\Delta z^a\|\big]\,\|z - z^*\|\,\|\Delta z^a\| + \frac{1}{n}\big[2c_o + (2 + \beta\|z - z^*\|)(c_o + \beta)\big]\,\|z - z^*\|\,\|\Delta z^a\|^2$
$= \mu + \frac{1}{n}\big[(C_1 + C_2)(c_o + \beta) + 3 + \varphi_1\|\Delta z^a\| + (C_1 + C_2)c_o\beta\,\|z - z^*\|\big]\,\|z - z^*\|\,\|\Delta z^a\| + \|z - z^*\|\cdot\mathcal{O}(\|\Delta z^a\|^2).$
Let
$V_1 = (C_1 + C_2)(c_o + \beta) + 3 + \varphi_1\,\|\Delta z^a\|$  (4.39)
and
$V_2 = (C_1 + C_2)\,\beta.$  (4.40)
Then
$|\mu_a - \mu_a^Q| \le \mu + \left(\frac{V_1 + c_oV_2\,\|z - z^*\|}{n}\right)\|z - z^*\|\,\|\Delta z^a\| + \|z - z^*\|\cdot\mathcal{O}(\|\Delta z^a\|^2).$  (4.41)
The absolute value of the remaining factor in (4.25) can be bounded by
$\left|(1/\mu^2)\big[(\mu_a)^2 + \mu_a\mu_a^Q + (\mu_a^Q)^2\big]\right| \le \left|\frac{\mu_a + \mu_a^Q}{\mu}\right|^2.$  (4.42)
In the analysis that follows, we concentrate on bounding the square root of the expression on the right of (4.42). We have
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| = \frac{1}{n\mu}\big[(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a) + (x + \alpha^p_{Q,a}\widetilde{\Delta x}^a)^T(s + \alpha^d_{Q,a}\widetilde{\Delta s}^a)\big]$
$= \frac{1}{n\mu}\big[2x^Ts + \alpha^d_ax^T\Delta s^a + \alpha^p_as^T\Delta x^a + \alpha^p_a\alpha^d_a(\Delta x^a)^T\Delta s^a + \alpha^d_{Q,a}x^T\widetilde{\Delta s}^a + \alpha^p_{Q,a}s^T\widetilde{\Delta x}^a + \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a)^T\widetilde{\Delta s}^a\big].$
By partitioning the right-hand side of the last expression into $Q$ and $\bar Q$, we can eliminate the terms involving $\widetilde{\Delta x}^a_{\bar Q}$ and $\widetilde{\Delta s}^a_{\bar Q}$, since $\widetilde{\Delta x}^a_{\bar Q} = \widetilde{\Delta s}^a_{\bar Q} = 0$. This gives
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| = 2 + \frac{1}{n\mu}\big[\alpha^d_ax_Q^T\Delta s^a_Q + \alpha^p_as_Q^T\Delta x^a_Q + \alpha^p_a\alpha^d_a(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^d_ax_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_as_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q} + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q + \alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big].$
The first term on the right-hand side of the last expression follows since $\mu = (x^Ts)/n$. By adding in, subtracting out, and grouping certain terms, we have
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| = 2 + \frac{1}{n\mu}\big[\,x_Q^T(\alpha^d_a\Delta s^a_Q - \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q) + s_Q^T(\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q) + 2(\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q) + (\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a})(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q + (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big] + \alpha^d_ax_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_as_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\,\big].$
Therefore,
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| \le 2 + \left|\frac{1}{n\mu}\right|\big[\,\|x_Q\|\,\|\alpha^d_a\Delta s^a_Q - \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q\| + \|s_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\| + 2|\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q| + |\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\| + |\alpha^p_{Q,a}\alpha^d_{Q,a}|\,\big|(\Delta x^a_Q)^T\Delta s^a_Q + (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big| + |\alpha^d_a|\,\|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| + |\alpha^p_a|\,|s_{\bar Q}^T\Delta x^a_{\bar Q}| + |\alpha^p_a\alpha^d_a|\,\big|(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big|\,\big].$
We can bound the eight terms on the right-hand side of the above inequality by using some of the previous results:

I. $\|x_Q\|\,\|\alpha^d_a\Delta s^a_Q - \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q\| \le C_1(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|$, by assumption (A7) and (4.34) with $\alpha^p_a$ and $\alpha^p_{Q,a}$ replaced by $\alpha^d_a$ and $\alpha^d_{Q,a}$, respectively.

II. $\|s_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\| \le C_2(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|$, by assumption (A7) and (4.35).

III. $2|\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q|
= 2|\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q - \alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q - \alpha^d_{Q,a}x_Q^T\Delta s^a_Q + \alpha^d_{Q,a}x_Q^T\Delta s^a_Q|
= 2|\alpha^p_{Q,a}s_Q^T(\widetilde{\Delta x}^a_Q - \Delta x^a_Q) + \alpha^d_{Q,a}x_Q^T(\widetilde{\Delta s}^a_Q - \Delta s^a_Q) + \alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^d_{Q,a}x_Q^T\Delta s^a_Q|
\le 2\big[|\alpha^p_{Q,a}|\,\|s_Q\|\,\|\widetilde{\Delta x}^a_Q - \Delta x^a_Q\| + |\alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q - \Delta s^a_Q\| + |\alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^d_{Q,a}x_Q^T\Delta s^a_Q|\big]
\le 2\big[C_2\beta\,\|z - z^*\|\,\|\Delta z^a\| + C_1\beta\,\|z - z^*\|\,\|\Delta z^a\| + |\max\{\alpha^p_{Q,a}, \alpha^d_{Q,a}\}|\,|s_Q^T\Delta x^a_Q + x_Q^T\Delta s^a_Q|\big]
\le 2\big[(C_1 + C_2)\beta\,\|z - z^*\|\,\|\Delta z^a\| + n\mu\big].$
The fourth line follows from (4.23). The fifth line follows since $|\max\{\alpha^p_{Q,a}, \alpha^d_{Q,a}\}| \le 1$ and $|s_Q^T\Delta x^a_Q + x_Q^T\Delta s^a_Q| = |-x_Q^Ts_Q| \le n\mu$ by (2.4).

IV. $|\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\| \le 2c_o\,\|z - z^*\|\,\|\Delta z^a\|^2$, by (4.33).

V. $|\alpha^p_{Q,a}\alpha^d_{Q,a}|\,\big|(\Delta x^a_Q)^T\Delta s^a_Q + (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big| \le |\alpha^p_{Q,a}|\,|\alpha^d_{Q,a}|\,\big(|(\Delta x^a_Q)^T\Delta s^a_Q| + |(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q|\big)
\le \|\Delta x^a_Q\|\,\|\Delta s^a_Q\| + \|\widetilde{\Delta x}^a_Q\|\,\|\widetilde{\Delta s}^a_Q\|
\le \|\Delta z^a\|^2 + (1 + \beta\,\|z - z^*\|)^2\,\|\Delta z^a\|^2 = \big[1 + (1 + \beta\,\|z - z^*\|)^2\big]\,\|\Delta z^a\|^2.$
The second line follows since $|\alpha^p_{Q,a}| \le 1$ and $|\alpha^d_{Q,a}| \le 1$.
The third line follows from (4.32), since the bound on $\big\|\widetilde{\Delta x}^a_Q\big\|$ is the same as the bound on $\big\|\widetilde{\Delta s}^a_Q\big\|$. The bounds for the remaining terms (VI-VIII) follow from (4.36)-(4.38). Combining the bounds I-VIII in the inequality above, we have
\[
\begin{aligned}
\left|\big(\mu^a+\mu^a_Q\big)/\mu\right| \le 2+\left|\frac{1}{n\mu}\right|\Big[&C_1(c_o+\omega)\|z-z^*\|\,\|\Delta z^a\|+C_2(c_o+\omega)\|z-z^*\|\,\|\Delta z^a\|\\
&+2\big[(C_1+C_2)\,\omega\|z-z^*\|\,\|\Delta z^a\|+n\mu\big]+2c_o\,\|z-z^*\|\,\|\Delta z^a\|^2\\
&+\big[1+(1+\omega\|z-z^*\|)^2\big]\|\Delta z^a\|^2+\|z-z^*\|\,\|\Delta z^a\|+n\mu\\
&+\|z-z^*\|\,\|\Delta z^a\|+(1+\omega_1\|\Delta z^a\|)\|z-z^*\|\,\|\Delta z^a\|\Big].
\end{aligned}
\]
By factoring out $\|z-z^*\|\,\|\Delta z^a\|$ from select terms, grouping other terms, and applying (4.39) and (4.40), we have
\[
\begin{aligned}
\left|\big(\mu^a+\mu^a_Q\big)/\mu\right| &\le 2+\left|\frac{1}{n\mu}\right|\Big(\big[2+(C_1+C_2)(c_o+\omega)+(1+\omega_1\|\Delta z^a\|)+2(C_1+C_2)\,\omega\big]\|z-z^*\|\,\|\Delta z^a\|\\
&\qquad+3n\mu+\big(2c_o\|z-z^*\|+\big[1+(1+\omega\|z-z^*\|)^2\big]\big)\|\Delta z^a\|^2\Big)\\
&= 2+\left|\frac{1}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+3+O\big(\|\Delta z^a\|^2\big)\\
&= 5+\left|\frac{1}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+O\big(\|\Delta z^a\|^2\big).
\end{aligned}
\]
Squaring the last result gives
\[
\left|\big(\mu^a+\mu^a_Q\big)/\mu\right|^2 \le 25+\left|\frac{10}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+O\big(\|\Delta z^a\|^2\big). \tag{4.43}
\]
Combining the bounds in (4.41) and (4.43), we have the final result. Since $\sigma\mu = (\mu^a)^3/\mu^2$ and $\sigma_Q\mu = (\mu^a_Q)^3/\mu^2$, the factorization $a^3-b^3 = (a-b)(a^2+ab+b^2)$, together with $a^2+ab+b^2\le(a+b)^2$ for $a,b\ge 0$, gives
\[
\begin{aligned}
\big|(\sigma-\sigma_Q)\mu\big| &= \big|\mu^a-\mu^a_Q\big|\,\left|\frac{(\mu^a)^2+\mu^a\mu^a_Q+(\mu^a_Q)^2}{\mu^2}\right| \le \big|\mu^a-\mu^a_Q\big|\,\left|\frac{\mu^a+\mu^a_Q}{\mu}\right|^2\\
&\le \left[\mu+\left(\frac{V_1+c_oV_2\|z-z^*\|}{n}\right)\|z-z^*\|\,\|\Delta z^a\|+\|z-z^*\|\,O\big(\|\Delta z^a\|^2\big)\right]\\
&\qquad\times\left[25+\left|\frac{10}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+O\big(\|\Delta z^a\|^2\big)\right]\\
&= 25\mu+\frac{25}{n}\big(V_1+c_oV_2\|z-z^*\|\big)\|z-z^*\|\,\|\Delta z^a\|+\frac{10}{n}(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|\\
&\qquad+\|z-z^*\|\,O\big(\|\Delta z^a\|^2\big)+\mu\,O\big(\|\Delta z^a\|^2\big)\\
&\le 25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\max\{\|z-z^*\|,\mu\}\,O\big(\|\Delta z^a\|^2\big)\\
&\le 25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big),
\end{aligned}
\tag{4.44}
\]
where $\psi = \max\{\|z-z^*\|,\mu\}$ and $V_3 = \frac{1}{n}\big[25\big(V_1+c_oV_2\|z-z^*\|\big)+10(V_1+2V_2)\big]$. $\square$

Lemma 4.2 : Suppose assumption (A8) and the assumptions of Lemma 4.1 hold. Then
\[
|\sigma\mu|\le 25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big).
\]
Proof:
\[
\begin{aligned}
|\sigma\mu| &= |\sigma\mu-\sigma_Q\mu+\sigma_Q\mu| \le \big|(\sigma-\sigma_Q)\mu\big|+|\sigma_Q\mu|\\
&\le 25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big)+\gamma\mu\|\Delta z^a\|\\
&\le 25\mu+(V_3+\gamma)\max\{\|z-z^*\|,\mu\}\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big)\\
&= 25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big). \qquad\square
\end{aligned}
\]

The next two lemmas give bounds on the error components $\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}$ and $\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q$.

Lemma 4.3 : Suppose (3.24) holds (i.e., $\big\|S^{-1}_{\bar Q}\big\|\le\omega_1$ where $\omega_1>0$). Then
\[
\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|\le(1+\omega_1\|\Delta z^a\|)\,\|z-z^*\|\,\|\Delta z^a\|.
\]
Proof: The $\Delta X^a_{\bar Q}$ component of the standard affine-scaling direction is given by $\Delta X^a_{\bar Q} = -X_{\bar Q}-S^{-1}_{\bar Q}X_{\bar Q}\Delta S^a_{\bar Q}$. Therefore, we have
\[
\begin{aligned}
\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\| &= \big\|\big(-X_{\bar Q}-S^{-1}_{\bar Q}X_{\bar Q}\Delta S^a_{\bar Q}\big)\Delta S^a_{\bar Q}e_{\bar Q}\big\|\\
&\le \big\|X_{\bar Q}\big\|\,\big\|\Delta s^a_{\bar Q}\big\|+\big\|S^{-1}_{\bar Q}\big\|\,\big\|X_{\bar Q}\big\|\,\big\|\Delta S^a_{\bar Q}\big\|\,\big\|\Delta s^a_{\bar Q}\big\|\\
&\le \|z-z^*\|\,\|\Delta z^a\|+\omega_1\|z-z^*\|\,\|\Delta z^a\|^2\\
&= (1+\omega_1\|\Delta z^a\|)\,\|z-z^*\|\,\|\Delta z^a\|.
\end{aligned}
\]
The third line follows since $\big\|X_{\bar Q}\big\| = \big\|X_{\bar Q}-X^*_{\bar Q}\big\|\le\|z-z^*\|$ and $\big\|S^{-1}_{\bar Q}\big\|\le\omega_1$. $\square$
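The identities used in this proof are easy to check numerically. The following MATLAB sketch, with small hypothetical random data and in the feasible case $r_p = r_d = 0$, computes the affine-scaling direction from the normal equations and verifies that $A\Delta x^a = 0$ and $S\Delta x^a+X\Delta s^a = -XSe$, from which the componentwise form $\Delta x^a = -x-S^{-1}X\Delta s^a$ follows.

    % Numerical check of the affine-scaling identities above (a sketch;
    % m, n, and the data are hypothetical, not from the thesis test sets).
    m = 5; n = 40; rng(1);
    A = randn(m, n);
    x = rand(n, 1) + 0.5;                      % current iterate, x > 0
    s = rand(n, 1) + 0.5;                      % current slack, s > 0
    d2 = x ./ s;                               % diagonal of S^{-1}X
    N = A * spdiags(d2, 0, n, n) * A';         % A S^{-1} X A^T
    dy = N \ (A * x);                          % Delta y^a
    ds = -A' * dy;                             % Delta s^a
    dx = -x - d2 .* ds;                        % Delta x^a = -x - S^{-1}X Delta s^a
    disp(norm(A * dx));                        % ~1e-14: Delta x^a lies in null(A)
    disp(norm(s .* dx + x .* ds + x .* s));    % ~1e-14: S dx + X ds = -XSe

Both residuals vanish to rounding error; the expression for $\Delta X^a_{\bar Q}$ used above is just the $\bar Q$ block of this identity.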
Lemma 4.4 : Suppose (4.23) and (4.32) hold. Then
\[
\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|\le(2+\omega\|z-z^*\|)\,\omega\|z-z^*\|\,\|\Delta z^a\|^2.
\]
Proof: The analysis to bound $\big\|\widetilde{\Delta x}^a_Q\big\|$ is similar to the analysis for the bound on $\big\|\widetilde{\Delta s}^a_Q\big\|$, which is shown in (4.32). Therefore,
\[
\begin{aligned}
\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\| &= \big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\Delta S^a_Q+\widetilde{\Delta X}^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|\\
&= \big\|\big(\Delta X^a_Q-\widetilde{\Delta X}^a_Q\big)\Delta s^a_Q+\widetilde{\Delta X}^a_Q\big(\Delta s^a_Q-\widetilde{\Delta s}^a_Q\big)\big\|\\
&\le \big\|\Delta X^a_Q-\widetilde{\Delta X}^a_Q\big\|\,\big\|\Delta s^a_Q\big\|+\big\|\widetilde{\Delta X}^a_Q\big\|\,\big\|\Delta s^a_Q-\widetilde{\Delta s}^a_Q\big\|\\
&\le \omega\|z-z^*\|\,\|\Delta z^a\|^2+(1+\omega\|z-z^*\|)\,\|\Delta z^a\|\,\omega\|z-z^*\|\,\|\Delta z^a\|\\
&= (2+\omega\|z-z^*\|)\,\omega\|z-z^*\|\,\|\Delta z^a\|^2,
\end{aligned}
\]
where the fourth line follows from (4.23) and (4.32). $\square$

Before we state the theorem to show (4.24), we present a lemma to show that the standard centering-corrector direction is bounded by the standard affine-scaling direction.

Lemma 4.5 : Suppose (3.24), Assumptions (A7)-(A8), and the assumptions of Lemma 4.2 hold. Let $\big\|(A_QS_Q^{-1}X_QA_Q^T)^{-1}\big\|\le C_3$ where $0<C_3<\infty$. Then there exists $\gamma_3>0$ such that $\big\|\Delta z^{cc}_Q\big\|\le\gamma_3\max\{\|\Delta z^a\|,\mu\}$.

Theorem : Suppose (3.24), Assumptions (A7)-(A8), and the assumptions of Lemma 4.2 hold, and let $\Omega\subseteq\{z = (x,y,s) : x>0,\ s>0\}$. Suppose $J_a(A_Q,x_Q,s_Q)$ is nonsingular for all $Q$. Let $\Delta z^a$ be the Newton direction at $z$. Then there exists $\bar\omega>0$ such that $\big\|\widetilde{\Delta z}^{cc}_Q-\Delta z^{cc}_Q\big\|\le\bar\omega\,\varphi(z)$ for all $z\in\Omega$, with $\varphi(z)\to 0$ as the optimal solution is approached.

Proof: Let $z = (x,y,s)$ where $x>0$ and $s>0$. Then $\widetilde{\Delta y}^{cc}$ and $\widetilde{\Delta x}^{cc}_Q$ are given by
\[
\begin{bmatrix} 0 & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix} = \begin{bmatrix} 0 \\ \widetilde\Psi_Q \end{bmatrix}, \tag{4.45}
\]
where $\widetilde\Psi_Q = \sigma_Q\mu e_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Qe_Q$. The components of the centering-corrector direction, $\Delta y^{cc}$ and $\Delta x^{cc}$, satisfy
\[
\begin{bmatrix} 0 & A \\ -XA^T & S \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc} \end{bmatrix} = \begin{bmatrix} 0 \\ \Psi \end{bmatrix},
\]
where $\Psi = \sigma\mu e-\Delta X^a\Delta S^ae$. Partitioning the previous expression in terms of $Q$ and $\bar Q$ gives
\[
\begin{bmatrix} 0 & A_Q & A_{\bar Q} \\ -X_QA_Q^T & S_Q & 0 \\ -X_{\bar Q}A_{\bar Q}^T & 0 & S_{\bar Q} \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \\ \Delta x^{cc}_{\bar Q} \end{bmatrix} = \begin{bmatrix} 0 \\ \Psi_Q \\ \Psi_{\bar Q} \end{bmatrix}.
\]
Premultiplying the third row by $-A_{\bar Q}S_{\bar Q}^{-1}$ and adding to the first row, we can eliminate $\Delta x^{cc}_{\bar Q}$ to obtain the following system of equations,
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix} = \begin{bmatrix} -A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \Psi_Q \end{bmatrix}.
\]
Adding appropriate terms to both sides of the last expression, we have
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix} = \begin{bmatrix} 0 \\ \widetilde\Psi_Q \end{bmatrix}. \tag{4.46}
\]
The right-hand side of (4.46) is precisely the right-hand side of (4.45). Let
\[
J_a(A_Q,x_Q,s_Q) = \begin{bmatrix} 0 & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}.
\]
Equating the left-hand sides of (4.45) and (4.46) gives
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix} = J_a(A_Q,x_Q,s_Q)\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix}.
\]
Notice that
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix} = J_a(A_Q,x_Q,s_Q)\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}.
\]
Now solving for $\widetilde{\Delta z}^{cc}_Q-\Delta z^{cc}_Q$, we have
\[
\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix}-\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix} = J_a(A_Q,x_Q,s_Q)^{-1}\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+J_a(A_Q,x_Q,s_Q)^{-1}\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}.
\]
Taking norms of both sides, we have
\[
\left\|\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix}-\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\| \le \big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\left\|\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\| \tag{4.47}
\]
\[
\qquad+\big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|. \tag{4.48}
\]
The bound on (4.47) is given by
\[
\big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\left\|\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\| \le \gamma_2\|z-z^*\|\,\big\|\Delta z^{cc}_Q\big\| \le \gamma_2\|z-z^*\|\,\gamma_3\max\{\|\Delta z^a\|,\mu\}. \tag{4.49}
\]
The first inequality follows from (3.25) and the last inequality follows from Lemma 4.5. We will now show a bound for the norm of the second vector in (4.48). Let $M = A_{\bar Q}S_{\bar Q}^{-1}$; then
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 = \left\|\begin{bmatrix} M\big(\sigma\mu e_{\bar Q}-\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big) \\ (\sigma_Q-\sigma)\mu e_Q-\big(\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q-\Delta X^a_Q\Delta S^a_Q\big)e_Q \end{bmatrix}\right\|^2.
\]
Expanding the right-hand side of this norm, we have
\[
\begin{aligned}
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 &\le \big\|M\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|^2+(\sigma\mu)^2\big\|Me_{\bar Q}\big\|^2+\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|^2\\
&\quad+\big[(\sigma-\sigma_Q)\mu\big]^2\|e_Q\|^2+2|\sigma\mu|\,\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|\,\big\|Me_{\bar Q}\big\|\\
&\quad+2\big|(\sigma-\sigma_Q)\mu\big|\,\|e_Q\|\,\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|\\
&\le \|M\|^2\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|^2+(n-k)\|M\|^2(\sigma\mu)^2+\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|^2\\
&\quad+k\big[(\sigma-\sigma_Q)\mu\big]^2+2\sqrt{n-k}\,\|M\|\,|\sigma\mu|\,\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|\\
&\quad+2\sqrt{k}\,\big|(\sigma-\sigma_Q)\mu\big|\,\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|.
\end{aligned}
\]
The six terms on the right-hand side of the last inequality can be bounded by previous results within this section. Let $\psi = \max\{\|z-z^*\|,\mu\}$. The bounds are as follows:

i.
\[
\|M\|^2\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|^2 \le \|M\|^2(1+\omega_1\|\Delta z^a\|)^2\|z-z^*\|^2\|\Delta z^a\|^2 \le \psi^2\,O\big(\|\Delta z^a\|^2\big).
\]
The first inequality follows from Lemma 4.3.

ii.
\[
\begin{aligned}
(n-k)\|M\|^2(\sigma\mu)^2 &\le (n-k)\|M\|^2\big[25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]^2\\
&= (n-k)\|M\|^2\big[625\mu^2+(V_3+\gamma)^2\psi^2\|\Delta z^a\|^2+50\mu(V_3+\gamma)\,\psi\,\|\Delta z^a\|+50\mu\psi\,O(\|\Delta z^a\|^2)+\cdots\big]\\
&\le (n-k)\|M\|^2\big[625\psi^2+\psi^2\,O(\|\Delta z^a\|)+\psi^2\,O(\|\Delta z^a\|^2)\big]\\
&\le \psi^2\big[625(n-k)\|M\|^2+O(\|\Delta z^a\|)\big].
\end{aligned}
\]
The bound on $\sigma\mu$ follows from Lemma 4.2.

iii.
\[
\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|^2 \le (2+\omega\|z-z^*\|)^2\omega^2\|z-z^*\|^2\|\Delta z^a\|^4 \le \psi^2\,O\big(\|\Delta z^a\|^4\big).
\]
The first inequality follows from Lemma 4.4.

iv.
\[
\begin{aligned}
k\big[(\sigma-\sigma_Q)\mu\big]^2 &\le k\big[25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]^2\\
&\le k\big[625\mu^2+V_3^2\psi^2\|\Delta z^a\|^2+\psi^2\,O(\|\Delta z^a\|^4)+50\mu V_3\psi\|\Delta z^a\|+50\mu\psi\,O(\|\Delta z^a\|^2)+2V_3\psi^2\,O(\|\Delta z^a\|^3)\big]\\
&\le k\big[625\mu^2+\psi^2\,O(\|\Delta z^a\|)+\psi^2\,O(\|\Delta z^a\|^2)\big]\\
&\le \psi^2\big[625k+O(\|\Delta z^a\|)\big].
\end{aligned}
\]
The first inequality follows from Lemma 4.1.

v.
\[
\begin{aligned}
2\sqrt{n-k}\,\|M\|\,|\sigma\mu|\,\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\| &\le 2\sqrt{n-k}\,\|M\|\big[25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]\\
&\qquad\times\big[(1+\omega_1\|\Delta z^a\|)\,\|z-z^*\|\,\|\Delta z^a\|\big]\\
&\le \psi^2\,O\big(\|\Delta z^a\|\big).
\end{aligned}
\]
This follows from Lemmas 4.2 and 4.3.
vi.
\[
\begin{aligned}
2\sqrt{k}\,\big|(\sigma-\sigma_Q)\mu\big|\,\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\| &\le 2\sqrt{k}\big[25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]\\
&\qquad\times\big[(2+\omega\|z-z^*\|)\,\omega\|z-z^*\|\,\|\Delta z^a\|^2\big]\\
&\le \psi^2\,O(\|\Delta z^a\|^2)+\psi^2\,O(\|\Delta z^a\|^3)+\psi^2\,O(\|\Delta z^a\|^4)\\
&\le \psi^2\,O(\|\Delta z^a\|^2).
\end{aligned}
\]
This follows from Lemmas 4.1 and 4.4.

Combining i-vi, we have
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 \le \psi^2\,O(\|\Delta z^a\|^2)+\psi^2\big[625(n-k)\|M\|^2+O(\|\Delta z^a\|)\big]+\psi^2\,O(\|\Delta z^a\|^4)+\psi^2\big[625k+O(\|\Delta z^a\|)\big]+\psi^2\,O(\|\Delta z^a\|)+\psi^2\,O(\|\Delta z^a\|^2).
\]
Factoring out $\psi^2$ and combining like terms gives
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 \le \psi^2\big[625\big((n-k)\|M\|^2+k\big)+O(\|\Delta z^a\|)+O(\|\Delta z^a\|^2)+O(\|\Delta z^a\|^4)\big] \le \psi^2\big[625\big((n-k)\|M\|^2+k\big)+O(\|\Delta z^a\|)\big].
\]
Taking the square root of both sides, we have
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\| \le \psi\sqrt{\omega_3+O(\|\Delta z^a\|)}, \tag{4.50}
\]
where $\omega_3 = 625\big[(n-k)\|M\|^2+k\big]$.

Combining (4.47), (4.48), (4.49), and (4.50), we have
\[
\begin{aligned}
\big\|\widetilde{\Delta z}^{cc}_Q-\Delta z^{cc}_Q\big\| &\le \big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\left\|\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\|+\big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|\\
&\le \gamma_2\|z-z^*\|\,\gamma_3\max\{\|\Delta z^a\|,\mu\}+\psi\,\omega_0\sqrt{\omega_3+O(\|\Delta z^a\|)}\\
&\le \psi\,O\big(\max\{\|\Delta z^a\|,\mu\}\big)+\psi\,\omega_0\sqrt{\omega_3+O(\|\Delta z^a\|)}\\
&\le \bar\omega\,\psi,
\end{aligned}
\]
where
\[
\bar\omega = \omega_0\max\Big\{O\big(\max\{\|\Delta z^a\|,\mu\}\big),\ \sqrt{\omega_3+O(\|\Delta z^a\|)}\Big\} = \omega_0\sqrt{\omega_3+O(\|\Delta z^a\|)}>0,
\]
since $O(\max\{\|\Delta z^a\|,\mu\})\to 0$ and $\omega_3>0$. Furthermore, let $\varphi(z) = \psi$. Then $\varphi(z) = \max\{\|z-z^*\|,\mu\}\to 0$ as the optimal solution is approached. $\square$
Global and local quadratic convergence follow from Winternitz et al. [28] with five specific changes to the redMPC algorithm. The differences between the redMPC algorithm and its modified version are summarized in Table 4.1.

The first modification is maintaining dual feasibility at each iteration. This implies $s^i = c-A^Ty^i>0$ and $r^i_d = 0$ for all $i$. Consequently, the algorithm for updating $(x^+,s^+)$ and $r_d$ has no effect on this dual-feasible algorithm and can be discarded. Furthermore, equations (4.8), (4.12), (4.9), and (4.16) must be computed with the full data to satisfy the above requirement on the slack variables. As a result, equation (4.19) must be eliminated from the algorithm. The second difference is the formula for the centering parameter, shown in Table 4.1.

Table 4.1: Differences between the redMPC algorithm and its modified version.

Feasibility.
  redMPC: accepts a dual-infeasible initial point and strives to achieve dual feasibility.
  Modified redMPC: requires a dual-feasible point at each iteration.

Terms computed with Q.
  redMPC: $A_QD_Q^2A_Q^T$, $\widetilde{\Delta x}^a_Q$, $\widetilde{\Delta s}^a_Q$, $\widetilde{\Delta x}^{cc}_Q$, $\widetilde{\Delta s}^{cc}_Q$, $x_Q$, $s_Q$, $(r_d)_Q$.
  Modified redMPC: $A_QD_Q^2A_Q^T$, $\widetilde{\Delta x}^a_Q$, $x_Q$.

Centering parameter.
  redMPC: $\sigma_Q = \big(\mu^a_Q/\mu_Q\big)^3$, where $\mu^a_Q = (x_Q+\bar\alpha^{p,a}_Q\widetilde{\Delta x}^a_Q)^T(s_Q+\bar\alpha^{d,a}_Q\widetilde{\Delta s}^a_Q)/n$.
  Modified redMPC: $\sigma_Q = (1-\alpha_{Q,a})^3$.

MPC direction.
  redMPC: $\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q+\widetilde{\Delta x}^{cc}_Q$, $\widetilde{\Delta y} = \widetilde{\Delta y}^a+\widetilde{\Delta y}^{cc}$, $\widetilde{\Delta s}_Q = \widetilde{\Delta s}^a_Q+\widetilde{\Delta s}^{cc}_Q$.
  Modified redMPC: $\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q+\gamma\widetilde{\Delta x}^{cc}_Q$, $\widetilde{\Delta y} = \widetilde{\Delta y}^a+\gamma\widetilde{\Delta y}^{cc}$, $\widetilde{\Delta s} = \widetilde{\Delta s}^a+\gamma\widetilde{\Delta s}^{cc}$, with $0<\gamma\le 1$.

The third modification is the mixing parameter $\gamma\in(0,1]$, chosen so that (1) the dual objective increases at each iteration ($b^T\widetilde{\Delta y}>0$ implies $b^Ty^+>b^Ty$), (2) if $\widetilde{\Delta y}^a$ is "small" then $\widetilde{\Delta y}$ and $\sigma_Q\mu_Q$ are also "small", and (3) $\bar\alpha^d_Q\ge\theta\bar\alpha^{d,a}_Q$, where $\theta\in(0,1)$. See Winternitz et al. [28] for details.

The fourth modification involves incorporating lower bounds into the predictor-corrector step. The equations in (4.17) can be replaced with
\[
\alpha^p_Q = \max\big\{\theta\bar\alpha^{p,a}_Q,\ \bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|\big\} \tag{4.51}
\]
and
\[
\alpha^d_Q = \max\big\{\theta\bar\alpha^{d,a}_Q,\ \bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|\big\}, \tag{4.52}
\]
respectively, where the primal ($\bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|$) and dual ($\bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|$) lower bounds allow local quadratic convergence. The fifth change to the redMPC algorithm is in the update scheme for the primal variables. The $\widetilde{\Delta x}^a_Q$ and $\widetilde{\Delta x}^{cc}_Q$ components of the normal equations remain the same, while the update in (4.20)-(4.22) is replaced by
\[
\big(\hat x_Q,\,y^+,\,s^+\big) = \big(x_Q+\alpha^p_Q\widetilde{\Delta x}_Q,\ y+\alpha^d_Q\widetilde{\Delta y},\ s+\alpha^d_Q\widetilde{\Delta s}\big),
\]
where $\alpha^p_Q$ and $\alpha^d_Q$ are given in (4.51) and (4.52), respectively. Let $x>0$, $x_j\le x_{\max}$ for all $j\in N$, and $k = |Q|$. The primal update, $x^+$, is defined as follows (a MATLAB rendering is sketched after the definitions below): for all $j\in Q$,
\[
x^+_j = \min\big\{\max\big(\min\big\{\|\widetilde{\Delta y}^a\|^\lambda+\|\hat{\Delta x}^a_-\|^\nu,\ x_j\big\},\ \hat x_j\big),\ x_{\max}\big\}, \tag{4.53}
\]
where
\[
\hat{\Delta x}^a_j = \begin{cases} x_j+\widetilde{\Delta x}^a_j, & j\in Q,\\ 0, & j\in\bar Q,\end{cases} \qquad \big(\hat{\Delta x}^a_-\big)_j = \min\big\{\hat{\Delta x}^a_j,\,0\big\}.
\]
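In MATLAB, the safeguarded update (4.53) is a short chain of elementwise min/max operations. In the sketch below, the names dya, dxaQ, xQ, xhatQ, lam, nu, and xmax are assumed to hold $\widetilde{\Delta y}^a$, $\widetilde{\Delta x}^a_Q$, $x_Q$, $\hat x_Q$, $\lambda$, $\nu$, and $x_{\max}$; they are illustrative, not names from the experimental code.

    % Safeguarded primal update (4.53) for the components j in Q -- a sketch.
    dxhat = xQ + dxaQ;                             % hat-Delta-x^a on Q (zero on Qbar)
    lb = norm(dya)^lam + norm(min(dxhat, 0))^nu;   % scalar lower-bound term
    xplusQ = min(max(min(lb, xQ), xhatQ), xmax);   % elementwise min/max over Q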
Using (4.53), we can express $\mu^+_Q$ as
\[
\mu^+_Q = \frac{\big(x^+_Q\big)^Ts^+_Q}{k}.
\]
Thus, for all $j\in\bar Q$, we have
\[
\hat x_j = \frac{\mu^+_Q}{s^+_j}, \qquad x^+_j = \min\{\hat x_j,\,x_{\max}\}.
\]
The dual-feasible redMPC algorithm is presented below:

------------------------------------------------------------
The Dual-Feasible redMPC Algorithm

Input: $(x,y,s)$ with $x>0$ and $s = c-A^Ty>0$; $x_{\max}>0$ such that $x_j\le x_{\max}$ for all $j\in N$; $\lambda\ge 2$, $\nu\ge 2$, $\mathrm{ubnd}\ge 3m$, and $0<\theta<1$.

Select the most promising dual constraints (the set $Q$ is determined by the algorithm in Section 2.2.2, Selection of Q); set $k = |Q|$ and $\mu_Q = x_Q^Ts_Q/k$.

Compute the affine-scaling direction:
\[
\widetilde{\Delta y}^a = \big(A_QS_Q^{-1}X_QA_Q^T\big)^{-1}b,\qquad \widetilde{\Delta s}^a = -A^T\widetilde{\Delta y}^a,\qquad \widetilde{\Delta x}^a_Q = -x_Q-S_Q^{-1}X_Q\widetilde{\Delta s}^a_Q.
\]

Compute the affine step:
\[
\bar\alpha^{p,a}_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta x}^a_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta x}^a_Q)_j<0}\big[-(x_Q)_j/\big(\widetilde{\Delta x}^a_Q\big)_j\big], & \text{otherwise},\end{cases}
\qquad
\bar\alpha^{d,a}_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta s}^a_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta s}^a_Q)_j<0}\big[-(s_Q)_j/\big(\widetilde{\Delta s}^a_Q\big)_j\big], & \text{otherwise},\end{cases}
\]
\[
\alpha_{Q,a} = \min\big\{\bar\alpha^{p,a}_Q,\ \bar\alpha^{d,a}_Q\big\}.
\]

Compute the centering parameter: $\sigma_Q = (1-\alpha_{Q,a})^3$.

Compute the centering-corrector direction:
\[
\Psi_Q = -S_Q^{-1}\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Qe_Q+\sigma_Q\mu_QS_Q^{-1}e_Q,
\]
\[
\widetilde{\Delta y}^{cc} = -\big(A_QS_Q^{-1}X_QA_Q^T\big)^{-1}A_Q\Psi_Q,\qquad \widetilde{\Delta s}^{cc} = -A^T\widetilde{\Delta y}^{cc},\qquad \widetilde{\Delta x}^{cc}_Q = \Psi_Q-S_Q^{-1}X_Q\widetilde{\Delta s}^{cc}_Q.
\]

Compute the predictor-corrector direction:
\[
\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q+\gamma\widetilde{\Delta x}^{cc}_Q,\qquad \widetilde{\Delta y} = \widetilde{\Delta y}^a+\gamma\widetilde{\Delta y}^{cc},\qquad \widetilde{\Delta s} = \widetilde{\Delta s}^a+\gamma\widetilde{\Delta s}^{cc},
\]
where $\gamma\in(0,1]$ ensures that (1)-(3) hold.

Compute the predictor-corrector step:
\[
\bar\alpha^p_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta x}_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta x}_Q)_j<0}\big[-(x_Q)_j/\big(\widetilde{\Delta x}_Q\big)_j\big], & \text{otherwise},\end{cases}
\qquad
\bar\alpha^d_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta s}_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta s}_Q)_j<0}\big[-(s_Q)_j/\big(\widetilde{\Delta s}_Q\big)_j\big], & \text{otherwise},\end{cases}
\]
\[
\alpha^p_Q = \max\big\{\theta\bar\alpha^{p,a}_Q,\ \bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|\big\},\qquad \alpha^d_Q = \max\big\{\theta\bar\alpha^{d,a}_Q,\ \bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|\big\}.
\]

Update the variables:
\[
\hat x_Q = x_Q+\alpha^p_Q\widetilde{\Delta x}_Q,\qquad y^+ = y+\alpha^d_Q\widetilde{\Delta y},\qquad s^+ = s+\alpha^d_Q\widetilde{\Delta s}.
\]
For all $j\in Q$,
\[
x^+_j = \min\big\{\max\big(\min\big\{\|\widetilde{\Delta y}^a\|^\lambda+\|\hat{\Delta x}^a_-\|^\nu,\ x_j\big\},\ \hat x_j\big),\ x_{\max}\big\},
\]
where $\hat{\Delta x}^a_j = x_j+\widetilde{\Delta x}^a_j$ for $j\in Q$ and $0$ for $j\in\bar Q$, and $(\hat{\Delta x}^a_-)_j = \min\{\hat{\Delta x}^a_j,\,0\}$. Set $\mu^+_Q = (x^+_Q)^Ts^+_Q/k$. For all $j\in\bar Q$, $\hat x_j = \mu^+_Q/s^+_j$ and $x^+_j = \min\{\hat x_j,\,x_{\max}\}$.
------------------------------------------------------------

The key to the global convergence analysis is the availability of a dual-feasible point at every iteration, the definition of the mixing parameter $\gamma$, and the lower-bound condition on the primal updates. A local quadratic convergence analysis follows provided the above conditions hold and the bounds $\bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|$ and $\bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|$ are imposed on the predictor-corrector step in equation (4.17). We provided a brief outline of the convergence proof for the modified redPDAS algorithm in Chapter 3. The proof for the modified redMPC algorithm follows a similar format. We refer the reader to Winternitz et al. [28] for specific details.

4.5 Summary

In this chapter, we introduced the redMPC algorithm and stated how specific changes to it yield a modified algorithm with global and local q-quadratic convergence results. With further research, we hope to prove similar results for the redMPC algorithm itself. In the next chapter, we test the performance of the redPDAS and redMPC algorithms against their counterparts without constraint reduction.

Chapter 5

Numerical Experiments

5.1 Overview

Algorithms redPDAS and redMPC were implemented in MATLAB (v.6.5, R13) and run on a laptop with an Intel(R) Pentium(R) M 1.60GHz processor and 512 MB of RAM.
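As a rough illustration of where the work goes, the following MATLAB sketch carries out the core linear algebra of one iteration of the dual-feasible redMPC algorithm above: forming the reduced normal matrix $A_QS_Q^{-1}X_QA_Q^T$ (the $\Theta(m^2k)$ step), the affine-scaling and centering-corrector directions, and the ratio-test step lengths. It is a simplified sketch, not the experimental code; the working set Q and the mixing parameter (gam below) are taken as given, and a Cholesky factorization is assumed to exist.

    % Core computations of one dual-feasible redMPC iteration -- a sketch.
    AQ = A(:, Q); xQ = x(Q); sQ = s(Q); k = numel(Q);
    d2 = xQ ./ sQ;                                 % diagonal of S_Q^{-1} X_Q
    NQ = AQ * spdiags(d2, 0, k, k) * AQ';          % A_Q S_Q^{-1} X_Q A_Q^T (m x m)
    R = chol(NQ); solveN = @(v) R \ (R' \ v);      % one factorization per iteration
    % Affine-scaling direction.
    dya = solveN(b);
    dsa = -A' * dya;                               % full-length dual direction
    dxaQ = -xQ - d2 .* dsa(Q);
    % Ratio test: largest step in (0,1] keeping the iterate positive.
    step = @(v, d) min([1; -v(d < 0) ./ d(d < 0)]);
    apa = step(xQ, dxaQ); ada = step(sQ, dsa(Q));
    sigmaQ = (1 - min(apa, ada))^3;                % centering parameter
    muQ = (xQ' * sQ) / k;
    % Centering-corrector direction.
    PsiQ = (sigmaQ * muQ - dxaQ .* dsa(Q)) ./ sQ;
    dycc = -solveN(AQ * PsiQ);
    dscc = -A' * dycc;
    dxccQ = PsiQ - d2 .* dscc(Q);
    % Predictor-corrector direction with mixing parameter gam in (0,1].
    dxQ = dxaQ + gam * dxccQ;
    dy  = dya + gam * dycc;
    ds  = dsa + gam * dscc;

Only the k selected columns enter the normal matrix; everything else is vector arithmetic, which is why the formation cost drops from $\Theta(m^2n)$ to $\Theta(m^2k)$.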
Each algorithm was tested against the general version of the algorithm (see Sections 3.2 and 4.2) with reduced normal equations,
\[
\Delta y^a = \big(A_QS_Q^{-1}X_QA_Q^T\big)^{-1}\big[b-AS^{-1}Xr_d\big],\qquad \Delta s^a = -r_d-A^T\Delta y^a,\qquad \Delta x^a = -x-S^{-1}X\Delta s^a.
\]
For simplicity, the general version of the algorithms with this slight modification will be referred to as PDAS and MPC. Algorithm redMPC was also tested against a modified version of the reduced MPC algorithm (ipas35) in Tits et al. [24]. In our numerical experiments, we simply refer to this algorithm as ipas35. The redMPC and ipas35 algorithms require the same initial point condition, $(x,s)>0$, which can easily be accommodated using the algorithm of Section 2.2.2. Unfortunately, the redPDAS algorithm could not be tested with the same starting point as the reduced PDAS (rPDAS) from [24], since redPDAS requires $s>0$ at each iteration while rPDAS requires $s = c-A^Ty$ but not $s>0$ at each iteration. Although the combination $s = c-A^Ty>0$ is required for optimality, an initial point which satisfies both conditions simultaneously is difficult to achieve. Therefore, the redPDAS algorithm was tested only against the PDAS algorithm, using the initial point algorithm of Section 2.2.2.

The parameters for redPDAS and redMPC were chosen as $\beta = .99$ and $\mathrm{ubnd} = \hat k$, where $\hat k$ was selected from 25 linearly spaced values (rounded to the nearest integer) ranging from $3m$ to $n$. The value of $\hat k$ is fixed; however, $k$ varies from iteration to iteration such that $3m\le k\le\hat k\le n$. The stopping criterion was adapted from [17] and is based on the error in the duality gap, $\big|c^Tx-b^Ty\big|/\max\big(c^Tx,1\big)$.

5.2 redPDAS Experiments

Figure 5.1 shows the average time (in seconds) to solve 50 randomly generated problems from test problem TAW4 (with m = 50, n = 20000, and ubnd fixed to n) over varying values of the constant C used in the selection of Q.

Figure 5.1: Performance of the redPDAS algorithm against the PDAS algorithm using test problem TAW4 with m = 50, n = 20000, and ubnd fixed to n. The average time (in seconds) to solve 50 randomly generated problems is shown over varying values of C.

When $C>10^{-5}$, the time for the redPDAS algorithm to solve an LP nearly triples relative to the PDAS algorithm. This is most likely caused by an insufficient number of dual constraints in the working set at later iterations. The subroutine for determining Q guarantees that the set contains at least m indices at each iteration and that its size is between lbnd = min(ubnd, 3m) and ubnd = $\hat k$, where $\hat k$ is an element of 25 linearly spaced values (rounded to the nearest integer) ranging from 3m to n. However, as the algorithm approaches the optimal solution, m of the $x_j/s_j$ ratios tend to infinity, since $s_j\to 0$ for all j in the optimal set. If C is large enough, the set of ratios $x_j/s_j$ greater than $C\cdot\max_j(x_j/s_j)$ (i.e., $j\in Q$) may be too small at early iterations. If too few (or zero) ratios are chosen, the redPDAS algorithm defines Q to contain the indices j from $n-\mathrm{lbnd}+1$ to $n$ of the sorted ratios (the lbnd largest), where lbnd = min(3m, ubnd). Since ubnd = n, m = 50, and n = 20000, the number of dual constraints in the working set may be as small as 150 (out of 20000) in early iterations. This can cause the algorithm to take a large number of iterations before reaching the optimal solution. According to Figure 5.1, the redPDAS algorithm performs best against the PDAS algorithm when C is set to $10^{-8}$.

In the experiments that follow, the redPDAS and PDAS algorithms solve 25 randomly generated problems from each test set (TAW, RAND, and SCSD) over various values of $3m<\hat k\le n$ with C fixed at $10^{-8}$. A comparison of the total CPU time, the CPU time to form and solve the normal matrix, and the mean number of iterations is shown, each plotted against various $\hat k/n$ ratios. In many cases, the graphs are missing data values for small $\hat k/n$. These missing values are caused by one of two scenarios: (i) the algorithm fails to terminate in a "reasonable" amount of time (the maximum time allotted to solve an LP is set to $10^6$ seconds), or (ii) MATLAB generates NaN (not-a-number) values for the solution as a result of the normal matrix becoming numerically singular.
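The parameter grid and stopping test just described amount to only a few lines of MATLAB; a sketch, with epsTol a hypothetical name for the duality-gap tolerance:

    % Experimental parameter grid and stopping test -- a sketch.
    khat_grid = unique(round(linspace(3*m, n, 25)));   % candidate ubnd values
    gapErr = abs(c'*x - b'*y) / max(c'*x, 1);          % relative duality-gap error
    stop = (gapErr <= epsTol);                         % stopping criterion (cf. [17])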
We begin by examining the experiments based on the first set of test problems (TAW1-TAW5).

Figure 5.2: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.3: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW2 (random [normal] constraints) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.4: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.5: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW4 (RandomLP) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.6: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW5 (SIPND) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

For the first test set (TAW), the PDAS algorithm displayed (in general) an increase in CPU time and a constant number of iterations to solve an LP over increasing values of the ratio $\hat k/n$. On the other hand, the redPDAS algorithm remained (in general) constant in CPU time, while its number of iterations fluctuated and often exceeded that of the general algorithm. The time difference between the algorithms became increasingly large as $\hat k/n$ increased, with the redPDAS algorithm outperforming PDAS for every test problem with $\hat k/n>.2$. We now examine the experiments based on the second set of test problems (RAND1-RAND5).
Figure 5.7: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND1 with m = 5, n = 100, and ubnd = $\hat k$ where $15\le\hat k\le 100$.

Figure 5.8: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND2 with m = 12, n = 500, and ubnd = $\hat k$ where $36\le\hat k\le 500$.

Figure 5.9: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND3 with m = 24, n = 780, and ubnd = $\hat k$ where $72\le\hat k\le 780$.

Figure 5.10: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND4 with m = 58, n = 1004, and ubnd = $\hat k$ where $174\le\hat k\le 1004$.

Figure 5.11: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND5 with m = 75, n = 2016, and ubnd = $\hat k$ where $225\le\hat k\le 2016$.

The performance of the algorithms on the second test set was quite different. The graphs show missing data values for $\hat k/n<.15$. For the remaining $\hat k/n$ values, the PDAS algorithm was generally constant in CPU time and number of iterations. The redPDAS algorithm displayed very erratic behavior on almost all of the test sets in this category. Experiments using data sets RAND4 and RAND5 show more favorable results for the redPDAS algorithm than those using data sets RAND1-RAND3, possibly because $m\ll n$ in those problems. Finally, we examine the experiments based on the Netlib test problems (SCSD1, SCSD6, and SCSD8).

Figure 5.12: Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD1 with m = 77, n = 760, and ubnd = $\hat k$ where $231\le\hat k\le 760$.

Figure 5.13: Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD6 with m = 147, n = 1350, and ubnd = $\hat k$ where $441\le\hat k\le 1350$.
Figure 5.14: Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD8 with m = 397, n = 2750, and ubnd = $\hat k$ where $1191\le\hat k\le 2750$.

For the third test set (Netlib problems), a large portion of the data is missing for the early $\hat k/n$ values. The PDAS algorithm remains consistent in its performance on CPU time and number of iterations on the SCSD problems. In addition, it outperforms the redPDAS algorithm for some $\hat k/n$ values between 0.3 and 0.7. However, the performance of the redPDAS and PDAS algorithms is almost identical in CPU time and number of iterations as $\hat k/n\to 1$. In the next section, we examine experiments based on Mehrotra's Predictor-Corrector method.

5.3 redMPC Experiments

The redMPC experiments test the performance of the redMPC algorithm against the reduced MPC algorithm in Tits et al. [24] (ipas35) and a general MPC algorithm with reduced normal equations (MPC). Figure 5.15 shows the average time (in seconds) to solve 50 randomly generated problems over varying values of C using the test problem TAW4 with m = 50, n = 20000, and $|Q|\in[3m,\mathrm{ubnd}]$ where ubnd is fixed to n.

Figure 5.15: Performance of the redMPC algorithm against the MPC algorithm using test problem TAW4 with m = 50, n = 20000, and $|Q|\in[3m,\mathrm{ubnd}]$ where ubnd is fixed to n. The average time (in seconds) to solve 50 randomly generated problems from TAW4 is shown over varying values of C ranging from $10^{-16}$ to $10^{-1}$.

The redMPC algorithm outperformed the MPC algorithm for all values of C between $10^{-16}$ and $10^{-1}$. In the experiments that follow, the redMPC, MPC, and ipas35 algorithms solve 25 randomly generated problems from each test set (TAW, RAND, and SCSD) over various values of $3m<\hat k\le n$ with C fixed at $10^{-4}$. A comparison of the total CPU time, the CPU time to form and solve the normal matrix, and the mean number of iterations is shown, each plotted against various $\hat k/n$ ratios. We begin by examining the experiments based on the first set of test problems (TAW1-TAW5).

Figure 5.16: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.17: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW2 (random [normal] constraints) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.
Figure 5.18: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.19: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW4 (RandomLP) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.20: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW5 (SIPND) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

For problem sets TAW2-TAW4, the MPC and ipas35 algorithms displayed (in general) an increase in CPU time and a constant number of iterations to solve an LP over increasing values of the ratio $\hat k/n$. The redMPC algorithm, however, remained (in general) constant in CPU time, while its number of iterations fluctuated and exceeded that of the competition. In problem sets TAW1 and TAW5, the MPC and ipas35 algorithms show an increase in CPU time as $\hat k/n$ increases from .1 to .8, then a small decrease in time for $\hat k/n$ values between .8 and 1. The number of iterations to solve an LP fluctuates for all of the algorithms. In all of the problem sets (TAW1-TAW5), the time difference between the MPC and ipas35 algorithms and the redMPC algorithm became increasingly large as $\hat k/n$ increased, with the redMPC algorithm outperforming the other algorithms in every test set for all $\hat k/n>.2$. We now examine the MPC experiments based on the second set of test problems (RAND1-RAND5).

Figure 5.21: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND1 with m = 5, n = 100, and ubnd = $\hat k$ where $15\le\hat k\le 100$.

Figure 5.22: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND2 with m = 12, n = 500, and ubnd = $\hat k$ where $36\le\hat k\le 500$.
Figure 5.23: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND3 with m = 24, n = 780, and ubnd = $\hat k$ where $72\le\hat k\le 780$.

Figure 5.24: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND4 with m = 58, n = 1004, and ubnd = $\hat k$ where $174\le\hat k\le 1004$.

Figure 5.25: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND5 with m = 75, n = 2016, and ubnd = $\hat k$ where $225\le\hat k\le 2016$.

The performance of the algorithms on the second test set was quite different from the performance on the TAW test set. The graphs show missing data values for $\hat k/n<.15$. For the remaining $\hat k/n$ values, the performance of the algorithms varied. Unlike the redPDAS algorithm, the redMPC algorithm generally solved an LP in less time than the MPC and ipas35 algorithms. In fact, as the problem size increased, the gap in total CPU time between the MPC and ipas35 algorithms and the redMPC algorithm increased. With the exception of the RAND1 problem set, the same trend occurred for the CPU time to form and solve the normal matrix. In all of the problem sets, the algorithms generally solved an LP in the same number of iterations for $\hat k/n$ close to 1. For all other $\hat k/n$ values, the number of iterations to solve an LP varied for each algorithm. Lastly, we examine the experiments based on the Netlib test problems (SCSD1, SCSD6, and SCSD8).

Figure 5.26: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD1 with m = 77, n = 760, and ubnd = $\hat k$ where $231\le\hat k\le 760$.

Figure 5.27: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD6 with m = 147, n = 1350, and ubnd = $\hat k$ where $441\le\hat k\le 1350$.
Figure 5.28: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD8 with m = 397, n = 2750, and ubnd = $\hat k$ where $1191\le\hat k\le 2750$.

For the third test set (Netlib problems), a large portion of the data is missing for the early $\hat k/n$ values. The redMPC and MPC algorithms outperform ipas35 in CPU time. As $\hat k/n$ increases, the time to solve an LP with ipas35 increases, whereas the time to solve an LP with the other algorithms is small and fixed. The number of iterations to solve test problem SCSD8 for the redMPC algorithm is somewhat larger than for the other algorithms. For test problems SCSD1 and SCSD6, the number of iterations is almost identical as $\hat k/n\to 1$.

5.4 Discussion

The experiments in the previous sections demonstrate that the redPDAS and redMPC algorithms, in general, outperform their general counterparts in terms of CPU time. The redMPC and MPC algorithms also perform well against ipas35. Furthermore, with the exception of a few cases (i.e., the small values of $\hat k/n$ where the algorithm(s) did not display information), the algorithms in this chapter outperformed MATLAB's LP solver, LINPROG, in total CPU time. In this section, we offer reasons for, or speculation about, a few of the outcomes.

Figures 5.1 and 5.15 show the average time (in seconds) to solve 50 randomly generated problems over varying values of C. These experiments were conducted using the test problem TAW4 with m = 50, n = 20000, and ubnd fixed to n. It was shown that the "reduced" algorithms performed exceptionally well against their general counterparts. We also performed these experiments on other test problems with $m\ll n$. Although there was some fluctuation in the best value of C, it remained within 1/100 of the value we found in each case. Therefore, we felt confident that the "reduced" algorithms would perform well on other test problems with $m\ll n$ based on our specific choice of C, determined by one of the problem sets used in this thesis, TAW4.

The redPDAS and redMPC algorithms (in general) displayed small CPU times and a large number of iterations over the various $\hat k/n$ values. The small CPU times can be attributed to the initial point calculation, the formation of "reduced" matrices and vectors, and temporary matrices/vectors defined for $A_Q$, $D_Q$, and the dual residual, $(r_d)_Q$. On the other hand, the large number of iterations is the result of the update strategy used for the dual residual and solution. Since a small number of constraints ($\ll n$) is chosen at each iteration, the algorithms have the potential to cycle through a large number of solutions before reaching the optimal solution.

MATLAB's LP interior-point solver, LIPSOL, is capable of solving a wide range of problems including large-scale LPs; see [33], [34]. On our test problems (i.e., problems with $m\ll n$), the "reduced" algorithms outperformed LIPSOL in CPU time. In experimentation, we discovered that by first converting the sparse matrix A to a full matrix and then using a QR factorization of A to solve for the initial point, the time to form the initial point is significantly reduced. Prior to this conversion, approximately 80% of the CPU time was devoted to calculating the initial point, causing the time to solve with the "reduced" algorithms to exceed the time for LIPSOL to solve the same LP.
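A sketch of that initial-point computation: one economy-size QR factorization of the densified $A^T$ yields both the least-norm $\tilde x$ and the least-squares $\tilde y$ without forming $AA^T$. The positivity shift below follows Mehrotra's heuristic [17]; the exact shift used in our code may differ in detail.

    % Mehrotra-style starting point via a QR factorization of A' -- a sketch
    % (A is m x n with full row rank).
    [Qf, Rf] = qr(full(A'), 0);          % economy QR: A' = Qf*Rf, Rf is m x m
    xt = Qf * (Rf' \ b);                 % least-norm solution of A*x = b
    yt = Rf \ (Qf' * c);                 % least-squares solution of A'*y ~ c
    st = c - A' * yt;
    dx = max(-1.5 * min(xt), 0);         % shift amounts into the positive orthant
    ds = max(-1.5 * min(st), 0);
    x0 = xt + dx; s0 = st + ds; y0 = yt; % componentwise positive starting point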
All of the algorithms (including LIPSOL) factor the normal matrix using MATLAB's Cholesky-Infinity factorization (the cholinc function). This is necessary to preserve full rank and calculate efficient search directions.

5.5 Summary

In this chapter, we have demonstrated that algorithms redPDAS and redMPC are more effective than their general counterparts in solving LPs with $m\ll n$. This is clearly shown by the results displayed using the TAW problem sets (with m = 50 and n = 20000). In the second and third test sets, the difference between the m and n values was not as large. As a result, the general algorithms performed just as well and in some cases even better than the "reduced" algorithms. The ipas35 algorithm was least effective in solving LPs compared to the redMPC and MPC algorithms.

Chapter 6

Conclusions and Further Study

We have presented the redPDAS algorithm and the redMPC algorithm, new column generation algorithms for solving linear programming problems (LPs). Experiments conducted on these algorithms demonstrate that they are, in general, more effective for solving LPs with $m\ll n$ than their general counterparts (PDAS and MPC). Although there is no known convergence proof at this time for the redPDAS and redMPC algorithms, we believe our column generation strategy will prove to be beneficial in solving other optimization problems as well.

In this dissertation, we demonstrated that by reducing the cost of forming the $m\times m$ matrix $AD^2A^T$, an essential component in computing the search direction for large-scale LPs, we can reduce the amount of computation per iteration and significantly reduce the time to solve an LP problem. This was illustrated by the results of the experiments conducted using the random problem sets from TAW. The time difference between the "reduced" and general algorithms became increasingly large as $k/n$ increased, with the "reduced" algorithms outperforming their general counterparts on every test problem from this set with $k/n>.2$. Although the total number of iterations to solve the "reduced" LP often exceeded that of the general problem, the computational cost per iteration of the "reduced" problem was significantly lower due to its size. The "reduced" algorithms were not as effective in reducing solution time for the other problem sets. In these problem sets, the difference between m and n was not significantly large. As a result, the general algorithms performed just as well and in some cases even better than the "reduced" algorithms.

Based on the results of the experiments conducted using the random problem sets with $m\ll n$, we believe our column generation scheme can be applied to other algorithms with similar success. One algorithm we have begun to investigate is Potra's algorithm [22]. Potra's algorithm is of particular interest because it allows for a primal-dual infeasible starting point and is proven to have polynomial complexity under certain conditions. Potra uses a three-step method to improve feasibility, optimality, and centrality of the iterates. He combines the two steps of Mizuno-Todd-Ye [18] with a third to achieve feasibility and optimality at the same rate. We developed a "reduced" version of Potra's algorithm called redPC, presented in Appendix B. The redPC algorithm also uses this three-step method; however, feasibility and optimality may not necessarily be achieved at the same rate.
Despite this, we aim to show global convergence and investigate the complexity of our algorithm.

In addition to further investigation of the redPC algorithm, there are computational issues that can be researched in all of the "reduced" algorithms. One topic of research is the method by which the set Q is chosen. We chose Q based on the $k\in[m,n]$ largest $x_j/s_j$ ratios at each iteration. Tits et al. [24], however, select a fixed-size index set Q associated with the most "promising" dual constraints based on the smallest slack values, $s_j$, at each iteration. Dantzig and Ye [6] start with m constraints and consider building up the constraint set based on the index set associated with all $s_j<2\varepsilon$ at each iteration. There are a number of ways to select the index set Q, and determining whether one technique saves time and computational effort over the others is worth investigating. Another topic of investigation is the update strategy used within our "reduced" algorithms. Our update strategy is based on the amount of progress we are making toward satisfying dual feasibility. However, Potra's algorithm uses an update strategy based on improving feasibility, optimality, and centrality of the iterates at the same rate. Experimentation on the redPC algorithm may show a reduction in the total number of iterations using this update concept. One other focus for research is determining a "good" initial point. The initial point we used in the redPDAS and redMPC algorithms is the same initial point used in Mehrotra's algorithm [17]. However, there are many papers in the field of interior-point methods that concentrate on starting-point strategies (see [7] or [14]). Our algorithms perform quite well overall against standard algorithms, and we may be able to improve our results by using a better starting point.

Finally, the use of interior-point methods in column generation settings has been extended to applications in semidefinite programming problems, network design problems, and second-order cone programming problems, to name a few. We believe our strategy will be very beneficial for solving other LPs and eventually for solving more general problems.

Appendix A

Topics in Numerical Optimization

A.1 Steepest-Descent Method

Consider the n-dimensional problem
\[
\min_{x\in\mathbb{R}^n}\ \{f(x) : x\in S\} \tag{A.1}
\]
where S is the set of feasible points defined by a set of constraints (or all of $\mathbb{R}^n$). If (A.1) is optimal at the current solution x, then x solves the problem. Otherwise, assume x is feasible and let $x^+ = x+\alpha\Delta x$ be the updated solution to (A.1) with search direction $\Delta x$ and step length $\alpha>0$. Using only the first two terms of the Taylor series, we can approximate $f(x^+)$ by
\[
f(x^+) = f(x+\alpha\Delta x)\approx f(x)+\alpha\,\Delta x^T\nabla f(x),
\]
where $\nabla f(x)$ is the gradient of f at x. Therefore, for small $\alpha$, $f(x+\alpha\Delta x)<f(x)$ whenever $\Delta x^T\nabla f(x)<0$; the steepest-descent choice $\Delta x = -\nabla f(x)$ guarantees this. A step length $\alpha>0$ is then computed to determine $x^+$ so that $x^+\in S$.

In the case of a linear program
\[
\min_{x\in\mathbb{R}^n}\ \big\{c^Tx : Ax = b,\ x\ge 0\big\}, \tag{A.2}
\]
where A is an $m\times n$ matrix of full row rank, the steepest-descent method computes $\Delta x = -c$ as the search direction if the current solution is not optimal. Assuming the current solution x is feasible, the updated solution $x^+$ is determined so as to maintain feasibility (i.e., $x^+\in S = \{x : Ax = b,\ x\ge 0\}$). Therefore we require
\[
Ax^+ = Ax+\alpha A\Delta x = b \tag{A.3}
\]
and
\[
x^+ = x+\alpha\Delta x>0. \tag{A.4}
\]
The equations in (A.3) imply that we need $A\Delta x = 0$, since $x\in S$ and $\alpha>0$. That is, $\Delta x$ must lie in the null space of A.
By premultiplying ?x = ?c by the n?n matrix PA = I ?AT parenleftbigAATparenrightbig?1A we have a projected steepest descent direction (i.e. ?x = ?PAc) which simultane- ously decreases the objective function value and satisfies A?x = 0. The matrix PA has the effect of projecting any nonzero vector in Rfracturn onto the null space of A. Further discussion of this topic is given in the next section. The inequalitiesin (A.4) are satisfied by computing an appropriate step length ?> 0 along the search direction ?x= ?PAc. A.2 Orthogonal Projection Matrix The n?n matrix PA =I ?AT parenleftbigAATparenrightbig?1A, 135 which was presented in the previous section, is also termed an orthogonal projection matrix. Its name comes from the fact that premultiplying any n-dimensional vector x byPA produces an orthogonal projection of x onto the null space of A. See Figure A.1. Figure A.1: A n-dimensional vectorx is projected onto the null space of A [denoted N(A)] by the orthogonal projection matrix P = I ?AT(AAT)?1A. The orthogonal projection matrix is symmetric(i.e. PTA = PA) and idempotent (i.e. P2A = PA). In addition, it can be formed from any m?n matrix of full row rank. For example, the matrix PAD = I?DAT parenleftbigAD2ATparenrightbig?1AD, is an orthogonal projection matrix into the null space of AD where D is a diagonal matrix. 136 Appendix B The redPC Algorithm Potra?s three-step method involves solving three linear systems to improve feasibility, optimality, and centrality of the iterates. Our method follows Potra?s method by solving three ?reduced? linear systems inan attempt to reachthe optimal solution for the primal-dual pair. In solving the first linear system, we aim to improve optimality while keeping feasibility the same. In the second linear system, we strive to improve feasibility while keeping optimality the same. Finally, in the third linear system, we aim to ?centralize? the iterates for improved movement within the next iteration. Our ?reduced? version is similar to the redPDAS and redMPC algorithms in that we consider only those columns (components) of a matrix (vector) associated with Q. B.1 Background Let rp =Ax?b and rd = ATy+s?c be the primal and dual residuals, respec- tively. The searchdirectioncomputedtoimproveoptimalityor decrease?isgivenby ? ?? ?? ?? ? A 0 0 0 AT I S 0 X ? ?? ?? ?? ? ? ?? ?? ?? ? ?xa ?ya ?sa ? ?? ?? ?? ? = ? ? ?? ?? ?? ? 0 0 XSe ? ?? ?? ?? ? . (B.1) 137 The expressions for the components of ?za = (?xa,?ya,?sa) are given by ?ya = (AS?1XAT)?1Ax, ?sa = ?AT?ya, ?xa = ?x?S?1X?sa. This is precisely the affine-scaling direction in (3.9) with rp = rd = 0. On the other hand, the search direction computed to improve feasibility or decrease the residuals is given by ? ?? ?? ?? ? A 0 0 0 AT I S 0 X ? ?? ?? ?? ? ? ?? ?? ?? ? ?xa ?ya ?sa ? ?? ?? ?? ? = ? ? ?? ?? ?? ? rp rd 0 ? ?? ?? ?? ? . (B.2) This is the affine-scaling direction in (3.9) with XSe = 0. Solving for ?za = parenleftbig?xa,?ya,?saparenrightbig, we have ?ya = ?(AS?1XAT)?1(rp +AS?1Xrd), ?sa = ?rd ?AT?ya, ?xa = ?S?1X?sa. The sum of ?za and ?za is the affine-scaling direction. Let z = (x,y,s). Step lengths ?? and ?? are chosen so that the point ?z = z+ ???za + ???za ?N? where N? = braceleftBig (tildewidex,tildewides) :tildewidex>0,tildewides> 0, vextenddoublevextenddouble vextenddoubletildewideXtildewides?tildewide?e vextenddoublevextenddouble vextenddouble??tildewide? bracerightBig 138 with 0 epsilon1,bardblrpbardbl>epsilon1, or bardblrdbardbl>epsilon1 Compute affine-scaling directions: I. 
Compute ?za = (?xa,?sa,?ya): ?ya = parenleftbigAS?1XATparenrightbig?1Ax, ?sa = ?AT?ya, ?xa = ?x?S?1X?sa. 140 II. Compute ?za =parenleftbig?xa,?sa,?yaparenrightbig : ?ya = ?(AS?1XAT)?1(rp +AS?1Xrd), ?sa = ?rd ?AT?ya, ?xa = ?S?1X?sa. Compute affine steps: Set ? = (?x a)T ?sa + (?sa)T ?xa n? , ? = parenleftbig?xaparenrightbigT ?sa n? , and compute ?? = minparenleftbigbraceleftbig?? (0,1) : ??R(?) ?parenleftbig?2?2 ?bardblfbardbl2parenrightbigbracerightbig?{1}parenrightbig, ?? = parenleftbigg1+??? 1???? parenrightbigg ??. Compute ?z ?N?: Compute ?z = z+ ???za+ ???za. If ?? = 1, then ?z ?F? (Optimal solution set). Terminate. Compute centering-corrector direction: ?ycc = (AtildewideS?1 tildewideXAT)?1A parenleftBig tildewidex?tildewide?tildewideS?1e parenrightBig , ?scc = ?AT?ycc, ?xcc = ?tildewidex+ tildewideS?1 parenleftBig tildewide?e? tildewideX?scc parenrightBig . Update solution Compute z+ = ?z+ ?zcc. 141 end(while) ???????????????- B.3 The redPC Algorithm The redPC algorithm is simply a ?reduced? version of Potra?s algorithm. This algorithm is similar to the redPDAS and redMPC algorithms in that we consider only those columns (components) of a matrix (vector) associated with Q. We let ?rp = AQxQ ?b be the primal residual based on Q and define ??? to be a parameter such that 0 ? ??? < ??? 1. The redPC is presented below: ???????????????- Input: (x,y,s) withx = ?e? bardblA?bbardbl? e, s= ?e? bardblcbardbl? e, andy = 0;mepsilon1,bardblrpbardbl>epsilon1, or bardblrdbardbl>epsilon1 Select the most promising dual constraints. Compute affine-scaling directions: 142 I. Compute ?zaQ =parenleftbig?xaQ,?saQ,?yaparenrightbig ?ya = parenleftbigAQS?1Q XQATQparenrightbig?1AQxQ, ?saQ = ?ATQ?ya, ?xaQ = ?xQ ?S?1Q XQ?saQ. II. Compute ?zaQ =parenleftbig?xaQ,?saQ,?yaparenrightbig ?ya = ?parenleftbigAQS?1Q XQATQparenrightbig?1bracketleftbig?rp +AQS?1Q XQ(rd)Qbracketrightbig, ?saQ = ?(rd)Q ?ATQ?ya, ?xaQ = ?S?1Q XQ?saQ. Compute affine steps: Set ? = parenleftbig?xa Q parenrightbigT ?sa Q + parenleftbig?sa Q parenrightbigT ?xa Q n? , ? = parenleftbig?xa Q parenrightbigT ?sa Q n? , ? = parenleftbig?xa Q parenrightbigT ?sa Q n? , ? = (1? ?? ?)xT? Q?s a ?Q n? , ? = x T QsQ n? , and compute ?? = minparenleftbigbraceleftbig?? (0,1) : ?? ?S(?) ??4parenleftbig?2?2 ?bardblfbardbl2parenrightbigbracerightbig?{1}parenrightbig, ??? = ???, ?? = parenleftbigg1 +? + [??(1??)?] ?? ????? parenrightbigg ??. 143 Compute ?z: Set ?xa?Q = 0, ?xa?Q = ?????? x?Q, ?sa?Q = ?AT?Q?ya, ?sa?Q = (rd)?Q ?AT?Q?ya and compute ?z = z + ???za + ???za. If ?? = 1, then ?z ? F? (Optimal solution set). Terminate. Compute centering-corrector direction: ?ycc = (AQtildewideS?1Q tildewideXQATQ)?1AQ parenleftBig tildewidexQ ?tildewide?tildewideS?1Q eQ parenrightBig , ?sccQ = ?ATQ?ycc, ?xccQ = ?tildewidexQ + tildewideS?1Q parenleftBig tildewide?eQ ? tildewideXQ?sccQ parenrightBig . Update solution Set ?xcc?Q = 0, ?scc?Q = 0 and compute z+ = ?z+ ?zcc. end(while) ???????????????- In Potra?s algorithm, the affine-scaling directionsare computedin two separate steps to improve feasibility and optimality. Using the separate components of the affine-scaling direction, Potra derives and incorporates the affine steps, ??and ??, into an intermediate solution (?z = z+ ???za + ???za) to determine optimality. Although the details surrounding the affine steps have been eliminated in this thesis, they can be easily followedin Potra [22]. If the solution is not optimal, the centering-corrector direction is computed to produce a new solution. 
There are signficant differences between Potra?s algorithm and the redPC al- gorithm. The main difference is the ?reduced? size of the matrices and vectors based on the set Q. Reducing the size of the data, however, drastically increased 144 the amount of mathematical computation and detail in deriving the affine steps for the intermediate solution. Although the redPC algorithm can be shown to exhibit global convergence, we have been unsuccessful in proving polynomial complexity. Since the components of the directions in ?Q can be defined in different ways based on the choice of ???, variants of the redPC algorithm will be considered in future research. 145 BIBLIOGRAPHY [1] M. Achache, H. Roumili, and A. Keraghel. A numerical study of an infeasible primal-dual path-following algorithm for linear programming. Applied Mathe- matics and Computation, 186:1472?1479, 2007. [2] Various Authors. Netlib lp test problems. http://www-fp.mcs.anl.gov/OTC/ Guide/TestProblems/LPtest/, 1985. Retrieved July 2004. [3] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory and Algorithms 2nd Ed. John Wiley and Sons, Hobeken, NJ, 1993. [4] DimitrisBertsimasand John N. Tsitsiklis. Introduction to Linear Optimization. Athena Scientific, Belmont, MA, 1997. [5] George B. Dantzig. Linear Programming and Extensions. Princeton University Press, Princeton, NJ, 1963. [6] George B. Dantzig and Yinyu Ye. A build-up interior-point method for linear programming: Affinescalingform. Working Paper, Department of Management Science, University of Iowa, 1991. [7] Michael Gertz, Jorge Nocedal, and Annick Sartenaer. A starting-point strategy for nonlinear interior methods. Applied Math Letters, 17:945?952, 2004. [8] P.C. Gilmore and R.E. Gomory. A linear programming approach to the cutting stock problem. Operations Research, 9:849?859, 1961. [9] J.-Louis Goffin, Z.-Quan Luo, and Y. Ye. On the Complexity of a Column Generation Algorithm for Convex or Quasiconvex Feasibility Problems. Large Scale Optimization: State of the Art, W. W. Hager, D. W. Hearn and P.M. Pardalos, Editors, Kluwer Academic Publishers B.V, 1994. [10] G. H. Golub and C. F. Van Loan. Matrix Computations. John Hopkins Uni- versity Press, Baltimore, MD, 1989. [11] D.den Hertog, Cornelis Roos, and T?amas Terlaky. A build-up variant of the logarithmic barrier method for lp. Operations Research Letters, 12:181?186, 1992. [12] D.den Hertog, Cornelis Roos, and Tamas Terlaky. Adding and deleting con- straints in the logarithmic barrier method for lp. Advances in Optimization and Approximation, pages 166?185, 1994. [13] Wikipedia Foundation Inc. Matrix norm. http://en.wikipedia.org/wiki/ Matrix\_norm, 2002. Retrieved May 2007. 146 [14] Elizabeth John and E. Alper Yildirim. Implementation of warm-start strate- gies in interior-point methods for linear programming in fixed dimension. http://www.optimization-online.org/DB_FILE/2006/05/1389.pdf, 2006. Retrieved April 2008. [15] Narendra Karmarkar. A new polynomial-time algorithm for linear program- ming. Combinatorica, 4:373?395, 1984. [16] Masakazu Kojima, Nimrod Megiddo, and Shinji Mizuno. Polynomiality of infeasible-interior-point algorithms for linear programming. Mathematical Pro- gramming, 67:109?119, 1994. [17] Sanjay Mehrotra. On the implementation of a primal-dual interior point method. SIAM Journal on Optimization, 2:575?601, 1992. [18] S. Mizuno, Michael J. Todd, and Yinyu Ye. On adaptive-step primal-dual interior-point algorithms for linear programming. 
Mathematical Operations Re- search, 18:964?981, 1993. [19] Renato D.C. Monteiro, Ilan Adler, and Mauricio G.C. Resende. A polynomial- time primal-dual affine scaling algorithm for linear and convex quadratic pro- gramming and its power series extension. SIAM Journal on Optimization, 2(1):7?20, 1990. [20] Stephen G. Nash and Ariela Sofer. Linear and Nonlinear Programming. McGraw-Hill, New York, NY, 1996. [21] E.R. Panier, Andr?e L. Tits, and J.N. Herskovits. A qp-free globally convergent, globally superlinearlyconvergent algorithm for inequalityconstrainedoptimiza- tion. SIAM Journal on Control and Optimization, 26(4):788?811, 1988. [22] Florian A. Potra. An infeasible-interior-point predictor-corrector algorithm for linear programming. SIAM Journal on Optimization, 6:19?32, 1996. [23] Andr?e L. Tits. An interior point method for linear programming, with an active set flavor. technicalreport TR-99-47, Institute for SystemsResearch, University of Maryland, College Park, MD, 1999. [24] Andr?e L. Tits, P.-A. Absil, and William P. Woessner. Constraint reduction for linear programs with many inequality constraints. SIAM Journal on Optimiza- tion, 17:119?146, 2006. [25] Andr?e L. Tits and J. L. Zhou. A simple, quadratically convergent algorithm for linear and convex quadratic programming. In W.W. Hager, D.W. Hearn, and P.M. Pardalos, editors, Large Scale Optimization: State of the Art, pages 411 ? 427. Kluwer Academic Publishers, 1994. [26] Robert J. Vanderbei. Linear Programming: Foundations and Extensions. Kluwer Academic Publishers, Boston, MA, 1996. 147 [27] David S. Watkins. Fundamentals of Matrix Computations. Wiley, New York, NY, 2002. [28] Luke B. Winternitz, Stacey O. Nicholls, Andr?e L. Tits, and Dianne P. O?Leary. A constraint-reduced variant of the Mehrotra predictor-corrector algo- rithm. http://www.optimization-online.org/ARCHIVE\_DIGEST/2007-07. html, 2007. Retrieved October 2007. [29] Steven J. Wright. Primal-Dual Interior-Point Methods. SIAM, Philadelphia, PA, 1997. [30] Yinyu Ye. A build-down scheme for linear programming. Mathematical Pro- gramming, 46:61?72, 1990. [31] Yinyu Ye. A potential reduction algorithm allowing column generation. SIAM Journal on Optimization, 2(1):7?20, 1992. [32] Yinyu Ye. Complexity analysis of the analytic center cutting plane method that uses mulitple cuts. Mathematical Programming, 78:85?104, 1997. [33] Yin Zhang. Solving large-scale linear programs by interior-point methods under the MATLAB environment. technical report TR96-01, Department of Math- ematics and Statistics, University of Maryland Baltimore County, Baltimore, MD, 1996. [34] Yin Zhang. User?s guide to LIPSOL linear-programming interior point solvers v0.4. Optimization Methods and Software, 11 & 12:385?396, 1999. 148