ABSTRACT

Title of dissertation: COLUMN GENERATION IN INFEASIBLE PREDICTOR-CORRECTOR METHODS FOR SOLVING LINEAR PROGRAMS

Stacey O. Nicholls, Doctor of Philosophy, 2009

Dissertation directed by: Professor Dianne P. O'Leary, Department of Computer Science, Institute for Advanced Computer Studies

Primal-dual interior-point methods (IPMs) are distinguished for their exceptional theoretical properties and computational behavior in solving linear programming (LP) problems. Consider solving the primal-dual LP pair using an IPM such as a primal-dual affine-scaling method, Mehrotra's Predictor-Corrector method (the most commonly used IPM to date), or Potra's Predictor-Corrector method. The bulk of the computation in the process stems from the formation of the normal equation matrix $AD^2A^T$, where $A \in \mathbb{R}^{m \times n}$ and $D^2 = S^{-1}X$ is a diagonal matrix. In cases when $n \gg m$, we propose to reduce this cost by incorporating a column generation scheme into existing infeasible IPMs for solving LPs. In particular, we solve an LP problem by an iterative approach in which we select a "small" subset of the constraints at each iteration, with the aim of achieving both feasibility and optimality. Rather than $n$ constraints, we work with $k = |Q| \in [m,n]$ constraints at each iteration, where $Q$ is an index set consisting of the $k$ most nearly active constraints at the current iterate. The cost of forming the matrix $A_Q D_Q^2 A_Q^T$ is thereby reduced from $\Theta(m^2 n)$ to $\Theta(m^2 k)$ operations, where $k$ is relatively small compared to $n$. Although numerical results show an occasional increase in the number of iterations, the total operation count and the time to solve the LP using our algorithms are, in most cases, smaller than for other "reduced" LP algorithms.

COLUMN GENERATION IN INFEASIBLE PREDICTOR-CORRECTOR METHODS FOR SOLVING LINEAR PROGRAMS

by Stacey O. Nicholls

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2009

Advisory Committee:
Professor Dianne P. O'Leary, Chair/Advisor
Professor André L. Tits
Professor Eric V. Slud
Professor Konstantina Trivisa
Professor Michael Laskowski

DEDICATION

This dissertation is dedicated to my sons: Tyler Adedapo Akanji Jagun and Sean Adekunle Ayinde Jagun

ACKNOWLEDGMENTS

My journey to completing this dissertation came with many obstacles and challenges. However, there were a great number of people who supported my efforts to help make this thesis possible. I am truly blessed to have had the opportunity to learn from, work with, and be mentored by those who made my graduate experience one that I will always cherish.

First and foremost, I would like to thank my advisor, Dr. Dianne P. O'Leary. Words cannot express how grateful I am to have worked with such an extraordinarily brilliant woman. She gave me very challenging, yet interesting projects to work on over these past few years, and her advice was invaluable. She was always available for me when I needed help and/or advice on issues related to and beyond my research. She has given me enthusiasm for the area of Numerical Optimization, an eye for detail in working mathematical proofs and describing computational results, and an appreciation for all of the time and energy she devotes to her students. She is an exceptional professor and mentor, and I aspire to be just as influential to my students and their careers as she has been to me.

I would also like to thank Dr. André L. Tits.
This thesis would not have been possible without his remarkable theoretical ideas and expertise. He has always given me helpful suggestions and advice for my research. He was a valuable asset to me in this process, and I thank him for all he has offered.

Thank you to Dr. André L. Tits, Dr. Eric V. Slud, Dr. Konstantina Trivisa, and Dr. Michael Laskowski for agreeing to serve on my thesis committee. I greatly appreciate the time and energy you dedicated to reviewing the thesis and providing suggestions for revisions.

During my years at UMCP, I was fortunate to receive funding from a number of sources. UMCP offered not only an opportunity for me to gain valuable information from professors, but also an opportunity to teach undergraduate mathematics courses. I taught lecture and recitation classes as a teaching assistant for four years. I must thank Dr. William Schildknecht for giving me the opportunity to teach a variety of mathematics courses at UMCP. These opportunities opened doors for teaching jobs at other universities, and I am grateful for this teaching support. I would also like to thank Dr. Johnetta Davis for her assistance throughout my graduate years. She was pivotal in helping me obtain a GAANN fellowship, for which I am truly thankful. Dr. Davis also served as one of the PIs for the Promise Program, a program in which I gained valuable insight about the process to the doctorate. I would like to thank Dr. Michael Boyle for the opportunity to work under the VIGRE grant. This support provided me with an opportunity to work with Dr. Jeffrey Cooper and Dr. Tobias von Petersdorff in designing MATLAB projects for undergraduate students in computational mathematics using concepts in numerical analysis, differential equations, and optimization. I was also funded under the Department of Energy grant DEFG0204ER25655, "Interior Point Algorithms for Optimization Problems," and I would like to thank Dr. Dianne P. O'Leary and Dr. André L. Tits for this support. Lastly, I would like to thank Dr. Daniel Symancyk and the rest of the hiring committee from the Mathematics Department at Anne Arundel Community College for believing in my ability to work as a full-time mathematics professor while finishing my graduate studies.

The staff at UMCP has been extremely helpful throughout my graduate years. I would like to especially thank Ms. Alverda McCoy for all of her help, support, and advice with the AMSC program.

Many of my most memorable graduate school moments are due to a group of friends that: (1) helped make my transition from undergraduate to graduate school a smooth one, (2) offered exceptional advice to help me reach certain milestones in my program, and (3) provided a friendship that will last a lifetime. I particularly would like to give a big thanks to my friend and former roommate, Ms. Joycelyn Wilson, for her immeasurable support throughout the years. I would also like to thank my graduate mentor Dr. Tasha Inniss, for her guidance in my early graduate school years, and Dr. Monica Jackson, for her help as a fellow teaching assistant. I greatly appreciate the help and support of all of my friends. Thank you!

My decision to pursue graduate study was largely due to the exceptional faculty in the Mathematics Department at Spelman College. I would like to thank Dr. Sylvia Bozeman for encouraging me to apply to graduate school. I would also like to thank Dr. Nagambal Shah and Dr. Yewande Olubummo for their constant support during my years of study as a math major at Spelman.
I could not have completed my graduate program without the invaluable support of my family. To my fiancé, Adebola "Taheel" Jagun, thank you for your continual encouragement over the years. Our sons, Tyler and Sean Jagun, are a blessing from God, and I am so thankful to have them. They made my dissertation progress even more of a challenge (many sleepless nights), and yet they were my motivation for finishing the degree. I am also truly grateful for the endless love and support of my parents, Herbert and Jacqueline Nicholls. Thank you for always believing in me! To my sister, Cynthia Nicholls, you are a special person in my life, and I want to thank you for just being you. Thank you to all of my family members for their encouraging words and support. It is truly impossible to remember all, and I apologize to those I've inadvertently left out.

Lastly, I faced obstacles and challenges at just about every stage of my graduate program. God made a way for me to overcome these obstacles with the support of the individuals mentioned here. Thank you God for ALL you have provided me!

TABLE OF CONTENTS

List of Tables
List of Figures
1 Introduction
2 Background
  2.1 Optimality Conditions for the Primal-Dual LP
  2.2 The Column Generation Heuristic
    2.2.1 Overview
    2.2.2 Initial Point
    2.2.3 Selection of Q
    2.2.4 Update Strategy
  2.3 Summary
3 A Reduced Primal-Dual Affine-Scaling (redPDAS) Algorithm
  3.1 Affine-Scaling Methods
    3.1.1 Overview
    3.1.2 The Primal and Dual Affine-Scaling Methods
    3.1.3 The Primal-Dual Affine-Scaling Method
    3.1.4 Algorithms
  3.2 A Primal-Dual Affine-Scaling (PDAS) Algorithm
  3.3 The redPDAS Algorithm
  3.4 Convergence Analysis of redPDAS
  3.5 Summary
4 A Reduced Mehrotra Predictor-Corrector (redMPC) Algorithm
  4.1 MPC Method
    4.1.1 Overview
    4.1.2 Centering-Corrector Direction
  4.2 A MPC Algorithm
  4.3 The redMPC Algorithm
  4.4 Convergence Analysis of redMPC
  4.5 Summary
5 Numerical Experiments
  5.1 Overview
    5.1.1 Test Problems
  5.2 redPDAS Experiments
  5.3 redMPC Experiments
  5.4 Discussion
  5.5 Summary
6 Conclusions and Further Study
A Topics in Numerical Optimization
  A.1 Steepest-Descent Method
  A.2 Orthogonal Projection Matrix
B The redPC Algorithm
  B.1 Background
  B.2 Potra's Predictor-Corrector Algorithm
  B.3 The redPC Algorithm
Bibliography

LIST OF TABLES

3.1 A summary of the differences between the redPDAS algorithm and a modified version of the algorithm adapted from Tits et al. [24] that proves to be locally and quadratically convergent.
4.1 A summary of the differences between the redMPC algorithm and a modified version of the algorithm adapted from Winternitz et al. [28] that proves to be locally and quadratically convergent.
5.1 Random problems (TAW1-TAW5) from Tits et al. [24]. Problems with constraint matrix A* generate each column of A with randn(m,1) and then normalize to make each column have norm one.
5.2 Random test problems (RAND1-RAND5) with specified m and n and known optimal solution.
5.3 Netlib problems from [2] with specified m and n and known optimal solution.

LIST OF FIGURES

5.1 Performance of the redPDAS algorithm against the PDAS on test problem TAW4 with $m = 50$, $n = 20000$, and $|Q| \in [3m, ubnd]$.
5.2 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.3 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW2 (random [normal] constraints) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.4 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.5 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW4 (RandomLP) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.6 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW5 (SIPND) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.7 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND1 with $m = 5$, $n = 100$, and $ubnd = \bar{k}$ where $15 \le \bar{k} \le 100$.
5.8 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND2 with $m = 12$, $n = 500$, and $ubnd = \bar{k}$ where $36 \le \bar{k} \le 500$.
5.9 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND3 with $m = 24$, $n = 780$, and $ubnd = \bar{k}$ where $72 \le \bar{k} \le 780$.
5.10 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND4 with $m = 58$, $n = 1004$, and $ubnd = \bar{k}$ where $174 \le \bar{k} \le 1004$.
5.11 Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND5 with $m = 75$, $n = 2016$, and $ubnd = \bar{k}$ where $225 \le \bar{k} \le 2016$.
5.12 Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD1 with $m = 77$, $n = 760$, and $ubnd = \bar{k}$ where $231 \le \bar{k} \le 760$.
5.13 Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD6 with $m = 147$, $n = 1350$, and $ubnd = \bar{k}$ where $441 \le \bar{k} \le 1350$.
5.14 Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD8 with $m = 397$, $n = 2750$, and $ubnd = \bar{k}$ where $1191 \le \bar{k} \le 2750$.
5.15 Performance of the redMPC algorithm against the MPC using test problem TAW4 with $m = 50$, $n = 20000$, and $|Q| \in [3m, ubnd]$ where $ubnd$ is fixed to $n$. The average time (in seconds) to solve 50 randomly generated problems from TAW4 is shown over varying values of C ranging from $10^{-16}$ to $10^{-1}$.
5.16 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.17 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW2 (random [normal] constraints) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.18 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.19 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW4 (RandomLP) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.20 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW5 (SIPND) with $m = 50$, $n = 20000$, and $ubnd = \bar{k}$ where $150 \le \bar{k} \le 20000$.
5.21 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND1 with $m = 5$, $n = 100$, and $ubnd = \bar{k}$ where $15 \le \bar{k} \le 100$.
5.22 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND2 with $m = 12$, $n = 500$, and $ubnd = \bar{k}$ where $36 \le \bar{k} \le 500$.
5.23 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND3 with $m = 24$, $n = 780$, and $ubnd = \bar{k}$ where $72 \le \bar{k} \le 780$.
5.24 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND4 with $m = 58$, $n = 1004$, and $ubnd = \bar{k}$ where $174 \le \bar{k} \le 1004$.
5.25 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND5 with $m = 75$, $n = 2016$, and $ubnd = \bar{k}$ where $225 \le \bar{k} \le 2016$.
5.26 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD1 with $m = 77$, $n = 760$, and $ubnd = \bar{k}$ where $231 \le \bar{k} \le 760$.
5.27 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD6 with $m = 147$, $n = 1350$, and $ubnd = \bar{k}$ where $441 \le \bar{k} \le 1350$.
5.28 Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD8 with $m = 397$, $n = 2750$, and $ubnd = \bar{k}$ where $1197 \le \bar{k} \le 2750$.
A.1 An n-dimensional vector x is projected onto the null space of A [denoted N(A)] by the orthogonal projection matrix $P = I - A^T(AA^T)^{-1}A$.

Chapter 1
Introduction

For nearly 40 years, the simplex method was deemed the most efficient algorithm for solving linear programming (LP) problems, a class of problems involving the minimization or maximization of a linear function subject to linear constraints. Although the simplex algorithm can be shown to exhibit an exponential number of iterations in the worst case, on practical problems the number of iterations is usually linear in the size of the problem. Furthermore, the operation count per iteration is also rather low; an $m \times m$ linear system is solved at each iteration, where m is the number of constraints in the primal LP. (The primal LP is composed of a system of m equality constraints in n variables. The dual LP is the "companion" to the primal LP, in which the roles of constraints and variables are reversed; therefore, the dual LP is composed of a system of n constraints in m variables.)

The work of Narendra Karmarkar [15], however, spawned a new class of methods, interior-point methods (IPMs), capable of outperforming the simplex method on large-scale LPs and (in general) producing polynomial complexity results. Background information on simplex and interior-point methods can be found in [5], [20], and [29]. Unlike the simplex method, IPMs solve a $(2n+m) \times (2n+m)$ linear system at each iteration, where most of the computation stems from solving an $m \times m$ system with normal equation matrix $AD^2A^T$ (A is an $m \times n$ constraint matrix and D is a positive diagonal matrix). If $n \gg m$, the operation count per iteration can be quite large. Nonetheless, their low iteration count and speed make IPMs the method of choice on large-scale LPs. Today, much attention is focused on primal-dual IPMs [29]; they stand out for their exceptional theoretical properties and computational behavior. Most primal-dual IPM codes today are based on Mehrotra's Predictor-Corrector method (MPC) [17]. MPC combines the essence of the primal-dual framework with ingenious heuristics.

In this thesis, we devise several "reduced" variants of existing infeasible interior-point algorithms for solving primal-dual LP pairs. (An "infeasible algorithm" refers to an algorithm that accepts an initial solution that does not satisfy all of the constraints of an LP.) Specifically, we examine the incorporation of a column generation scheme into algorithms which, through a sequence of iterations, aim to achieve both feasibility and optimality.
In our context, a column generation approach refers to considering only a subset of the columns of A, or equivalently, a subset of the dual constraints, at each iteration. This reduces the operation count per iteration and, if the subsets are chosen well, we generally observe no increase in the total number of iterations. Our algorithms can outperform other column generation algorithms in total operation count and CPU time.

A vast number of papers have been written on the convergence of infeasible interior-point algorithms. Kojima et al. [16], Potra [22], and Achache et al. [1] devise infeasible algorithms which improve feasibility and find optimal solutions for the primal-dual pair. The algorithms also demonstrate both global convergence and polynomial complexity.

Column generation algorithms within interior-point methods have been analyzed as well. Algorithms for "building up" the number of constraints at each iteration have been developed by Dantzig et al. [6], Ye [31], Hertog et al. [11], and Goffin et al. [9]. Starting with a small subset of dual (working) constraints, an iterative search for the optimal solution is conducted until there is a constraint blocking the progression to this solution. Depending on the algorithm, the violated constraint or a constraint generated by the algorithm is added to the set, and the process repeats until the optimal solution has been found. Ye [32] considers adding more than one constraint to the working set at each iteration. In all of these algorithms, constraints are never removed from the working set in the current iteration. Ye [30], on the other hand, developed a "build-down" scheme in which a constraint is discarded from the full constraint set if it is detected to be nonbinding in the optimal set. Convergence properties for the aforementioned algorithms are provided, along with polynomial complexity analyses for some. Both the "build-up" and "build-down" approaches were combined and analyzed in Hertog et al. [12], where the algorithm was shown to terminate in polynomial time. More recently, a constraint reduction approach was developed in a primal-dual framework by Tits-Absil-Woessner [24]. (Here we work with the primal and dual LPs simultaneously, as opposed to working with them individually; the primal-dual framework involves solving the optimality conditions to satisfy both the primal and dual LPs.) The authors consider replacing the normal equation matrix for the primal-dual LP with a "reduced" matrix $A_Q D_Q^2 A_Q^T$ by solving a subset of $m \le |Q| \le n$ dual constraints at each iteration, where Q is a fixed index set associated with the most "promising" dual constraints at the current iteration. Global convergence and local quadratic convergence are proved for their "reduced" affine-scaling algorithm.

The two algorithms we develop in this thesis are redPDAS and redMPC. The redPDAS algorithm is an extension of the Tits-Absil-Woessner algorithm [24] to handle dual infeasible iterates, and the redMPC algorithm is a "reduced" version of Mehrotra's Predictor-Corrector method. We allow both primal and dual infeasibility, as opposed to Tits-Absil-Woessner, who allow only primal infeasibility in the PDAS case.

The remainder of the thesis focuses on the two algorithms, their convergence, and their performance. The column generation scheme used within our algorithms is presented in Chapter 2.
In the beginning of Chapter 3, a general discussion of affine-scaling methods is given. The remainder of that chapter includes a description of the reduced primal-dual affine-scaling (redPDAS) algorithm, along with some convergence results. We also show that, with certain modifications, the algorithm is globally and locally convergent. A general discussion of Mehrotra's Predictor-Corrector method is given at the beginning of Chapter 4. The reduced Mehrotra Predictor-Corrector (redMPC) algorithm is presented afterward, along with some convergence results. With modifications, the redMPC algorithm is also shown to be globally and locally convergent. Numerical experiments are presented for the redPDAS and redMPC algorithms in Chapter 5, as well as a discussion of their results. Chapter 6 provides the conclusion to the dissertation and directions for future work.

Chapter 2
Background

Interior-point methods (IPMs) for a linear program (LP) generate points which lie in the interior of the region defined by the constraints of the problem. A special class of IPMs, primal-dual methods, exhibit exceptional theoretical properties and computational performance. This dissertation examines the effects of incorporating a column generation scheme into primal-dual IPM algorithms. To understand the algorithm framework in the following chapters, we present some background information in this chapter.

The chapter is divided into two sections: (1) the optimality conditions for a primal-dual LP and (2) column generation. In Section 2.1, we begin with the standard notation used to express the primal and dual LPs and their optimality conditions. Next, we show how a primal-dual IPM algorithm finds a solution to these optimality conditions. In Section 2.2, column generation is discussed in the context of our algorithm framework. We introduce the notation associated with our algorithms and explain how the index set Q associated with the reduced (dual) constraint set is formulated. Finally, we explain how to preserve a strictly interior point (or solution) within our algorithms.

2.1 Optimality Conditions for the Primal-Dual LP

Consider the primal LP in standard form,

$$\min_x\ \{c^Tx : Ax = b,\ x \ge 0\}, \qquad (2.1)$$

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$ are the known coefficient matrix and vectors, respectively, and $x \in \mathbb{R}^n$ is a vector of unknown variables. All vectors in this thesis are column vectors. It is assumed that A has full row rank, $b, c \neq 0$, and $n \gg m$. Solving the primal LP is equivalent to solving the dual LP,

$$\max_y\ \{b^Ty : A^Ty \le c\}.$$

An alternative expression of the dual problem results by incorporating a nonnegative slack vector s into the dual constraints:

$$\max_{y,s}\ \{b^Ty : A^Ty + s = c,\ s \ge 0\}. \qquad (2.2)$$

This last expression is often preferred due to its ease of implementation in IPMs. Together, problems (2.1) and (2.2) are referred to as the primal-dual pair.

The Karush-Kuhn-Tucker (KKT) [3] or optimality conditions for the primal-dual pair (2.1) and (2.2) can be viewed as a mapping F from $\mathbb{R}^{2n+m}$ to $\mathbb{R}^{2n+m}$ such that

$$F(x,y,s) = \begin{bmatrix} Ax - b \\ A^Ty + s - c \\ XSe \end{bmatrix} = 0, \qquad (x,s) \ge 0, \qquad (2.3)$$

where $X, S \in \mathbb{R}^{n \times n}$ are diagonal matrices constructed from the vectors x and s, respectively, and $e \in \mathbb{R}^n$ is the vector of all ones [29]. Any vector x that satisfies the conditions $Ax - b = 0$ and $x \ge 0$ is a feasible point for the primal LP. A dual feasible point is any $(y,s)$ satisfying the conditions $A^Ty + s - c = 0$ and $s \ge 0$. Thus the KKT conditions say that any point that is both primal feasible and dual feasible and satisfies $XSe = 0$ is optimal for both problems.
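As a concrete illustration of these conditions, the following is a minimal NumPy sketch (ours, not part of the original development; the helper name kkt_residual is hypothetical) that evaluates the residual map F of (2.3) at a given point:

--------------------------------------------------------------
import numpy as np

def kkt_residual(A, b, c, x, y, s):
    """Return F(x,y,s) = (Ax - b, A^T y + s - c, XSe) stacked as one vector."""
    r_p = A @ x - b          # primal residual
    r_d = A.T @ y + s - c    # dual residual
    comp = x * s             # XSe: componentwise complementarity products
    return np.concatenate([r_p, r_d, comp])

# (x, y, s) is optimal when this residual is zero and (x, s) >= 0.
--------------------------------------------------------------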
According to duality theory (the theory that explains the relationship between the primal and dual LPs, discussed, for example, in [20], [4]), if x and $(y,s)$ are feasible for their respective problems, then $b^Ty \le c^Tx$. That is, the dual objective function value is a lower bound for the primal objective function value. The difference between the primal and dual objective functions, $|c^Tx - b^Ty|$, is known as the duality gap. At optimality, $x^*$ solves (2.1), $(y^*,s^*)$ solves (2.2), and the duality gap is zero (i.e., $b^Ty^* = c^Tx^*$). Furthermore, the condition $X^*S^*e = 0$ implies that at least one of the pair $x^*_j$ or $s^*_j$ must be zero for all $j = 1,2,\ldots,n$. Since the nonzero components of x and s occur in complementary positions, the condition $XSe = 0$ in (2.3) is referred to as the complementary slackness condition.

In this dissertation, we focus on a class of primal-dual interior-point methods which find solutions to (2.1) and (2.2) by applying variants of Newton's method to (2.3). Let $z = (x,y,s)$ represent the current approximation to the solution and assume that it satisfies the nonnegativity conditions $x > 0$ and $s > 0$. If z is not optimal, Newton's method forms a linear model tangent to F at z and finds a search direction $\Delta z = (\Delta x, \Delta y, \Delta s) \in \mathbb{R}^{2n+m}$ by solving

$$J(A,x,s)\,\Delta z = -F(x,y,s),$$

where J is the Jacobian of F. This gives the linear system

$$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}\begin{bmatrix} \Delta x \\ \Delta y \\ \Delta s \end{bmatrix} = -\begin{bmatrix} r_p \\ r_d \\ XSe \end{bmatrix}, \qquad (2.4)$$

where $r_p = Ax - b$ and $r_d = A^Ty + s - c$ are the primal and dual residuals, respectively. By eliminating $\Delta s$ from the linear system, we obtain the following equivalent system (also known as the "augmented system"):

$$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = -\begin{bmatrix} r_p \\ -Xr_d + XSe \end{bmatrix},$$

with $\Delta s = -r_d - A^T\Delta y$. By further eliminating $\Delta x$, we have the set of equations that form a major component of primal-dual IPM algorithms:

$$AD^2A^T\Delta y = -r_p - AD^2r_d + Ax,$$
$$\Delta s = -r_d - A^T\Delta y, \qquad (2.5)$$
$$\Delta x = -x - D^2\Delta s,$$

where $D^2 = S^{-1}X$. The equation $AD^2A^T\Delta y = -r_p - AD^2r_d + Ax$ can be equivalently written as $AD^2A^T\Delta y = b - AD^2r_d$, since $r_p = Ax - b$. This system of equations is known as the "normal equations" because they are the normal equations for a linear least squares problem with coefficient matrix $DA^T$. Since it is assumed that A has full row rank and D is a positive diagonal matrix, the coefficient matrix $AD^2A^T$ is symmetric and positive definite. As a result, the coefficient matrix can be factored using the Cholesky factorization as $AD^2A^T = LL^T$, where L is lower triangular [20], and this makes computing solutions of the normal equations easy.

Once the search direction $\Delta z$ is computed, the updated solution $z^+$ is given by $z^+ = z + \alpha\Delta z$, where $\alpha \in (0,1]$ is a parameter designed to prevent $z^+$ from violating the condition $(x^+,s^+) > 0$. The Newton iteration continues until the duality gap falls below a small defined tolerance. This is the basis for primal-dual IPMs, but there are many variants, two of which we discuss in Chapters 3 and 4.

Another way to interpret finding a solution of (2.4) is that we calculate $\Delta z$ to satisfy the primal constraints,

$$A(x + \Delta x) = b, \quad\text{i.e.,}\quad A\Delta x = b - Ax, \qquad (2.6)$$

the dual constraints,

$$A^T(y + \Delta y) + (s + \Delta s) = c, \quad\text{i.e.,}\quad A^T\Delta y + \Delta s = c - A^Ty - s, \qquad (2.7)$$

and the complementary slackness condition,

$$(x_j + \Delta x_j)(s_j + \Delta s_j) = 0,\ \forall j, \quad\text{i.e.,}\quad x_js_j + x_j\Delta s_j + s_j\Delta x_j + \Delta x_j\Delta s_j = 0,\ \forall j,$$
$$\text{or}\quad X\Delta s + S\Delta x = -XSe - \Delta X\Delta Se. \qquad (2.8)$$

Equations (2.6) and (2.7) are precisely the first two blocks of equations in the linear system (2.4). The equations in (2.8) differ from the last block of equations in (2.4) by a nonlinear term, $-\Delta X\Delta Se$. To obtain an approximate solution for $\Delta z$, we ignore the nonlinear term; this is permitted since $\Delta X$ and $\Delta S$ are small in the limit and their product is even smaller. The result is applying Newton's method to (2.3) and solving the linear system (2.4).
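A minimal sketch of the normal-equations solve (2.5), assuming dense data and that SciPy is available (the routine name newton_step is ours), follows; note that the Cholesky factorization of $AD^2A^T$ does exactly the work described above:

--------------------------------------------------------------
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def newton_step(A, b, c, x, y, s):
    r_d = A.T @ y + s - c           # dual residual
    d2 = x / s                      # diagonal of D^2 = S^{-1}X
    M = (A * d2) @ A.T              # normal matrix AD^2A^T: Theta(m^2 n) to form
    dy = cho_solve(cho_factor(M), b - A @ (d2 * r_d))  # Cholesky solve
    ds = -r_d - A.T @ dy
    dx = -x - d2 * ds
    return dx, dy, ds
--------------------------------------------------------------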
2.2 The Column Generation Heuristic

2.2.1 Overview

The bulk of the computation in solving for the search direction $\Delta z$ stems from the $m \times m$ matrix $AD^2A^T$. For dense matrices, it takes $\Theta(m^2n)$ operations to form the matrix and another $\Theta(m^3/3)$ to factor it. On large-scale LPs where $n \gg m$, the incorporation of a column generation scheme into the primal-dual framework can reduce this computational effort.

Originally devised by Gilmore et al. [8] in the early 1960's, column generation is a technique used to solve an LP by generating the columns of A as needed. This is a beneficial tool because most of the columns of A have no effect on the optimal solution. From duality theory, assuming nondegeneracy, there are exactly m zero components of s corresponding to exactly m positive components of x at the optimal solution (see [4], [5], or [20], for example). If $x^*$ solves the primal LP and $(y^*,s^*)$ solves the dual LP, we define $Q^*$ to be the set of indices associated with those m positive components of $x^*$ (or m zero components of $s^*$). Since the remaining $n-m$ components of $x^*$ are zero, the "reduced" primal LP formed by just the columns associated with the index set $Q^*$ has the same solution as the standard primal LP. Furthermore, since the columns of A are simply the rows of $A^T$, we can incorporate a column generation scheme into our primal-dual framework by working with just a subset of the dual constraints at each iteration. We try to choose this subset Q so that $Q^* \subseteq Q$.

Our strategy is as follows: given any point $(x,s) > 0$ and any integer k with $m \le k \le n$, we define the set $Q \subseteq N = \{1,2,\ldots,n\}$ to be the indices j of the dual constraints $a_j^Ty \le c_j$ associated with the k largest $x_j/s_j$ ratios. We define $A_Q \in \mathbb{R}^{m \times k}$ to be the submatrix of A comprised of the columns $a_j$ with $j \in Q$. The vectors $(s_Q, x_Q) \in \mathbb{R}^k$ are comprised of the components $s_j$ and $x_j$, respectively, with $j \in Q$. If k is relatively small compared to n, it takes only $\Theta(m^2k)$ operations to form the matrix $A_QS_Q^{-1}X_QA_Q^T$. If we can use this matrix in place of $AS^{-1}XA^T$, our algorithm will significantly reduce the amount of work per iteration and possibly take only a fraction of the time to solve a particular LP problem.

We let $\hat{\Delta}z = (\hat{\Delta}x, \hat{\Delta}y, \hat{\Delta}s) \in \mathbb{R}^{2n+m}$ be the approximate Newton direction at z computed by using $A_QS_Q^{-1}X_QA_Q^T$ in place of $AS^{-1}XA^T$ in (2.5). The components of the direction associated with the neglected constraints are set to zero. The directions $\Delta z_Q = (\Delta x_Q, \Delta y, \Delta s_Q)$ and $\hat{\Delta}z_Q = (\hat{\Delta}x_Q, \hat{\Delta}y, \hat{\Delta}s_Q)$ have dimension $2k+m$; these vectors are obtained by deleting the components $\Delta x_j$, $\Delta s_j$, $\hat{\Delta}x_j$, and $\hat{\Delta}s_j$ with $j \notin Q$.

Our goal is to prove that applying a column generation heuristic to primal-dual IPM algorithms for solving LPs provides excellent theoretical properties and computational results. In the following three subsections, we discuss the main ingredients of the reduced algorithms: the initial point, the choice of Q, and the update strategy.
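Before turning to those ingredients, the following minimal sketch (helper name ours) makes the cost reduction concrete: given the k largest $x_j/s_j$ ratios, forming $A_QD_Q^2A_Q^T$ touches only k columns of A, hence the $\Theta(m^2k)$ count claimed above:

--------------------------------------------------------------
import numpy as np

def reduced_normal_matrix(A, x, s, k):
    ratios = x / s
    Q = np.argsort(ratios)[::-1][:k]   # indices of the k largest x_j/s_j ratios
    AQ = A[:, Q]                       # m-by-k submatrix A_Q
    MQ = (AQ * ratios[Q]) @ AQ.T       # A_Q D_Q^2 A_Q^T: Theta(m^2 k) to form
    return MQ, Q
--------------------------------------------------------------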
2.2.2 Initial Point

There are a number of techniques for generating initial points (see [11], [14], or [17], for example). Our column generation approach requires $(x,s) > 0$ at each iteration. To achieve this, we use the strategy of Mehrotra [17] and describe his technique here. Assume the columns of A are linearly independent and $b, c \neq 0$, and set

$$\hat{x} = A^T(AA^T)^{-1}b, \qquad \hat{y} = (AA^T)^{-1}Ac, \qquad \hat{s} = c - A^T\hat{y}.$$

The point $(\hat{x}, \hat{y}, \hat{s})$ satisfies the primal and dual equality constraints. However, in order to satisfy the condition $(x,s) > 0$, Mehrotra defines $\delta_x = \max(-1.5\min_j\{\hat{x}_j\}, 0)$ and $\delta_s = \max(-1.5\min_j\{\hat{s}_j\}, 0)$ and then defines

$$\hat{\delta}_x = \delta_x + 0.5\left[\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{s}_j + \delta_s)}\right], \qquad \hat{\delta}_s = \delta_s + 0.5\left[\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{x}_j + \delta_x)}\right],$$

where e is the vector of ones. Then, if $x^0 = \hat{x} + \hat{\delta}_xe$, $y^0 = \hat{y}$, and $s^0 = \hat{s} + \hat{\delta}_se$, we have an initial point

$$z^0 = (x^0, y^0, s^0) \qquad (2.9)$$

such that $(x^0, s^0) > 0$. The equality constraints are not necessarily satisfied, so $z^0$ might be infeasible.

2.2.3 Selection of Q

The choice of Q is based on the $k \in [m,n]$ largest $x_j/s_j$ ratios at each iteration. Assuming nondegeneracy, there will be exactly m components of s converging to zero as the optimal solution is approached. Since the corresponding m components of x are not converging to zero, there will be exactly m $x_j/s_j$ ratios tending to infinity as we tend to the optimal solution. Meanwhile, the remaining $n-m$ components of x converge to zero while the corresponding $n-m$ components of s do not; as a result, the remaining $n-m$ ratios $x_j/s_j$ tend to zero as the optimal solution is approached. Since Q is the index set associated with the $k \in [m,n]$ largest $x_j/s_j$ ratios at each iteration, when we are close enough to the solution we are guaranteed to find an optimal solution to the "reduced" dual LP based solely on the rows of $A^T$ associated with Q. This implies that we can solve the normal equations with coefficient matrix $A_QD_Q^2A_Q^T$. We describe our selection of Q below. Let C satisfy $0 < C < 1$, and let lbnd and ubnd be lower and upper bounds on the size of Q.

--------------------------------------------------------------
Selection of Q
1. Sort the ratios $x_j/s_j$ from largest to smallest, and let $j_\ell$ denote the index of the $\ell$th largest ratio;
2. $Q = \{j_\ell : x_{j_\ell}/s_{j_\ell} \ge C \cdot (x_{j_1}/s_{j_1}),\ \ell = 1,2,\ldots,n\}$; $k = |Q|$;
3. Modify Q if necessary so that $lbnd \le k \le ubnd$:
     if $k \ge ubnd$
        $Q = \{j_\ell : \ell = 1,2,\ldots,ubnd\}$;
     elseif $k \le lbnd$
        $Q = \{j_\ell : \ell = 1,2,\ldots,lbnd\}$;
     end(if)
     $k = |Q|$;
--------------------------------------------------------------

In steps 1 and 2 of our algorithm, the ratios $x_j/s_j$ are ordered from largest to smallest and selected if they exceed the product of C and the value of the largest ratio at the current iterate. The indices associated with the selected ratios make up the set Q. The quantity C is an experimental constant chosen based on algorithm performance (see Sections 5.2 and 5.3 for details). Step 3 guarantees that Q contains at least 3m indices at each iteration (lbnd = 3m); this is a critical component of the convergence analysis in the following chapters. Computationally, by ensuring Q is of an "appropriate" size relative to m and n, we can reduce the risk of instabilities in the performance of the algorithm. Since $n \gg m$, a value of k close to m is likely to cause the algorithm to cycle through an extremely large number of constraint choices before reaching the optimal solution. A value of k close to n in early iterations is acceptable, since steps 2 and 3 of our subroutine will significantly reduce k as the optimal solution is approached. This poses the question, "For which range of ratios k/n, where $k \in (m,n)$, are the 'reduced' algorithms most efficient for solving LPs?" This and other topics will be discussed in Chapter 5.
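A minimal NumPy sketch of this three-step subroutine (the function name select_Q is ours; as noted above, lbnd = 3m in our algorithms, and we assume $lbnd \le ubnd \le n$):

--------------------------------------------------------------
import numpy as np

def select_Q(x, s, C, lbnd, ubnd):
    ratios = x / s
    order = np.argsort(ratios)[::-1]                 # step 1: j_1, j_2, ... by decreasing ratio
    k = int(np.sum(ratios >= C * ratios[order[0]]))  # step 2: ratios above C times the largest
    k = min(max(k, lbnd), ubnd)                      # step 3: enforce lbnd <= k <= ubnd
    return order[:k]                                 # the index set Q, with |Q| = k
--------------------------------------------------------------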
algorithms for the primal and dual variables as well as the dual residuals. In a general primal-dual IPM algorithm, the primal and dual variables are updated by the equations: x+ = x+?p?x, y+ = y+?d?y, (2.10) s+ = s+?d?s where ?p and ?d are the step lengths between 0 and 1 along the primal and dual 15 search directions, respectively. The dual residuals are updated as follows: r+d = ATy+ +s+ ?c = AT parenleftbigy+?d?yparenrightbig+parenleftbigs+?d?sparenrightbig?c = parenleftbigATy+s?cparenrightbig+?dparenleftbigAT?y+ ?sparenrightbig = rd +?dparenleftbigAT?y?rd ?AT?yparenrightbig = (1??d)rd. We used the relations rd = ATy+s?c and ?s= ?rd?AT?y in the fourth line. In the ?reduced? algorithms, we set ?sj = 0, ?j /?Q. By setting r+d =ATy+ +s+?c, we have (rd)+j = ? ??? ??? (1??d)(rd)j if j ?Q (rd)j +?daTj ?y if j /?Q , where aj is the jth column of A. Since (x0,s0) > 0, it can be shown that the components of the initial dual residual r0d are strictly positive and have the same value. Therefore, 0 < (1??d)(rd)0j < (rd)0j, ?j ?Q, since ?d ? (0,1). If an index j remains in Q for every iteration, then we have 0 < (1??d)(rd)ij < (rd)ij, for j ?Q, ?i. (2.11) In this case, the components of the dual residual in Q at every iteration remain strictly positive and decrease to zero as the solution is approached. If at any iter- ation, the jth component of residual is not associated with Q, it will be updated 16 as (rd)+j = (rd)j +?daTj ?y. Here the jth component of the dual residual does not monotonically decrease to zero. This could potentially cause the ?reduced? algo- rithms to fail to satisfy dual feasibility. We resolve this issue by incorporating a new update strategy into our algorithms. In the case where |Q| = n, we follow the update strategy for a general primal-dual IPM algorithm. Otherwise, the update strategy is described below: We define x+ = x +?p?x and (y+ = y +?d?y,s+ = s+?d?s) to be the updates on the primal and dual variables, respectively. Since one of our goals is to satisfy dual feasibility (ATy+ ?c ? 0), we define ? = ATy+ ?c and examine its maximum component, denoted ?lscript, to determine if we are making progress to this goal. In addition, we define a set ?Q to represent the index set of dual constraints that are added to Q when it is determined we are not making progress towards satisfying dual feasibility. The set ?Q is initially empty with ?Q?Q = ?. After each iteration, we have the following possible outcomes: Case 1: Suppose lscript?Q. The quantity ?lscript +s+lscript is exactly the value of the lscriptth component of the updated dual residual, (rd)+lscript . Sincethis component of the residual is givenby (rd)+lscript = (1??d)(rd)lscript, it remainsstrictly positive and is smaller invalue than (rd)lscript. Therefore, we can make progress toward satisfying dual feasibility by setting each component of the dual residual to this value [i.e. r+d = (?lscript+s+lscript )ewhere e?Rfracturn is the vector of all ones]. To ensure every component of the slack vector remains strictly positive (as a result of this change), we define a new update on the slack vector as s++ = r+d ??. Although the primal vector x does not affect the progress toward achieving dual feasibility, 17 it was observed through numerical testing that s++ converged to its optimal value much faster than x+. Since the stopping criterion to our algorithms is based on the duality gap falling below a small tolerance, our algorithms failed to converge in reasonable time. 
2.3 Summary

In this chapter, we stated the optimality conditions for a primal-dual LP, explained how to solve these conditions, and presented the column generation heuristic used in our "reduced" algorithms. In the next two chapters, we will focus on two "reduced" algorithms, redPDAS and redMPC, and state the properties that guarantee their convergence.

Chapter 3
A Reduced Primal-Dual Affine-Scaling (redPDAS) Algorithm

3.1 Affine-Scaling Methods

3.1.1 Overview

Primal-dual affine-scaling methods transform a linear program to an equivalent problem via an affine scaling transformation. The transformation repositions the current point to one well inside the boundary of the feasible region determined by the constraints $x_js_j > 0$, to prevent the progression to the optimal solution from becoming arbitrarily slow. Once the current point is transformed to one close to the "center" of the feasible region, significant progress can be made toward the optimal solution by moving along the search direction (see [19], [20], and [26]).

The primal, dual, and primal-dual affine-scaling methods are explained in this section. In all three cases, the problem is scaled so that the current point (or approximation to the solution) becomes the point e (the vector of all ones), since e is equidistant from the bounds of the region defined by the nonnegativity constraints. A projected steepest-descent (ascent) direction (see Appendix A) is computed to simultaneously decrease (increase) the objective function value and satisfy feasibility. The updated approximation is formed by taking a step from e in the search direction. This point is then transformed back into its original coordinates.

In this chapter, we examine primal-dual affine-scaling algorithms. First we discuss primal and dual variants. Then a general primal-dual affine-scaling (PDAS) algorithm is given in Section 3.2, and a "reduced" version called redPDAS is presented in Section 3.3.
Preliminary results for the convergence of the redPDAS algorithm are provided in Section 3.4. Global and local convergence results follow with specific modifications to the algorithm. Finally, the chapter is summarized in Section 3.5.

3.1.2 The Primal and Dual Affine-Scaling Methods

Consider the linear scaling transformation $\phi_D : \mathbb{R}^n \to \mathbb{R}^n$, where $\phi_D(x) = D^{-1}x$ with positive diagonal scaling matrix D. Here the primal variables x are transformed to $\xi = D^{-1}x$; thus x can be expressed as $D\xi$. The transformation leads to a new primal problem:

$$\min_x\ c^Tx \ \text{ s.t. } Ax = b,\ x \ge 0 \qquad\longmapsto\qquad \min_\xi\ (Dc)^T\xi \ \text{ s.t. } AD\xi = b,\ \xi \ge 0.$$

Assuming primal feasibility ($Ax = b$, $x \ge 0$), we compute a search direction $\Delta\xi$ in the transformed space: a projected steepest-descent direction that maintains feasibility of the primal equality constraints. The steepest-descent direction is given by the negative gradient of the objective function, $-Dc$. Suppose $\xi^+ = \xi + \Delta\xi$, where the superscript + denotes the next iterate. To maintain feasibility, we must satisfy $AD(\xi + \Delta\xi) = b$, or equivalently $AD\Delta\xi = 0$; that is, $\Delta\xi$ must lie in the null space of AD. We define

$$P_{AD} = I - DA^T(AD^2A^T)^{-1}AD \qquad (3.1)$$

to be an orthogonal projection matrix (see Appendix A) for AD. Then we want $\Delta\xi$ to be the projection of the steepest-descent direction onto the null space of AD. The expression for $\Delta\xi$ is given by

$$\Delta\xi = -P_{AD}Dc. \qquad (3.2)$$

Transforming $\xi^+$ back into the original coordinate system, we have

$$x^+ = D\xi^+ = D(\xi + \Delta\xi) = x + D\Delta\xi.$$

The difference between the new iterate and the current iterate defines the search direction in the original coordinate system; we will denote this direction $\Delta x$. We can write $\Delta x = \phi_D^{-1}(\Delta\xi)$ as

$$\Delta x = D\Delta\xi = -DP_{AD}Dc = [-D^2 + D^2A^T(AD^2A^T)^{-1}AD^2]\,c. \qquad (3.3)$$

The second and third equalities follow from (3.2) and (3.1), respectively. To make $\xi = e$, we set $D \equiv X$. This gives the direction $\Delta x$ generated by the primal affine-scaling algorithm.
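A minimal NumPy sketch of this direction, i.e., (3.3) with $D = X$ (the function name is ours):

--------------------------------------------------------------
import numpy as np

def primal_affine_direction(A, c, x):
    d2 = x * x                                          # diagonal of D^2 with D = X
    w = np.linalg.solve((A * d2) @ A.T, A @ (d2 * c))   # (AD^2A^T)^{-1} AD^2 c
    return -d2 * c + d2 * (A.T @ w)                     # [-D^2 + D^2 A^T (AD^2A^T)^{-1} A D^2] c
--------------------------------------------------------------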
There is a similar affine-scaling algorithm for the dual problem. Suppose we have a dual feasible point $(y,s)$: $A^Ty + s = c$, $s \ge 0$. Consider the linear scaling transformation that maps the dual variables s to $\omega = Ds$, where D is positive and diagonal. The variables y are not transformed, since they are unrestricted in value. Then s can be written as $D^{-1}\omega$. Assuming A has full row rank, we can write the transformed problem exclusively in terms of the new variables $\omega$ by solving the dual equality constraints $A^Ty + s = c$ for y and substituting this expression into the problem. The expression for y is obtained by first premultiplying the dual equality constraints by A:

$$A^Ty + s = c, \qquad AA^Ty + As = Ac, \qquad y = (AA^T)^{-1}(Ac - As).$$

The last step follows since A has full row rank. Substituting the expression for y into the objective function of the dual problem gives

$$b^Ty = b^T[(AA^T)^{-1}(Ac - As)] = b^T[(AA^T)^{-1}A(c - D^{-1}\omega)] = b^T(AA^T)^{-1}Ac - b^T(AA^T)^{-1}AD^{-1}\omega,$$

where the second equality uses the relation $s = D^{-1}\omega$. Since we are interested in maximizing the objective function $b^Ty$ of the dual problem over y, this is the same as minimizing $b^T(AA^T)^{-1}AD^{-1}\omega$ with respect to $\omega$; the term $b^T(AA^T)^{-1}Ac$ is constant, so it can be ignored. The dual equality constraints can be expressed as

$$c = A^Ty + s = A^T(AA^T)^{-1}(Ac - As) + s = A^T(AA^T)^{-1}(Ac - AD^{-1}\omega) + D^{-1}\omega.$$

Collecting terms, we have

$$(I - A^T(AA^T)^{-1}A)D^{-1}\omega = (I - A^T(AA^T)^{-1}A)c, \quad\text{or}\quad P_AD^{-1}\omega = P_Ac,$$

where

$$P_A = I - A^T(AA^T)^{-1}A \qquad (3.4)$$

is the orthogonal projection matrix for A. The transformed dual problem becomes:

$$\max_y\ b^Ty \ \text{ s.t. } A^Ty + s = c,\ s \ge 0 \qquad\longmapsto\qquad \min_\omega\ (D^{-1}A^T(AA^T)^{-1}b)^T\omega \ \text{ s.t. } (P_AD^{-1})\omega = P_Ac,\ \omega \ge 0.$$

The steepest-descent direction in the transformed space is given by

$$-D^{-1}A^T(AA^T)^{-1}b. \qquad (3.5)$$

However, to maintain feasibility we must satisfy $P_AD^{-1}(\omega + \Delta\omega) = P_Ac$, or equivalently $P_AD^{-1}\Delta\omega = 0$. By premultiplying the steepest-descent direction in (3.5) by the orthogonal projection matrix $I - P_{AD} = DA^T(AD^2A^T)^{-1}AD$, where $P_{AD}$ is defined in (3.1), we have

$$\Delta\omega = -(I - P_{AD})D^{-1}A^T(AA^T)^{-1}b = -DA^T(AD^2A^T)^{-1}ADD^{-1}A^T(AA^T)^{-1}b = -DA^T(AD^2A^T)^{-1}b.$$

It follows that

$$P_AD^{-1}\Delta\omega = (I - A^T(AA^T)^{-1}A)D^{-1}(-DA^T(AD^2A^T)^{-1}b) = -A^T(AD^2A^T)^{-1}b + A^T(AA^T)^{-1}AA^T(AD^2A^T)^{-1}b = 0.$$

If the next iterate is $\omega^+ = \omega + \Delta\omega$, we have $s^+ = D^{-1}\omega^+ = s + D^{-1}\Delta\omega$ and

$$\Delta s = -A^T(AD^2A^T)^{-1}b. \qquad (3.6)$$

We determine $\Delta y$ to maintain dual feasibility. That is, we must satisfy

$$A^Ty^+ + s^+ = c, \qquad A^T(y^+ - y) + (s^+ - s) = 0, \qquad A^T\Delta y + \Delta s = 0, \qquad (3.7)$$

where the last equation follows from the fact that $A^Ty + s = c$. Premultiplying (3.7) by A and solving for $\Delta y$, we find

$$\Delta y = -(AA^T)^{-1}A\Delta s = (AD^2A^T)^{-1}b. \qquad (3.8)$$

When $D \equiv S^{-1}$, $(\Delta y, \Delta s)$ is the direction generated by the dual affine-scaling algorithm.

3.1.3 The Primal-Dual Affine-Scaling Method

In a primal-dual setting, the scaling matrix is $D \equiv (S^{-1}X)^{1/2}$. The point $XSe$ is repositioned with respect to both nonnegativity constraints, $x > 0$ and $s > 0$. Assuming primal and dual feasibility, $\Delta x$ from (3.3) and $(\Delta y, \Delta s)$ from (3.6) and (3.8) with $D \equiv (S^{-1}X)^{1/2}$ are given by

$$\Delta x = [-S^{-1}X + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}X]\,c$$
$$= [-S^{-1}X + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}X](A^Ty + s)$$
$$= -S^{-1}XA^Ty - S^{-1}Xs + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}XA^Ty + S^{-1}XA^T(AS^{-1}XA^T)^{-1}AS^{-1}Xs$$
$$= -S^{-1}XA^Ty - x + S^{-1}XA^Ty + S^{-1}XA^T(AS^{-1}XA^T)^{-1}Ax$$
$$= -x + S^{-1}XA^T(AS^{-1}XA^T)^{-1}Ax$$
$$= -x + S^{-1}XA^T(AS^{-1}XA^T)^{-1}b,$$

and

$$\Delta y = (AS^{-1}XA^T)^{-1}b, \qquad \Delta s = -A^T(AS^{-1}XA^T)^{-1}b.$$

Simplifying the expressions for the directions, we have

$$\Delta y = (AS^{-1}XA^T)^{-1}b, \qquad \Delta s = -A^T\Delta y, \qquad \Delta x = -x - S^{-1}X\Delta s.$$

The primal-dual affine-scaling direction is exactly the standard Newton step for solving the optimality conditions for the primal or dual problem:
Step lengths ??p and ??d are chosen as the largest 29 value (computed by a ratio test) to satisfy x + ??p?xa ? 0 and s + ??d?sa ? 0. To ensure that the step length does not exceed 1, we set ??p = min{??p,1} and ??d = minbraceleftbig??d,1bracerightbig. To ensurex+ > 0 ands+ > 0, the steplengths are then multiplied by a fixed positive number ? where 0 < ? < 1. Therefore, x and s are updated by computing x+ = x + ?p?xa > 0 and s+ = s + ?d?sa > 0 where parenleftbig?p,?dparenrightbig = ?parenleftbigmin{??p,1},minbraceleftbig??d,1bracerightbigparenrightbig. The dual variables y are unrestricted in value and updated by y+ =y+?d?ya. Since the expressions for the affine-scaling direction depend on the dual resid- uals, the dual residuals must be updated within the algorithms as well. The update for the dual residuals can be written as r+d = ATy+ +s+ ?c = AT(y+?d?ya) + (s+?d?sa)?c = (ATy+s?c) +?d(AT?ya + ?sa) = (1??d)rd. Observe that the expressions for the affine-scaling direction do not depend on the primal residuals. As a result, the primal residuals are not incorporated into the algorithms. The algorithms terminate once x and y are feasible for their respective prob- lems and the duality gap |cTx?bTy| falls below a small tolerance, ?. 30 3.2 A Primal-Dual Affine-Scaling (PDAS) Algorithm In this section, we summarize a general PDAS algorithm (see Monteiro et al. [19]) ???????????????- A General PDAS Algorithm Input: (x,y,s) with x>0 and s>0, 0 ? Compute the affine-scaling direction: ?ya = parenleftbigAS?1XATparenrightbig?1bracketleftbigb?AS?1Xrdbracketrightbig, ?sa = ?rd ?AT?ya, ?xa = ?x?S?1X?sa. Compute the affine step: ??p = ? ??? ??? 1 if ?xaj ? 0, ?j min?xaj<0bracketleftbig?xj/?xajbracketrightbig otherwise , ??d = ?? ?? ??? 1 if ?saj ? 0, ?j min?saj <0bracketleftbig?sj/?sajbracketrightbig otherwise , ?p = ? min(??p,1); ?d =? minparenleftbig??d,1parenrightbig. 31 Update the solution: x+ = x+?p?xa, y+ = y+?d?ya, s+ = s+?d?sa. Update the residuals: r+d = parenleftbig1??dparenrightbigrd. end(while) ???????????????- 3.3 The redPDAS Algorithm The general PDAS algorithm in section 3.2 considers the entire LP data set (A,b,c) at every iteration. In this section, we present a reduced PDAS algorithm, redPDAS. The vectors ??xa ? Rfracturn?1 and (??ya,??sa) ? Rfractur(m+n)?1 define the affine- scaling direction in the redPDAS algorithm. We refer to ??za = (??xa,??ya,??sa) as the approximate affine-scaling direction. The vectors ??xaQ ? Rfracturk?1 and ??saQ ? Rfracturk?1 are composed of the components ??xaj and ??saj, respectively, with j ? Q. Let ?Q represent the index set of dual constraints that are added to Q when it is determined the algorithm is not making progress towards satisfying dual feasibility. To differentiate the scalar quantities ?p and ?d from the general PDAS algorithm, we use the subscript Q to emphasize their computation within this algorithm. ???????????????- 32 The redPDAS Algorithm Input: (x,y,s) with x> 0 and s> 0, ubnd ? 3m, 0 ? Select the most promising dual constraints 3 Q = Q? ?Q, k = |Q|. Compute the affine-scaling direction: ??ya = parenleftbigAQS?1Q XQATQparenrightbig?1bracketleftbigb?AQS?1Q XQ(rd)Qbracketrightbig, (3.11) ??saQ = ?(rd)Q ?ATQ??ya, (3.12) ??xaQ = ?xQ ?S?1Q XQ??saQ. Compute the affine step: ??pQ = ? ?? ??? 1 if (??xaQ)j ? 0, ?j min(??xaQ)j<0bracketleftbig?(xQ)j/(??xaQ)jbracketrightbig otherwise ,(3.13) ??dQ = ? ??? ??? 1 if (??saQ)j ? 0, ?j min(??saQ)j<0bracketleftbig?(sQ)j/(??saQ)jbracketrightbig otherwise ,(3.14) ?pQ =? 
3.3 The redPDAS Algorithm

The general PDAS algorithm in Section 3.2 considers the entire LP data set $(A,b,c)$ at every iteration. In this section, we present a reduced PDAS algorithm, redPDAS. The vectors $\hat{\Delta}x^a \in \mathbb{R}^n$ and $(\hat{\Delta}y^a, \hat{\Delta}s^a) \in \mathbb{R}^{m+n}$ define the affine-scaling direction in the redPDAS algorithm. We refer to $\hat{\Delta}z^a = (\hat{\Delta}x^a, \hat{\Delta}y^a, \hat{\Delta}s^a)$ as the approximate affine-scaling direction. The vectors $\hat{\Delta}x^a_Q \in \mathbb{R}^k$ and $\hat{\Delta}s^a_Q \in \mathbb{R}^k$ are composed of the components $\hat{\Delta}x^a_j$ and $\hat{\Delta}s^a_j$, respectively, with $j \in Q$. Let $\Delta Q$ represent the index set of dual constraints that are added to Q when it is determined the algorithm is not making progress toward satisfying dual feasibility. To differentiate the scalar quantities $\alpha_p$ and $\alpha_d$ from the general PDAS algorithm, we use the subscript Q to emphasize their computation within this algorithm.

--------------------------------------------------------------
The redPDAS Algorithm

Input: $(x,y,s)$ with $x > 0$ and $s > 0$; $ubnd \ge 3m$; $0 < \theta < 1$; tolerance $\epsilon > 0$.

while the stopping criterion is not satisfied

  Select the most promising dual constraints (Q is determined by the subroutine in Section 2.2.3):
    $Q = Q \cup \Delta Q$, $k = |Q|$.

  Compute the affine-scaling direction:
    $\hat{\Delta}y^a = (A_QS_Q^{-1}X_QA_Q^T)^{-1}[b - A_QS_Q^{-1}X_Q(r_d)_Q]$,  (3.11)
    $\hat{\Delta}s^a_Q = -(r_d)_Q - A_Q^T\hat{\Delta}y^a$,  (3.12)
    $\hat{\Delta}x^a_Q = -x_Q - S_Q^{-1}X_Q\hat{\Delta}s^a_Q$.

  Compute the affine step:
    $\bar{\alpha}_{pQ} = 1$ if $(\hat{\Delta}x^a_Q)_j \ge 0\ \forall j$; $\bar{\alpha}_{pQ} = \min_{(\hat{\Delta}x^a_Q)_j < 0}[-(x_Q)_j/(\hat{\Delta}x^a_Q)_j]$ otherwise;  (3.13)
    $\bar{\alpha}_{dQ} = 1$ if $(\hat{\Delta}s^a_Q)_j \ge 0\ \forall j$; $\bar{\alpha}_{dQ} = \min_{(\hat{\Delta}s^a_Q)_j < 0}[-(s_Q)_j/(\hat{\Delta}s^a_Q)_j]$ otherwise;  (3.14)
    $\alpha_{pQ} = \theta\min(\bar{\alpha}_{pQ}, 1)$, $\alpha_{dQ} = \theta\min(\bar{\alpha}_{dQ}, 1)$.  (3.15)

  Create full-length vectors $\hat{\Delta}x^a$ and $\hat{\Delta}s^a$:
    $\hat{\Delta}x^a_j = (\hat{\Delta}x^a_Q)_j$ if $j \in Q$, and 0 otherwise;  (3.16)
    $\hat{\Delta}s^a_j = (\hat{\Delta}s^a_Q)_j$ if $j \in Q$, and 0 otherwise.  (3.17)

  Update the solution and dual residuals:
    $x^+ = x + \alpha_{pQ}\hat{\Delta}x^a$,  (3.18)
    $y^+ = y + \alpha_{dQ}\hat{\Delta}y^a$,
    $s^+ = s + \alpha_{dQ}\hat{\Delta}s^a$;
    if $k < n$, apply the update strategy of Section 2.2.4; otherwise, update the dual residuals as in the general PDAS algorithm.

end(while)
--------------------------------------------------------------

3.4 Convergence Analysis of redPDAS

The analysis in this section relies on assumptions (A1)-(A4); in particular, $A_Q$ has full row rank for every Q considered (A2), and each column of A satisfies $\|a_j\| \le 1$ (A4). Assuming nondegeneracy, at the optimal solution $z^* = (x^*, y^*, s^*)$ there are exactly m indices in $\{j \in N : x^*_j > 0,\ s^*_j = 0\}$ and $n - m$ indices in $\{j \in N : x^*_j = 0,\ s^*_j > 0\}$.

The first two lemmas show that the components of the dual residual, as well as the vectors x and s, remain strictly positive.

Lemma 3.1: Suppose the initial point $z^0$ is defined as in Section 2.2.2, with $(x^0, s^0) > 0$. Then if $z^0$ is not the optimal solution, $r_d^0 > 0$ and $0 < (r_d)^+_j < (r_d)_j$ for all j.

Proof: Since $s^0 = \hat{s} + \hat{\delta}_se$ and $A^T\hat{y} + \hat{s} - c = 0$, we have $r_d^0 = \hat{\delta}_se$; thus to show $r_d^0 > 0$, we must prove $\hat{\delta}_s > 0$. This can be done by showing (1) $\hat{x} + \delta_xe \ge 0$, (2) $\hat{s} + \delta_se \ge 0$, and (3) $\sum_{j=1}^n (\hat{x}_j + \delta_x) > 0$.

(1) If $\min\{\hat{x}_j\} < 0$, then $\delta_x = \max(-1.5\min\{\hat{x}_j\}, 0) = -1.5\min\{\hat{x}_j\} > 0$. Therefore,

$$\hat{x}_j + \delta_x = \hat{x}_j - 1.5\min\{\hat{x}_j\} \ge \min\{\hat{x}_j\} - 1.5\min\{\hat{x}_j\} = -0.5\min\{\hat{x}_j\} > 0.$$

If $\min\{\hat{x}_j\} \ge 0$, then $\delta_x = \max(-1.5\min\{\hat{x}_j\}, 0) = 0$. Therefore,

$$\hat{x}_j + \delta_x = \hat{x}_j \ge \min\{\hat{x}_j\} \ge 0.$$

This proves $\hat{x} + \delta_xe \ge 0$.

(2) A similar proof shows $\hat{s} + \delta_se \ge 0$.

(3) Since $\hat{x} \neq 0$, there exists at least one component of $\hat{x}$ such that $\hat{x}_j \neq 0$. Therefore, $\sum_{j=1}^n (\hat{x}_j + \delta_x) > 0$.

The previous three results imply that

$$\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{x}_j + \delta_x)} \ge 0.$$

If $\min\{\hat{s}_j\} < 0$, then $\delta_s > 0$ and

$$(r_d)^0_j = \hat{\delta}_s = \delta_s + 0.5\left[\frac{(\hat{x} + \delta_xe)^T(\hat{s} + \delta_se)}{\sum_{j=1}^n (\hat{x}_j + \delta_x)}\right] > 0, \quad \forall j.$$

If $\min\{\hat{s}_j\} \ge 0$, then $\delta_s = 0$ and

$$(r_d)^0_j = \hat{\delta}_s = 0.5\left[\frac{(\hat{x} + \delta_xe)^T\hat{s}}{\sum_{j=1}^n (\hat{x}_j + \delta_x)}\right] > 0, \quad \forall j,$$

since $(\hat{x}, \hat{s}) \neq 0$. This proves $r_d^0 = \hat{\delta}_se > 0$.

For the second part of this proof, let $\gamma = A^Ty^+ - c$. It should be noted that the dual residuals are only updated in the case where the index $\ell$ associated with $\gamma_\ell = \max_j(\gamma_j)$ is in Q. The updated dual residuals are given by

$$r_d^+ = (\gamma_\ell + s^+_\ell)e = [(1 - \alpha_{dQ})(r_d)_\ell]\,e.$$

We know $(r_d)^0_j = \hat{\delta}_s > 0$ for all j. Since the step length $\alpha_{dQ} \in (0,1)$, we have

$$0 < (1 - \alpha_{dQ})(r_d)_j < (r_d)_j, \quad \forall j. \qquad \square$$

Lemma 3.2: Suppose $(x,s) > 0$ and $r_d > 0$. Then (i) $(x^+, s^+) > 0$; (ii) $(x^{++}, s^{++}) > 0$.

Proof: Suppose $x, s > 0$. Then for (i) we have $x^+ = x + \alpha_{pQ}\hat{\Delta}x^a$. Since $\hat{\Delta}x^a_{\bar{Q}} = 0$, $x^+_j = x_j > 0$ for all $j \in \bar{Q}$. We now examine $x^+_j$ with $j \in Q$. By definition,

$$\alpha_{pQ} = \begin{cases} \theta & \text{if } (\hat{\Delta}x^a_Q)_j \ge 0\ \forall j, \\ \theta\min\left(\min_j\left\{\dfrac{-(x_Q)_j}{(\hat{\Delta}x^a_Q)_j} : (\hat{\Delta}x^a_Q)_j < 0\right\},\ 1\right) & \text{otherwise}, \end{cases}$$

where $0 < \theta < 1$. If $(\hat{\Delta}x^a_Q)_j \ge 0$ for all j, then clearly $x^+_Q = x_Q + \alpha_{pQ}\hat{\Delta}x^a_Q > 0$. Otherwise, let

$$t_k = \frac{-(x_Q)_k}{(\hat{\Delta}x^a_Q)_k} = \min_j\left\{\frac{-(x_Q)_j}{(\hat{\Delta}x^a_Q)_j} : (\hat{\Delta}x^a_Q)_j < 0\right\},$$

where k is the index of the minimum ratio. Suppose $(x^+_Q)_j = (x_Q)_j + \alpha_{pQ}(\hat{\Delta}x^a_Q)_j \le 0$ for some j. If $\alpha_{pQ} = \theta t_k$, then $(x^+_Q)_j = (x_Q)_j + \theta t_k(\hat{\Delta}x^a_Q)_j \le 0$. Solving for $t_k$, we have

$$\theta t_k(\hat{\Delta}x^a_Q)_j \le -(x_Q)_j, \qquad t_k \ge \frac{-(x_Q)_j}{\theta(\hat{\Delta}x^a_Q)_j} > \frac{-(x_Q)_j}{(\hat{\Delta}x^a_Q)_j}.$$

This is a contradiction, since $t_k$ is the minimum ratio; so in this case $x^+_Q > 0$. If $\alpha_{pQ} = \theta$, then $\hat{\Delta}x^a_Q \ge 0$ by (3.13) and $x_Q > 0$ by assumption; therefore, $x^+_Q = x_Q + \alpha_{pQ}\hat{\Delta}x^a_Q > 0$. A similar analysis holds for $s^+$. This proves (i).

For (ii), let $\gamma_\ell = \max_j(\gamma_j)$, where $\gamma = A^Ty^+ - c$. If $\ell \in Q$, the updated slack vector $s^{++}$ is given by $s^{++} = r_d^+ - \gamma$, where $r_d^+ = (\gamma_\ell + s^+_\ell)e$. Continuing, we have

$$s^{++}_j = \gamma_\ell + s^+_\ell - \gamma_j \ge s^+_\ell > 0, \quad \forall j.$$

Since

$$x^{++}_j = \begin{cases} x^+_j & j \in Q, \\ \min(s^{++}_j, x^+_j) & j \notin Q, \end{cases}$$

it follows that $x^{++} > 0$. If $\ell \notin Q$,
3.4 Convergence

At the optimal solution $z^* = (x^*, y^*, s^*)$, the index set $N$ partitions into $m$ indices in $\{j : x^*_j > 0,\ s^*_j = 0\}$ and $n - m$ indices in $\{j \in N : x^*_j = 0,\ s^*_j > 0\}$. The first two lemmas show that the components of the dual residual, as well as the vectors $x$ and $s$, remain strictly positive.

Lemma 3.1: Suppose the initial point $z^0$ is defined as in Section 2.2.2, with $(x^0, s^0) > 0$. Then if $z^0$ is not the optimal solution, $r_d^0 > 0$ and $0 < r_d^+ < r_d$ at every subsequent iteration.

Proof: Since $(x^0, s^0) > 0$, we must prove $\hat\delta_s > 0$. This can be done by showing (1) $\tilde x + \delta_x e \ge 0$, (2) $\tilde s + \delta_s e \ge 0$, and (3) $\sum_{j=1}^{n} \tilde x_j + \delta_x > 0$.

(1) If $\min\{\tilde x_j\} < 0$, then $\delta_x = \max(-1.5\min\{\tilde x_j\},\ 0) = -1.5\min\{\tilde x_j\} > 0$. Therefore,
$\tilde x_j + \delta_x = \tilde x_j - 1.5\min\{\tilde x_j\} \ge \min\{\tilde x_j\} - 1.5\min\{\tilde x_j\} = -.5\min\{\tilde x_j\} > 0.$
If $\min\{\tilde x_j\} \ge 0$, then $\delta_x = \max(-1.5\min\{\tilde x_j\},\ 0) = 0$, and $\tilde x_j + \delta_x = \tilde x_j \ge \min\{\tilde x_j\} \ge 0$. This proves $\tilde x + \delta_x e \ge 0$.

(2) A similar proof shows $\tilde s + \delta_s e \ge 0$.

(3) Since $\tilde x \neq 0$, there exists at least one component of $\tilde x$ such that $\tilde x_j \neq 0$. Therefore $\sum_{j=1}^{n} \tilde x_j + \delta_x > 0$.

The previous three results imply that
$\dfrac{(\tilde x + \delta_x e)^T(\tilde s + \delta_s e)}{\sum_{j=1}^{n} \tilde x_j + \delta_x} \ge 0.$
If $\min\{\tilde s_j\} < 0$, then $\delta_s > 0$ and
$(r_d)^0_j = \hat\delta_s = \delta_s + .5\left[\dfrac{(\tilde x + \delta_x e)^T(\tilde s + \delta_s e)}{\sum_{j=1}^{n} \tilde x_j + \delta_x}\right] > 0,\ \forall j.$
If $\min\{\tilde s_j\} \ge 0$, then $\delta_s = 0$ and
$(r_d)^0_j = \hat\delta_s = .5\left[\dfrac{(\tilde x + \delta_x e)^T\tilde s}{\sum_{j=1}^{n} \tilde x_j + \delta_x}\right] > 0,\ \forall j,$
since $(\tilde x, \tilde s) \neq 0$. This proves $r_d^0 = \hat\delta_s e > 0$.

For the second part of this proof, let $\tau = A^T y^+ - c$. It should be noted that the dual residuals are only updated in the case where the index $\ell$ associated with $\tau_\ell = \max_j(\tau_j)$ is in $Q$. The updated dual residuals are given by
$r_d^+ = (\tau_\ell + s^+_\ell)\,e = [(1 - \alpha^d_Q)(r_d)_\ell]\,e.$
We know $(r_d)^0_j = \hat\delta_s > 0$ for all $j$. Since the step length $\alpha^d_Q \in (0,1)$, we have $0 < (1 - \alpha^d_Q)(r_d)_j < (r_d)_j$, $\forall j$. □

Lemma 3.2: Suppose $(x, s) > 0$ and $r_d > 0$. Then (i) $(x^+, s^+) > 0$; (ii) $(x^{++}, s^{++}) > 0$.

Proof: Suppose $x, s > 0$. Then for (i) we have $x^+ = x + \alpha^p_Q \widetilde{\Delta x}^a$. Since $\widetilde{\Delta x}^a_{\bar Q} = 0$, $x^+_j = x_j > 0$ for all $j \in \bar Q$. We now examine $x^+_j$ with $j \in Q$. By definition,
$\alpha^p_Q = \gamma$ if $(\widetilde{\Delta x}^a_Q)_j \ge 0\ \forall j$, and $\alpha^p_Q = \gamma\,\min\Big(\min_j\Big\{\dfrac{-(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j} : (\widetilde{\Delta x}^a_Q)_j < 0\Big\},\ 1\Big)$ otherwise,
where $0 < \gamma < 1$. In the first case, $x^+_Q = x_Q + \gamma\,\widetilde{\Delta x}^a_Q > 0$. Otherwise, let
$t_k = \dfrac{-(x_Q)_k}{(\widetilde{\Delta x}^a_Q)_k} = \min_j\Big\{\dfrac{-(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j} : (\widetilde{\Delta x}^a_Q)_j < 0\Big\},$
where $k$ is the index of the minimum ratio. Suppose $(x^+_Q)_j = (x_Q)_j + \alpha^p_Q(\widetilde{\Delta x}^a_Q)_j \le 0$ for some $j$. If $\alpha^p_Q = \gamma t_k$, then $(x^+_Q)_j = (x_Q)_j + \gamma t_k(\widetilde{\Delta x}^a_Q)_j \le 0$. Solving for $t_k$, we have
$\gamma t_k(\widetilde{\Delta x}^a_Q)_j \le -(x_Q)_j, \quad t_k \ge \dfrac{-(x_Q)_j}{\gamma(\widetilde{\Delta x}^a_Q)_j} > \dfrac{-(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j}.$
This is a contradiction, since $t_k$ is the minimum ratio; so in this case $x^+_Q > 0$. If $\alpha^p_Q = \gamma$, then $\widetilde{\Delta x}^a_Q \ge 0$ by (3.13) and $x_Q > 0$ by assumption. Therefore $x^+_Q = x_Q + \alpha^p_Q\widetilde{\Delta x}^a_Q > 0$. A similar analysis holds for $s^+$. This proves (i).

For (ii), let $\tau_\ell = \max_j(\tau_j)$ where $\tau = A^T y^+ - c$. If $\ell \in Q$, the updated slack vector $s^{++}$ is given by $s^{++} = r_d^+ - \tau$, where $r_d^+ = \tau_\ell + s^+_\ell$. Continuing, we have
$s^{++} = r_d^+ - \tau = (\tau_\ell + s^+_\ell)\,e - \tau.$
So, for all $j$, $s^{++}_j = \tau_\ell + s^+_\ell - \tau_j \ge \tau_j + s^+_\ell - \tau_j = s^+_\ell > 0$. Since
$x^{++}_j = x^+_j$ for $j \in Q$, and $x^{++}_j = \min(s^{++}, x^+)_j$ for $j \notin Q$,
it follows that $x^{++} > 0$. If $\ell \in \bar Q$, we return to the previous iterate and recompute the solution. In this case $(x^{++}, s^{++}) = (x, s) > 0$. □

The next two lemmas show that the reduced form of the normal-equation matrix and the augmented Jacobian of $F$ are nonsingular.

Lemma 3.3: Suppose (A2) holds and $X_Q, S_Q > 0$. Then $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite.

Proof: Suppose $v$ is any nonzero vector in $\mathbb{R}^m$. Then $v^T A_Q S_Q^{-1} X_Q A_Q^T v = \|D_Q A_Q^T v\|^2$, where $D_Q^2 = S_Q^{-1} X_Q$. Our assumption implies $D_Q$ is positive definite. Since $A_Q$ has full row rank and $v \neq 0$, $A_Q^T v \neq 0$. Therefore $\|D_Q A_Q^T v\|^2 > 0$ and $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite. □

In Chapter 2, we defined the vector
$F(x, y, s) = \begin{bmatrix} r_p \\ r_d \\ XSe \end{bmatrix},$
where $r_p = Ax - b$ and $r_d = A^T y + s - c$. The Jacobian of $F$, denoted $J(A, x, s)$, is given by
$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}.$
By eliminating $\Delta s$ in the linear system $J(A, x, s)\,\Delta z = -F(x, y, s)$, we have
$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = -\begin{bmatrix} r_p \\ X(s - r_d) \end{bmatrix}.$
The matrix
$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}$
is referred to as the augmented Jacobian of $F$, denoted $J_a(A, x, s)$. The next lemma shows $J_a(A_Q, x_Q, s_Q)$ is nonsingular.

Lemma 3.4: Suppose (A2) holds and $x_Q, s_Q > 0$. Then the augmented Jacobian of $F(x_Q, y, s_Q)$,
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix},$
is nonsingular.

Proof: Assume the system given by
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
has a nonzero solution. This gives the equations
$A_Q \psi_2 = 0,$  (3.19)
$X_Q A_Q^T \psi_1 - S_Q \psi_2 = 0,$  (3.20)
for some vectors $\psi_1$ and $\psi_2$. Solving equation (3.20) for $\psi_2$ and substituting this expression into (3.19) gives $A_Q S_Q^{-1} X_Q A_Q^T \psi_1 = 0$. By Lemma (3.3), $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite and therefore nonsingular, so $\psi_1 = 0$. Equation (3.20) then reduces to $S_Q \psi_2 = 0$, which implies $\psi_2 = 0$ since $s_Q > 0$. This contradicts our assumption that $[\psi_1, \psi_2]^T$ is nonzero, and the result follows. □
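Because Lemma 3.3 guarantees that $A_Q S_Q^{-1} X_Q A_Q^T$ is symmetric positive definite, the reduced normal equations can be solved with a Cholesky factorization. A minimal sketch (SciPy assumed; the function name and interface are ours, for illustration):

--------------------------------
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_reduced_normal_eq(AQ, xQ, sQ, rhs):
    """Solve (A_Q S_Q^{-1} X_Q A_Q^T) dy = rhs via Cholesky; valid because
    the matrix is positive definite when x_Q, s_Q > 0 and A_Q has full
    row rank (Lemma 3.3)."""
    M = (AQ * (xQ / sQ)) @ AQ.T
    return cho_solve(cho_factor(M), rhs)
--------------------------------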
We now show that the norm of the difference between the standard direction computed in PDAS and the direction computed in redPDAS is bounded. The dual residual vector associated with the redPDAS algorithm is denoted by $\tilde r_d$. All other notation is standard.

Theorem 3.1: Suppose assumptions (A1)–(A3) hold. Given $\rho > 0$, let $\Omega_\rho = \{z : \|z - z^*\| < \rho,\ x > 0,\ s > 0\}$. Given $z \in \Omega_\rho$, let $\Delta z^a$ be the Newton direction at $z$. Then there exists $\beta > 0$ such that
$\|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\|,$
for all $z \in \Omega_\rho$.

Proof: We use a strategy similar to the one in Tits et al. [24]. Let $z \in \Omega_\rho$. The direction $\widetilde{\Delta z}^a$ is given by
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \widetilde{\Delta y}^a \\ \widetilde{\Delta x}^a_Q \end{bmatrix} = \begin{bmatrix} b - A_Q x_Q \\ X_Q[(\tilde r_d)_Q - s_Q] \end{bmatrix}.$  (3.21)
The PDAS Newton direction is given by
$\begin{bmatrix} 0 & A \\ -XA^T & S \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a \end{bmatrix} = \begin{bmatrix} b - Ax \\ X[r_d - s] \end{bmatrix}.$
Partitioning the previous expression in terms of $Q$ and $\bar Q$ gives
$\begin{bmatrix} 0 & A_Q & A_{\bar Q} \\ -X_Q A_Q^T & S_Q & 0 \\ -X_{\bar Q} A_{\bar Q}^T & 0 & S_{\bar Q} \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \\ \Delta x^a_{\bar Q} \end{bmatrix} = \begin{bmatrix} b - A_Q x_Q - A_{\bar Q} x_{\bar Q} \\ X_Q[(r_d)_Q - s_Q] \\ X_{\bar Q}[(r_d)_{\bar Q} - s_{\bar Q}] \end{bmatrix}.$
Premultiplying the third row by $-A_{\bar Q} S_{\bar Q}^{-1}$ and adding to the first row, we can eliminate $\Delta x^a_{\bar Q}$ to obtain the following system of equations:
$\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} = \begin{bmatrix} b - A_Q x_Q \\ X_Q[(\tilde r_d)_Q - s_Q] \end{bmatrix} - \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} (r_d)_{\bar Q} \\ X_Q[(\tilde r_d)_Q - (r_d)_Q] \end{bmatrix}.$  (3.22)
The first term on the right-hand side of (3.22) is exactly the right-hand side of (3.21). The last $k$ components of the second vector are zero, since $(\tilde r_d)_Q = (r_d)_Q$. Combining (3.21) and (3.22), we have
$\begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \widetilde{\Delta y}^a \\ \widetilde{\Delta x}^a_Q \end{bmatrix} = \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} + \Psi,
\quad\text{where}\quad
\Psi = \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} (r_d)_{\bar Q} \\ 0 \end{bmatrix}.$
Since
$\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix} = \begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & A_Q \\ -X_Q A_Q^T & S_Q \end{bmatrix},$
and since $J_a(A_Q, x_Q, s_Q)$ is nonsingular by Lemma 3.4, we can express the difference between the two directions by
$\begin{bmatrix} \widetilde{\Delta y}^a \\ \widetilde{\Delta x}^a_Q \end{bmatrix} - \begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} = J_a(A_Q, x_Q, s_Q)^{-1}\left(\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix} + \Psi\right).$  (3.23)
The norm of the left-hand side of (3.23) is bounded by the sum of the norms of the individual terms on the right-hand side. Since $J_a(A_Q, x_Q, s_Q)$ is nonsingular for all $Q$ and there are only a finite number of choices of $Q$,
$\|J_a(A_Q, x_Q, s_Q)^{-1}\| \le \varphi_0$
for some $\varphi_0 > 0$. The norm of the block matrix satisfies
$\left\|\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\| = \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T\| = \max_{\|v\|=1}\|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T v\|$, where $v \in \mathbb{R}^m$,
$= \max_{\|v\|=1}\Big\|\sum_{j\in\bar Q}\frac{x_j}{s_j}\,a_j a_j^T v\Big\|
\le \max_{\|v\|=1}\sum_{j\in\bar Q}\Big|\frac{x_j}{s_j}\Big|\,|a_j^T v|\,\|a_j\|$, by the triangle inequality,
$\le \sum_{j\in\bar Q}\Big|\frac{x_j}{s_j}\Big|$, by Assumption (A4),
$= \|S_{\bar Q}^{-1} X_{\bar Q}\|_1 \le \sqrt{n-k}\,\|S_{\bar Q}^{-1}\|\,\|X_{\bar Q}\|$, since $\|h\|_1 \le \sqrt n\,\|h\|_2$ for all $h \in \mathbb{R}^n$,
$= \sqrt{n-k}\,\|S_{\bar Q}^{-1}\|\,\|X_{\bar Q} - X^*_{\bar Q}\|$, since $X^*_{\bar Q} = 0$,
$\le \sqrt{n-k}\,\|S_{\bar Q}^{-1}\|\,\|z - z^*\|.$
The relation in the third-to-last step can be found, for example, in [10] or [27]. Let
$\|S_{\bar Q}^{-1}\| = \max_{j\in\bar Q}(1/s_j) = \varphi_1.$  (3.24)
Then,
$\left\|J_a(A_Q, x_Q, s_Q)^{-1}\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix}\right\|
\le \|J_a(A_Q, x_Q, s_Q)^{-1}\|\,\left\|\begin{bmatrix} A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\,\left\|\begin{bmatrix} \Delta y^a \\ \Delta x^a_Q \end{bmatrix}\right\|
\le \varphi_0\,\sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\| = \varphi_2\,\|z - z^*\|\,\|\Delta z^a\|,$  (3.25)
where $\varphi_2 = \varphi_0\,\varphi_1\,\sqrt{n-k}$. This gives a bound for the norm of the first term in (3.23).

Examining the norm of the second term in (3.23), we find
$\|\Psi\| = \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} (r_d)_{\bar Q}\| = \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q}(-A_{\bar Q}^T\Delta y^a - \Delta s^a_{\bar Q})\|
\le \|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T\|\,\|\Delta y^a\| + \|A_{\bar Q}\|\,\|S_{\bar Q}^{-1}\|\,\|X_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\|
\le \sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\| + \sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\|,$
since $\|A_{\bar Q} S_{\bar Q}^{-1} X_{\bar Q} A_{\bar Q}^T\| \le \sqrt{n-k}\,\varphi_1\,\|z - z^*\|$, $\|S_{\bar Q}^{-1}\| = \varphi_1$, and $\|X_{\bar Q}\| = \|X_{\bar Q} - X^*_{\bar Q}\| \le \|z - z^*\|$. The substitution for $(r_d)_{\bar Q}$ follows since $r_d = -A^T\Delta y^a - \Delta s^a$ in the PDAS algorithm.

The norm of $A_{\bar Q}$ is bounded by
$\|A_{\bar Q}\| = \max_{v \neq 0}\frac{\|A_{\bar Q} v\|}{\|v\|}$, where $v \in \mathbb{R}^{n-k}$,
$= \max_{v \neq 0}\frac{\|\sum_{j\in\bar Q} a_j v_j\|}{\|v\|} \le \max_{v \neq 0}\frac{\sum_{j\in\bar Q}\|a_j\|\,|v_j|}{\|v\|} \le \max_{v \neq 0}\frac{\|v\|_1}{\|v\|} \le \max_{v \neq 0}\frac{\sqrt{n-k}\,\|v\|}{\|v\|} = \sqrt{n-k},$
where the second inequality uses $\|a_j\| \le 1$ (Assumption (A4)). Therefore,
$\|J_a(A_Q, x_Q, s_Q)^{-1}\Psi\| \le \|J_a(A_Q, x_Q, s_Q)^{-1}\|\,\|\Psi\| \le \varphi_0\cdot 2\sqrt{n-k}\,\varphi_1\,\|z - z^*\|\,\|\Delta z^a\| = 2\varphi_2\,\|z - z^*\|\,\|\Delta z^a\|.$  (3.26)
Combining (3.23), (3.25), and (3.26), we have
$\|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| \le 3\varphi_2\,\|z - z^*\|\,\|\Delta z^a\|.$
Setting $\beta = 3\varphi_2$ gives the desired result. □

Global and local quadratic convergence proofs can be given, provided specific modifications are made to the redPDAS algorithm. These modifications follow from Tits et al. [24], where a complete convergence analysis can be found.
Before we explain these changes, we summarize the differences between the redPDAS algorithm and its modified version in Table 3.1:

Table 3.1
                    redPDAS algorithm                              Modified redPDAS algorithm
Feasibility:        Accepts a dual-infeasible initial point;       Requires a dual-feasible point
                    strives to achieve dual feasibility            at each iteration
Terms with Q:       $A_Q D_Q^2 A_Q^T$, $\widetilde{\Delta x}^a_Q$, $\widetilde{\Delta s}^a_Q$,   $A_Q D_Q^2 A_Q^T$, $\widetilde{\Delta x}^a_Q$
                    $x_Q$, $s_Q$, $(r_d)_Q$
Dual affine step:   $\alpha^d_Q = \gamma\min(\hat\alpha^d_Q, 1)$   $\alpha^d_Q = \min(\max\{\gamma\hat\alpha^d_Q,\ \hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|\},\ 1)$
where $0 < \gamma < 1$.

The first critical change is that the modified algorithm requires a dual-feasible point at each iteration; that is, $s^i = c - A^T y^i > 0$ and $r_d^i = 0$ for all $i$. Consequently, the algorithm for updating $(x^+, s^+)$ and $r_d$ has no effect on this dual-feasible algorithm and can be discarded. Furthermore, equations (3.12) and (3.14) must be computed with the full data to satisfy the above requirement on the slack variables. As a result, equation (3.17) must be eliminated from the algorithm.

The second critical change is the update scheme for the primal variables. The $\widetilde{\Delta x}^a_Q$ component of the normal equations remains the same; however, the full-length vector $\widetilde{\Delta x}^a$ in (3.16) is determined before the "Compute the affine step" section, and the primal affine step (3.13) and the updated primal solution (3.18) can be replaced with
$\check x = x + \widetilde{\Delta x}^a$, $[\check x^-]_j = \min\{\check x_j, 0\}$, and
$x^+_j = \min\big\{\max\big(\min\{\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2,\ x_j\},\ \check x_j\big),\ x_{\max}\big\}\quad \forall j \in N,$  (3.29)
respectively, with $x > 0$ and $x_j \le x_{\max}$ for all $j \in N$. The key to the global convergence analysis is the availability of a dual-feasible point at every iteration, along with the condition, $\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2$, on the primal updates. A local quadratic convergence analysis follows provided the above conditions hold and the bound $\hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|$ is imposed on the dual affine step in equation (3.15). The modified redPDAS algorithm is presented below:

--------------------------------
The Dual-Feasible redPDAS Algorithm

Input: $(x, y, s)$ with $x > 0$ and $s = c - A^T y > 0$; $x_{\max} > 0$ such that $x_j \le x_{\max}$ $\forall j \in N$; $u_{\mathrm{bnd}} \ge 3m$; $\gamma \in (0,1)$; tolerance $\epsilon > 0$.
while (the termination criterion is not satisfied)
  Select the most promising dual constraints (the set $Q$ is determined by the algorithm in Section 2.2.3: Selection of $Q$):
    $Q = Q \cup \delta Q$.
  Compute the affine-scaling direction:
    $\widetilde{\Delta y}^a = (A_Q S_Q^{-1} X_Q A_Q^T)^{-1}\,b$,
    $\widetilde{\Delta s}^a = -A^T \widetilde{\Delta y}^a$,
    $\widetilde{\Delta x}^a_Q = -x_Q - S_Q^{-1} X_Q \widetilde{\Delta s}^a_Q$.
  Create the full vector $\widetilde{\Delta x}^a$:
    $\widetilde{\Delta x}^a_j = (\widetilde{\Delta x}^a_Q)_j$ if $j \in Q$, and $0$ otherwise.
  Set $\check x = x + \widetilde{\Delta x}^a$ and, for $j \in N$, define $[\check x^-]_j = \min\{\check x_j, 0\}$.
  Compute the affine step:
    $\hat\alpha^d_Q = 1$ if $\widetilde{\Delta s}^a_j \ge 0\ \forall j$, else $\min_{\widetilde{\Delta s}^a_j < 0}\,[-s_j/\widetilde{\Delta s}^a_j]$;
    $\alpha^d_Q = \min(\max\{\gamma\hat\alpha^d_Q,\ \hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|\},\ 1)$.
  Update the solution:
    $x^+_j = \min\{\max(\min\{\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2,\ x_j\},\ \check x_j),\ x_{\max}\}$ $\forall j \in N$,
    $y^+ = y + \alpha^d_Q \widetilde{\Delta y}^a$,
    $s^+ = s + \alpha^d_Q \widetilde{\Delta s}^a$.
end (while)
--------------------------------
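The safeguarded primal update can be sketched directly from (3.29). Because the source form of (3.29) is partially reconstructed, the sketch below should be read as illustrative only, not as the definitive update rule:

--------------------------------
import numpy as np

def dual_feasible_primal_update(x, dx_tilde, dy, x_max):
    """Safeguarded primal update following the form of (3.29) given above
    (a reconstruction): trial-point components that fall too low are floored
    by a proximity measure, and all components are capped at x_max."""
    xc = x + dx_tilde                       # trial point, \check{x}
    xneg = np.minimum(xc, 0.0)              # \check{x}^-
    prox = dy @ dy + xneg @ xneg            # ||dy||^2 + ||\check{x}^-||^2
    lo = np.minimum(prox, x)                # elementwise min with previous x_j
    return np.minimum(np.maximum(lo, xc), x_max)
--------------------------------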
Under Assumptions (A1), (A4), and (A5) and the aforementioned modifications to the redPDAS algorithm, we can produce a proof of global convergence that follows from Tits et al. [24]. We give a brief outline here.

Lemmas (3.2) and (3.3) can be extended to the modified redPDAS algorithm to show the algorithm is well-defined. As a result, the $\widetilde{\Delta y}^a$-component of the normal equations can justifiably be expressed as $\widetilde{\Delta y}^a = (A_Q S_Q^{-1} X_Q A_Q^T)^{-1}\,b$. Since it is assumed $b \neq 0$, the sequence of dual objective function values $\{b^T y\}$ is nondecreasing. Let $K$ be an infinite index set. If we assume $\{y\}$ converges to $y^*$ and $\{\widetilde{\Delta y}^a\}$ converges to 0 on $K$, then it can be shown that $y^*$ is stationary (i.e., $Ax^* = b$, $X^* s^* = 0$) and $\{\check x\}$ converges to $x^*$, where $x^*$ is the unique Lagrange multiplier associated with $y^*$.

Proving $\{y\}$ converges to $F^*$ (global convergence) requires two main steps. The first step is to prove $\{y\}$ converges to the set of stationary points $y^*$ of the dual LP problem. This is achieved by a contradiction argument: if $\{y\}$ converges to a nonstationary point on $K$, then $\{\widetilde{\Delta y}^a\}$ must converge to 0 on $K$. The proof is largely dependent on the bound, $\|\widetilde{\Delta y}^a\|^2 + \|\check x^-\|^2$, in the primal updates. The second step is to prove that the multiplier vectors (the set of $x \in \mathbb{R}^n$ such that $Ax = b$ and $X(c - A^T y) = 0$) associated with all limit points of $\{y\}$ are the same. Using the proofs from steps 1 and 2, along with the fact that $\{b^T y\}$ is nondecreasing, it can be shown that $\{y\}$ converges to $F^*$.

With the additional assumption (A6), local q-quadratic convergence of the pair $(x, y)$ can be shown. (The sequence $\{(x^i, y^i)\}$ converges q-quadratically to $(x^*, y^*)$ if it converges to $(x^*, y^*)$ and there exists a constant $\theta$ such that $\|(x^{i+1}, y^{i+1}) - (x^*, y^*)\| \le \theta\,\|(x^i, y^i) - (x^*, y^*)\|^2$.) Assumption (A6) implies that strict complementarity holds, i.e., $x^*_j > 0$ and $s^*_j = c_j - a_j^T y^* = 0$ for all $j \in I(y^*)$. Furthermore, $\mathrm{span}(\{a_j : j \in I(y^*)\}) = \mathbb{R}^m$. We can extend the above results to Lemma 3.4 to show $J(A_Q, x^*_Q, s^*_Q)$ is nonsingular. This, in addition to Lemmas 1 and 14 and Proposition 15, all in Tits et al. [24], along with Lemma (3.1) with $r_d = 0$ and the bound on the dual affine step, $\hat\alpha^d_Q - \|\widetilde{\Delta y}^a\|$, provides the necessary tools to complete the q-quadratic convergence analysis.

3.5 Summary

In this chapter, we introduced the redPDAS algorithm and explained how specific modifications to the algorithm provide global and local q-quadratic convergence results. With further research, we hope to prove similar results for the unmodified redPDAS algorithm. Numerical results from redPDAS are given in Chapter 5. Before presenting the results, we present the redMPC algorithm.

Chapter 4
A Reduced Mehrotra Predictor-Corrector (redMPC) Algorithm

4.1 MPC Method

4.1.1 Overview

Mehrotra's Predictor-Corrector algorithm generates a sequence of approximate solutions, $z(\sigma) = (x(\sigma), y(\sigma), s(\sigma))$, to the perturbed KKT conditions:
$Ax(\sigma) - b = 0,$  (4.1)
$A^T y(\sigma) + s(\sigma) - c = 0,$  (4.2)
$X(\sigma)S(\sigma)e = \sigma\mu e,$  (4.3)
$(x(\sigma), s(\sigma)) > 0,$  (4.4)
where $0 \le \sigma \le 1$ and $\mu = x^T s/n > 0$. The conditions here differ from the KKT conditions in (2.3) in that the solution $z(\sigma)$ is uniquely defined for each $\sigma > 0$ and the pairwise products satisfy $x_j(\sigma)s_j(\sigma) = \sigma\mu$ for each $j$. We define $\mathcal{C} = \{z(\sigma) \mid \sigma > 0\}$ as the central path. The central path defines a trajectory of feasible solutions that steer clear of the boundary of the primal-dual feasible region. As $\sigma\mu$ decreases to zero, $\mathcal{C}$ converges to a primal-dual solution of the linear program (or of the KKT conditions in (2.3)).
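The perturbed conditions (4.1)–(4.3) are straightforward to monitor numerically. The following sketch evaluates their residuals at a given iterate (the function name and interface are ours, for illustration):

--------------------------------
import numpy as np

def perturbed_kkt_residuals(A, b, c, x, y, s, sigma):
    """Residuals of (4.1)-(4.3); all three vanish on the central path
    for the given sigma."""
    mu = x @ s / len(x)          # duality measure mu = x^T s / n
    rp = A @ x - b               # (4.1)
    rd = A.T @ y + s - c         # (4.2)
    rc = x * s - sigma * mu      # (4.3): XSe - sigma*mu*e
    return rp, rd, rc, mu
--------------------------------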
Mehrotra's method differs from other interior-point methods in that two linear systems are solved at each iteration: one for the affine-scaling (or predictor) direction, $\Delta z^a$, and one for the centering-corrector direction. The predictor direction is obtained by solving (3.10). The centering-corrector direction, $\Delta z^{cc} = (\Delta x^{cc}, \Delta y^{cc}, \Delta s^{cc})$, is calculated based on the amount of progress the predictor direction has made in reducing $\mu$ and on the error (or nonlinear term) in (2.8). This direction is found by solving a slightly different system:
$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}\begin{bmatrix} \Delta x^{cc} \\ \Delta y^{cc} \\ \Delta s^{cc} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \sigma\mu e - \Delta X^a \Delta S^a e \end{bmatrix},$  (4.5)
where we assume for convenience that $(x, y, s)$ is primal and dual feasible. By premultiplying the second block of equations by $-X$ and adding this to the third block of equations, we can eliminate $\Delta s^{cc}$ from the linear system. The result is the following equivalent system:
$\begin{bmatrix} A & 0 \\ S & -XA^T \end{bmatrix}\begin{bmatrix} \Delta x^{cc} \\ \Delta y^{cc} \end{bmatrix} = \begin{bmatrix} 0 \\ \sigma\mu e - \Delta X^a \Delta S^a e \end{bmatrix},$
with $\Delta s^{cc} = -A^T\Delta y^{cc}$. By further eliminating $\Delta x^{cc}$ (premultiplying the second block of equations by $-AS^{-1}$ and adding this to the first block of equations), we have
$\Delta y^{cc} = -(AS^{-1}XA^T)^{-1} A S^{-1}(\sigma\mu e - \Delta X^a\Delta S^a e),$
$\Delta s^{cc} = -A^T\Delta y^{cc},$
$\Delta x^{cc} = S^{-1}(\sigma\mu e - \Delta X^a\Delta S^a e) - S^{-1}X\Delta s^{cc}.$

The advantage of computing the centering-corrector direction is that we are able to improve our linear, first-order model of $F$ in (2.3) to a quadratic, second-order model. In addition, Mehrotra's method uses the same coefficient matrix to solve for the two directions, so the extra cost of forming this second direction is that of performing a single back-substitution. This additional cost is small. The sum of the predictor and centering-corrector directions is the search direction used to derive an improved solution to the perturbed KKT conditions.

For the remainder of the chapter, we focus on variants of Mehrotra's Predictor-Corrector (MPC) algorithm. A general MPC algorithm is given in Section 4.2, and a "reduced" version called redMPC is presented in Section 4.3. Preliminary results for the convergence of the redMPC algorithm are provided in Section 4.4. Global and local convergence results follow with certain modifications to the algorithm. Finally, the chapter is summarized in Section 4.5.

4.1.2 Centering-Corrector Direction

The components of the centering-corrector direction are based on a centering parameter, $\sigma$, and a corrector direction derived from the nonlinear term $\Delta X^a\Delta S^a e$. We explain the significance of these terms here.

The affine-scaling direction aims to satisfy the KKT conditions by moving toward the boundary of the feasible region defined by the nonnegative pairwise products, $x_j s_j > 0$. If this direction makes significant progress in reducing the duality measure, $\mu$, while satisfying the conditions $(x, s) > 0$, we will be in good proximity of the central path and closer to satisfying the KKT conditions (since $\mu \approx 0$). Therefore, our solution will require little centering (or movement toward the central path). If the affine-scaling direction makes little progress in reducing $\mu$, then we will need a significant amount of centering so that the algorithm can make better progress in reducing $\mu$ on the next iteration.

To measure the efficiency of the affine-scaling direction, we compare the hypothetical value of $\mu$ based on this direction to the previous value of $\mu$. We define
$\mu_a = \frac{(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a)}{n}$
to be the hypothetical value of $\mu$ based on the affine-scaling direction, and the centering parameter
$\sigma = \left[\frac{\mu_a}{\mu}\right]^3 = \left[\frac{(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a)}{n\mu}\right]^3$
to be the cube of the ratio of $\mu_a$ and $\mu$. (Although it is not a requirement to cube the ratio, Mehrotra [17] found this heuristic for determining $\sigma$ most efficient through exhaustive computational testing.) If $\mu_a \ll \mu$, the affine-scaling direction makes good progress in reducing $\mu$ and little centering is needed; $\sigma$ is then insignificant and is set close to 0. If $\mu_a$ is only a little smaller than $\mu$, $\sigma$ is set close to 1 so that the algorithm can be in a better position to reduce $\mu$ on the next iteration.
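Mehrotra's heuristic for $\sigma$ is a one-line computation once the trial affine step lengths are known. A minimal sketch (illustrative):

--------------------------------
import numpy as np

def centering_parameter(x, s, dx_a, ds_a, ap_a, ad_a):
    """sigma = (mu_a / mu)^3, where mu_a is the duality measure after a
    trial step along the affine-scaling (predictor) direction."""
    n = len(x)
    mu = x @ s / n
    mu_a = (x + ap_a * dx_a) @ (s + ad_a * ds_a) / n
    return (mu_a / mu) ** 3
--------------------------------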
The centering-corrector direction also uses the affine-scaling direction to compensate for the error in the linear system, (2.4). We can interpret finding a solution $z^+ = z + \Delta z$ to the KKT conditions as calculating $\Delta z$ to satisfy the primal constraints, the dual constraints, and the complementary slackness condition [see Chapter 2, equations (2.6)–(2.8)], given by
$(x_j + \Delta x_j)(s_j + \Delta s_j) = 0,\ \forall j,$
or equivalently
$x_j s_j + x_j\Delta s_j + s_j\Delta x_j + \Delta x_j\Delta s_j = 0,\ \forall j,$
$X\Delta s + S\Delta x = -XSe - \Delta X\Delta S e.$  (4.6)
In computing the affine-scaling direction, the last term in (4.6) is ignored; that is, $X\Delta s^a + S\Delta x^a = -XSe$. Since applying a full step along the affine-scaling direction to (4.6) gives
$X\Delta s^a + S\Delta x^a + XSe + \Delta X^a\Delta S^a e = \Delta X^a\Delta S^a e,$
we can define a corrector direction, $\Delta z^{cor} = (\Delta x^{cor}, \Delta y^{cor}, \Delta s^{cor})$, to compensate for this error term:
$\begin{bmatrix} A & 0 & 0 \\ 0 & A^T & I \\ S & 0 & X \end{bmatrix}\begin{bmatrix} \Delta x^{cor} \\ \Delta y^{cor} \\ \Delta s^{cor} \end{bmatrix} = -\begin{bmatrix} 0 \\ 0 \\ \Delta X^a\Delta S^a e \end{bmatrix}.$
The centering parameter and corrector direction must be computed after the affine-scaling direction, due to the fact that they are both dependent on it. In addition, the centering and corrector components can be combined into one step, since they are obtained by solving linear systems with the same coefficient matrix. The result is the linear system in (4.5) for calculating the centering-corrector direction.

4.2 A MPC Algorithm

In this section, we present a general MPC algorithm similar to the algorithm in [29].

--------------------------------
A General MPC Algorithm

Input: $(x, y, s)$ with $x > 0$ and $s > 0$; $\gamma \in (0,1)$; tolerance $\epsilon > 0$.
while (the termination criterion is not satisfied)
  Compute the affine-scaling direction:
    $\Delta y^a = (AS^{-1}XA^T)^{-1}\,[\,b - AS^{-1}X r_d\,]$,
    $\Delta s^a = -r_d - A^T\Delta y^a$,
    $\Delta x^a = -x - S^{-1}X\Delta s^a$.
  Compute the affine step:
    $\hat\alpha^p_a = 1$ if $\Delta x^a_j \ge 0\ \forall j$, else $\min_{\Delta x^a_j < 0}\,[-x_j/\Delta x^a_j]$;
    $\hat\alpha^d_a = 1$ if $\Delta s^a_j \ge 0\ \forall j$, else $\min_{\Delta s^a_j < 0}\,[-s_j/\Delta s^a_j]$;
    $\alpha^p_a = \min(\hat\alpha^p_a, 1)$, $\alpha^d_a = \min(\hat\alpha^d_a, 1)$.
  Compute the centering parameter:
    $\sigma = \left[(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a)/(n\mu)\right]^3$.
  Compute the centering-corrector direction:
    $w = \sigma\mu S^{-1}e - S^{-1}\Delta X^a\Delta S^a e$,
    $\Delta y^{cc} = -(AS^{-1}XA^T)^{-1}Aw$,
    $\Delta s^{cc} = -A^T\Delta y^{cc}$,
    $\Delta x^{cc} = w - S^{-1}X\Delta s^{cc}$.
  Compute the predictor-corrector direction:
    $\Delta x = \Delta x^a + \Delta x^{cc}$, $\Delta y = \Delta y^a + \Delta y^{cc}$, $\Delta s = \Delta s^a + \Delta s^{cc}$.
  Compute the predictor-corrector step:
    $\hat\alpha_p = 1$ if $\Delta x_j \ge 0\ \forall j$, else $\min_{\Delta x_j < 0}\,[-x_j/\Delta x_j]$;
    $\hat\alpha_d = 1$ if $\Delta s_j \ge 0\ \forall j$, else $\min_{\Delta s_j < 0}\,[-s_j/\Delta s_j]$;
    $\alpha_p = \gamma\,\min(\hat\alpha_p, 1)$, $\alpha_d = \gamma\,\min(\hat\alpha_d, 1)$.
  Update the variables:
    $x^+ = x + \alpha_p\Delta x$, $y^+ = y + \alpha_d\Delta y$, $s^+ = s + \alpha_d\Delta s$.
  Update quantities:
    $r_d^+ = (1 - \alpha_d)\,r_d$, $\mu^+ = (x^+)^T(s^+)/n$.
end (while)
--------------------------------
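Because the corrector solve reuses the coefficient matrix of the predictor solve, its marginal cost is a single back-substitution. A sketch of this reuse (SciPy assumed; the auxiliary vector $w$ follows the formulas above):

--------------------------------
import numpy as np
from scipy.linalg import cho_solve

def mpc_corrector(A, x, s, dx_a, ds_a, sigma, mu, M_factor):
    """Centering-corrector direction; M_factor is the Cholesky factorization
    of A S^{-1} X A^T already computed for the predictor direction."""
    w = sigma * mu / s - (dx_a * ds_a) / s   # sigma*mu*S^{-1}e - S^{-1} DX^a DS^a e
    dy_cc = -cho_solve(M_factor, A @ w)      # one extra back-substitution
    ds_cc = -A.T @ dy_cc
    dx_cc = w - (x / s) * ds_cc
    return dx_cc, dy_cc, ds_cc
--------------------------------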
4.3 The redMPC Algorithm

In this section, we present a reduced Mehrotra Predictor-Corrector algorithm, redMPC. The vectors $\widetilde{\Delta x}^a \in \mathbb{R}^n$ and $(\widetilde{\Delta y}^a, \widetilde{\Delta s}^a) \in \mathbb{R}^{m+n}$ define the affine-scaling direction. Similarly, the vectors $\widetilde{\Delta x}^{cc} \in \mathbb{R}^n$ and $(\widetilde{\Delta y}^{cc}, \widetilde{\Delta s}^{cc}) \in \mathbb{R}^{m+n}$ define the centering-corrector direction. We refer to $\widetilde{\Delta z} = (\widetilde{\Delta x}, \widetilde{\Delta y}, \widetilde{\Delta s}) = (\widetilde{\Delta x}^a + \widetilde{\Delta x}^{cc},\ \widetilde{\Delta y}^a + \widetilde{\Delta y}^{cc},\ \widetilde{\Delta s}^a + \widetilde{\Delta s}^{cc})$ as the approximate MPC direction. In accordance with the notation, the vectors $\widetilde{\Delta x}_Q \in \mathbb{R}^k$ and $\widetilde{\Delta s}_Q \in \mathbb{R}^k$ are composed of the components $\widetilde{\Delta x}_j$ and $\widetilde{\Delta s}_j$, respectively, with $j \in Q$. Let $k = |Q|$, and let $\delta Q$ represent the index set of dual constraints that are added to $Q$ when it is determined the algorithm is not making progress toward satisfying dual feasibility. To differentiate the scalar quantities $\alpha^p_a$, $\alpha^d_a$, $\alpha_p$, $\alpha_d$, and $\sigma$ from the general MPC algorithm, we use the subscript $Q$ to emphasize their computation within this algorithm.

--------------------------------
The redMPC Algorithm

Input: $(x, y, s)$ with $x > 0$ and $s > 0$; $u_{\mathrm{bnd}} \ge 3m$; $\gamma \in (0,1)$; tolerance $\epsilon > 0$.
while (the termination criterion is not satisfied)
  Select the most promising dual constraints (the set $Q$ is determined by the algorithm in Section 2.2.3: Selection of $Q$):
    $Q = Q \cup \delta Q$, $k = |Q|$.
  Compute the affine-scaling direction:
    $\widetilde{\Delta y}^a = (A_Q S_Q^{-1} X_Q A_Q^T)^{-1}\,[\,b - A_Q S_Q^{-1} X_Q (r_d)_Q\,]$,  (4.7)
    $\widetilde{\Delta s}^a_Q = -(r_d)_Q - A_Q^T\widetilde{\Delta y}^a$,  (4.8)
    $\widetilde{\Delta x}^a_Q = -x_Q - S_Q^{-1}X_Q\widetilde{\Delta s}^a_Q$.
  Compute the affine step:
    $\hat\alpha^p_{Q,a} = 1$ if $(\widetilde{\Delta x}^a_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta x}^a_Q)_j < 0}\,[-(x_Q)_j/(\widetilde{\Delta x}^a_Q)_j]$;
    $\hat\alpha^d_{Q,a} = 1$ if $(\widetilde{\Delta s}^a_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta s}^a_Q)_j < 0}\,[-(s_Q)_j/(\widetilde{\Delta s}^a_Q)_j]$;  (4.9)
    $\alpha^p_{Q,a} = \min(\hat\alpha^p_{Q,a}, 1)$, $\alpha^d_{Q,a} = \min(\hat\alpha^d_{Q,a}, 1)$.  (4.10)
  Compute the centering parameter:
    $\sigma_Q = \left[(x_Q + \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q)^T(s_Q + \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q)/(n\mu_Q)\right]^3$.  (4.11)
  Compute the centering-corrector direction:
    $w_Q = \sigma_Q\mu_Q S_Q^{-1}e_Q - S_Q^{-1}\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q e_Q$,
    $\widetilde{\Delta y}^{cc} = -(A_Q S_Q^{-1} X_Q A_Q^T)^{-1}A_Q w_Q$,
    $\widetilde{\Delta s}^{cc}_Q = -A_Q^T\widetilde{\Delta y}^{cc}$,  (4.12)
    $\widetilde{\Delta x}^{cc}_Q = w_Q - S_Q^{-1}X_Q\widetilde{\Delta s}^{cc}_Q$.
  Compute the predictor-corrector direction:
    $\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q + \widetilde{\Delta x}^{cc}_Q$,  (4.13)
    $\widetilde{\Delta y} = \widetilde{\Delta y}^a + \widetilde{\Delta y}^{cc}$,  (4.14)
    $\widetilde{\Delta s}_Q = \widetilde{\Delta s}^a_Q + \widetilde{\Delta s}^{cc}_Q$.  (4.15)
  Compute the predictor-corrector step:
    $\hat\alpha^p_Q = 1$ if $(\widetilde{\Delta x}_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta x}_Q)_j < 0}\,[-(x_Q)_j/(\widetilde{\Delta x}_Q)_j]$;
    $\hat\alpha^d_Q = 1$ if $(\widetilde{\Delta s}_Q)_j \ge 0\ \forall j$, else $\min_{(\widetilde{\Delta s}_Q)_j < 0}\,[-(s_Q)_j/(\widetilde{\Delta s}_Q)_j]$;  (4.16)
    $\alpha^p_Q = \gamma\,\min(\hat\alpha^p_Q, 1)$, $\alpha^d_Q = \gamma\,\min(\hat\alpha^d_Q, 1)$.  (4.17)
  Create full-length vectors $\widetilde{\Delta x}$ and $\widetilde{\Delta s}$:
    $\widetilde{\Delta x}_j = (\widetilde{\Delta x}_Q)_j$ if $j \in Q$, and $0$ otherwise,  (4.18)
    $\widetilde{\Delta s}_j = (\widetilde{\Delta s}_Q)_j$ if $j \in Q$, and $0$ otherwise.  (4.19)
  Update the solution, the duality measure, and the dual residuals:
    $x^+ = x + \alpha^p_Q\widetilde{\Delta x}$,  (4.20)
    $y^+ = y + \alpha^d_Q\widetilde{\Delta y}$,  (4.21)
    $s^+ = s + \alpha^d_Q\widetilde{\Delta s}$.  (4.22)
    if $k < u_{\mathrm{bnd}}$ …
end (while)
--------------------------------
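Taken together, the redMPC pieces above amount to: reduced predictor, reduced corrector, sum, ratio test, scatter. A compact illustrative sketch of the assembly and step computation (4.13)–(4.19); the function names are ours:

--------------------------------
import numpy as np

def redmpc_assemble(dxQ_a, dy_a, dsQ_a, dxQ_cc, dy_cc, dsQ_cc, Q, n):
    """Sum predictor and corrector components (4.13)-(4.15) and embed the
    Q-components into full-length vectors (4.18)-(4.19)."""
    dx, ds = np.zeros(n), np.zeros(n)
    dx[Q] = dxQ_a + dxQ_cc
    ds[Q] = dsQ_a + dsQ_cc
    return dx, dy_a + dy_cc, ds

def redmpc_steps(xQ, sQ, dxQ, dsQ, gamma):
    """Ratio tests (4.16) and damped step lengths (4.17) on Q-components."""
    ap = min(1.0, np.min(-xQ[dxQ < 0] / dxQ[dxQ < 0])) if np.any(dxQ < 0) else 1.0
    ad = min(1.0, np.min(-sQ[dsQ < 0] / dsQ[dsQ < 0])) if np.any(dsQ < 0) else 1.0
    return gamma * ap, gamma * ad
--------------------------------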
4.4 Convergence

The approximate and standard directions satisfy
$\|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\|,$  (4.23)
$\|\widetilde{\Delta z}^{cc}_Q - \Delta z^{cc}_Q\| \le \kappa\,\xi(z),$  (4.24)
where $\beta, \kappa > 0$ and $\xi(z)$ is a term that tends to zero as we approach the optimal solution. Combining (4.23) and (4.24), we have the norm of the difference between the standard direction computed in MPC and that computed in redMPC:
$\|\widetilde{\Delta z}_Q - \Delta z_Q\| = \|(\widetilde{\Delta z}^a_Q + \widetilde{\Delta z}^{cc}_Q) - (\Delta z^a_Q + \Delta z^{cc}_Q)\| \le \|\widetilde{\Delta z}^a_Q - \Delta z^a_Q\| + \|\widetilde{\Delta z}^{cc}_Q - \Delta z^{cc}_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\| + \kappa\,\xi(z).$

Let $(x^i, y^i, s^i)$ be the solution at iteration $i$ of MPC and redMPC. Assumptions (A1)–(A6) from Chapter 3 will be needed throughout the analysis, along with the following assumptions:
(A7) $\|x_Q\| \le C_1$ and $\|s_Q\| \le C_2$, where $0 < C_1, C_2 < \infty$;
(A8) $\sigma_Q\,\mu \le \omega\,\mu\,\|\Delta z^a\|$ for some $\omega > 0$.

The redMPC algorithm is well-defined, since the iterates for the solution and the dual residual remain strictly positive, the matrix $A_Q S_Q^{-1} X_Q A_Q^T$ is positive definite, and $J_a(A_Q, x_Q, s_Q)$ is nonsingular. The proofs follow from Lemmas (3.1)–(3.4) in Chapter 3.

The lemmas presented next are required to show (4.24). The first two lemmas give bounds for terms involving the centering parameter, $\sigma$.

Lemma 4.1: Suppose assumption (A7) and (4.23) hold, and suppose $\mu = \mu_Q$. Then
$|(\sigma - \sigma_Q)\mu| \le 25\mu + V_3\,\|z - z^*\|\,\|\Delta z^a\| + \Lambda\cdot\mathcal{O}(\|\Delta z^a\|^2),$
where $\Lambda = \max\{\|z - z^*\|, \mu\}$ and $V_3 = \frac{1}{n}\left[25(V_1 + c_o V_2\|z - z^*\|) + 10(V_1 + 2V_2)\right]$, with $V_1 = (C_1 + C_2)(c_o + \beta) + 3 + \varphi_1\|\Delta z^a\|$ and $V_2 = (C_1 + C_2)\beta$ for some $c_o, \beta, \varphi_1 > 0$.

Proof: The quantity
$(\sigma - \sigma_Q)\mu = \left[(\mu_a/\mu)^3 - (\mu_a^Q/\mu_Q)^3\right]\mu = (1/\mu^2)\left[(\mu_a)^3 - (\mu_a^Q)^3\right] = (1/\mu^2)\,(\mu_a - \mu_a^Q)\left[(\mu_a)^2 + \mu_a\mu_a^Q + (\mu_a^Q)^2\right],$  (4.25)
since $\mu = \mu_Q$. We begin by bounding the term $\mu_a - \mu_a^Q$. We can express this term by:
$\mu_a - \mu_a^Q = \left[(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a) - (x + \alpha^p_{Q,a}\widetilde{\Delta x}^a)^T(s + \alpha^d_{Q,a}\widetilde{\Delta s}^a)\right]/n$
$= \left[\alpha^d_a x^T\Delta s^a + \alpha^p_a s^T\Delta x^a + \alpha^p_a\alpha^d_a(\Delta x^a)^T\Delta s^a - \alpha^d_{Q,a}x^T\widetilde{\Delta s}^a - \alpha^p_{Q,a}s^T\widetilde{\Delta x}^a - \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a)^T\widetilde{\Delta s}^a\right]/n.$
By partitioning the right-hand side of the last expression into $Q$ and $\bar Q$, we can eliminate the terms involving $\widetilde{\Delta x}^a_{\bar Q}$ and $\widetilde{\Delta s}^a_{\bar Q}$, since $\widetilde{\Delta x}^a_{\bar Q} = \widetilde{\Delta s}^a_{\bar Q} = 0$. This gives
$\mu_a - \mu_a^Q = \big[\alpha^d_a x_Q^T\Delta s^a_Q + \alpha^p_a s_Q^T\Delta x^a_Q + \alpha^p_a\alpha^d_a(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^d_a x_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_a s_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q} - \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q - \alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q - \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]/n.$
By adding in, subtracting out, and grouping certain terms, we have
$\mu_a - \mu_a^Q = \big[\alpha^d_a x_Q^T(\Delta s^a_Q - \widetilde{\Delta s}^a_Q) + (\alpha^d_a - \alpha^d_{Q,a})\,x_Q^T\widetilde{\Delta s}^a_Q + \alpha^p_a s_Q^T(\Delta x^a_Q - \widetilde{\Delta x}^a_Q) + (\alpha^p_a - \alpha^p_{Q,a})\,s_Q^T\widetilde{\Delta x}^a_Q + (\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a})(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big] + \alpha^d_a x_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_a s_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big]/n.$
Therefore,
$|\mu_a - \mu_a^Q| \le \frac{1}{n}\big[\,|\alpha^d_a|\,\|x_Q\|\,\|\Delta s^a_Q - \widetilde{\Delta s}^a_Q\| + |\alpha^d_a - \alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q\| + |\alpha^p_a|\,\|s_Q\|\,\|\Delta x^a_Q - \widetilde{\Delta x}^a_Q\| + |\alpha^p_a - \alpha^p_{Q,a}|\,\|s_Q\|\,\|\widetilde{\Delta x}^a_Q\| + |\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\| + \big|\alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]\big| + |\alpha^d_a|\,\|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| + |\alpha^p_a|\,|s_{\bar Q}^T\Delta x^a_{\bar Q}| + |\alpha^p_a\alpha^d_a|\,\big|(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big|\,\big].$
We can bound $|\mu_a - \mu_a^Q|$ by applying the result in (4.23) and proving that $|\alpha^p_a - \alpha^p_{Q,a}|$ and $|\alpha^d_a - \alpha^d_{Q,a}|$ are small close to the optimal solution. We will use the following results (see [13]). Assume $v, w \in \mathbb{R}^n$; then
$\Big|\max_k\Big(\frac{v}{w}\Big)_k\Big| \le \frac{\|v\|_1}{\|w\|_\infty},$  (4.26)
$\|v\|_1 \le \sqrt n\,\|v\|,$  (4.27)
$\frac{1}{\|w\|_\infty} \le \frac{\sqrt n}{\|w\|}.$  (4.28)
Let
$\bar\Theta = \max\left\{\frac{\|\Delta z^a_Q\|}{\|\Delta x^a_Q\|},\ \frac{\|\Delta z^a_Q\|}{\|\Delta s^a_Q\|}\right\}.$  (4.29)
We examine the components of the current solution for which $(\widetilde{\Delta x}^a_Q)_j < 0$ and $(\widetilde{\Delta s}^a_Q)_r < 0$, with $\alpha^p_{Q,a}, \alpha^d_{Q,a} \le 1$, and
$(x_Q)_j + \alpha^p_a(\Delta x^a_Q)_j = 0,\quad (s_Q)_r + \alpha^d_a(\Delta s^a_Q)_r = 0,$
$(x_Q)_j + \alpha^p_{Q,a}(\widetilde{\Delta x}^a_Q)_j = 0,\quad (s_Q)_r + \alpha^d_{Q,a}(\widetilde{\Delta s}^a_Q)_r = 0.$
Here, $\alpha^p_a = -(x_Q)_j/(\Delta x^a_Q)_j$, $\alpha^p_{Q,a} = -(x_Q)_j/(\widetilde{\Delta x}^a_Q)_j$, $\alpha^d_a = -(s_Q)_r/(\Delta s^a_Q)_r$, and $\alpha^d_{Q,a} = -(s_Q)_r/(\widetilde{\Delta s}^a_Q)_r$. The expression
$|\alpha^p_a - \alpha^p_{Q,a}| = \left|\frac{-(x_Q)_j}{(\Delta x^a_Q)_j} + \frac{(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j}\right| = \left|\frac{(x_Q)_j\big[(\widetilde{\Delta x}^a_Q)_j - (\Delta x^a_Q)_j\big]}{(\Delta x^a_Q)_j\,(\widetilde{\Delta x}^a_Q)_j}\right| \le \left|\frac{(x_Q)_j}{(\widetilde{\Delta x}^a_Q)_j}\right|\cdot\left|\frac{(\widetilde{\Delta x}^a_Q)_j - (\Delta x^a_Q)_j}{(\Delta x^a_Q)_j}\right|,\ \forall j,$
$\le |\alpha^p_{Q,a}|\cdot\left|\frac{(\widetilde{\Delta x}^a_Q - \Delta x^a_Q)_l}{(\Delta x^a_Q)_l}\right|,$
where $l$ is the index of the maximum ratio.
Continuing, we have
$|\alpha^p_a - \alpha^p_{Q,a}| \le |\alpha^p_{Q,a}|\cdot\frac{\|\widetilde{\Delta x}^a_Q - \Delta x^a_Q\|_1}{\|\Delta x^a_Q\|_\infty}$, by (4.26),
$\le \sqrt k\cdot\sqrt k\left(\frac{\|\widetilde{\Delta x}^a_Q - \Delta x^a_Q\|}{\|\Delta x^a_Q\|}\right)$, by (4.27), (4.28), and $|\alpha^p_{Q,a}| \le 1$,
$\le k\,\beta\,\|z - z^*\|\,\frac{\|\Delta z^a_Q\|}{\|\Delta x^a_Q\|}$, by (4.23),
$\le k\,\beta\,\|z - z^*\|\,\bar\Theta$, by (4.29),
$= c_o\,\|z - z^*\|,$  (4.30)
where $c_o = k\,\beta\,\bar\Theta$. Similarly,
$|\alpha^d_a - \alpha^d_{Q,a}| \le |\alpha^d_{Q,a}|\cdot\left|\frac{(\widetilde{\Delta s}^a_Q - \Delta s^a_Q)_p}{(\Delta s^a_Q)_p}\right|,$
where $p$ is the index of the maximum ratio, and the same chain of inequalities, with $s$ in place of $x$ and using (4.26)–(4.29) and (4.23), gives
$|\alpha^d_a - \alpha^d_{Q,a}| \le c_o\,\|z - z^*\|.$  (4.31)
Recall that the absolute value of the difference between $\mu_a$ and $\mu_a^Q$ is bounded by the nine-term inequality above. Using the results from (4.23), (4.30), and (4.31), we can bound the nine terms on its right-hand side:

1. $|\alpha^d_a|\,\|x_Q\|\,\|\Delta s^a_Q - \widetilde{\Delta s}^a_Q\| \le C_1\,\beta\,\|z - z^*\|\,\|\Delta z^a\|$, by assumption (A7), (4.23), and the fact that $|\alpha^d_a| \le 1$.

2. To bound the second term, $|\alpha^d_a - \alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q\|$, we must first bound the quantity $\|\widetilde{\Delta s}^a_Q\|$:
$\|\widetilde{\Delta s}^a_Q\| = \|\widetilde{\Delta s}^a_Q - \Delta s^a_Q + \Delta s^a_Q\| \le \|\widetilde{\Delta s}^a_Q - \Delta s^a_Q\| + \|\Delta s^a_Q\| \le \beta\,\|z - z^*\|\,\|\Delta z^a\| + \|\Delta z^a\| = (1 + \beta\,\|z - z^*\|)\,\|\Delta z^a\|,$  (4.32)
where the second inequality follows from (4.23). Therefore,
$|\alpha^d_a - \alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q\| \le c_o\,\|z - z^*\|\cdot C_1(1 + \beta\,\|z - z^*\|)\,\|\Delta z^a\| = c_o C_1(1 + \beta\,\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\|.$
This result follows from assumption (A7), (4.31), and (4.32).

3. $|\alpha^p_a|\,\|s_Q\|\,\|\Delta x^a_Q - \widetilde{\Delta x}^a_Q\| \le C_2\,\beta\,\|z - z^*\|\,\|\Delta z^a\|$ (similar to the bound on term 1).

4. $|\alpha^p_a - \alpha^p_{Q,a}|\,\|s_Q\|\,\|\widetilde{\Delta x}^a_Q\| \le c_o C_2(1 + \beta\,\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\|$ (similar to the bound on term 2).
5. $|\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
= |\alpha^p_a\alpha^d_a - \alpha^p_a\alpha^d_{Q,a} + \alpha^p_a\alpha^d_{Q,a} - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
= |\alpha^p_a(\alpha^d_a - \alpha^d_{Q,a}) + \alpha^d_{Q,a}(\alpha^p_a - \alpha^p_{Q,a})|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
\le (|\alpha^p_a|\,|\alpha^d_a - \alpha^d_{Q,a}| + |\alpha^d_{Q,a}|\,|\alpha^p_a - \alpha^p_{Q,a}|)\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\|
\le (c_o\|z - z^*\| + c_o\|z - z^*\|)\,\|\Delta z^a\|\,\|\Delta z^a\| = 2c_o\,\|z - z^*\|\,\|\Delta z^a\|^2.$  (4.33)
The fourth line follows from (4.30), (4.31), and the facts that $|\alpha^p_a| \le 1$ and $|\alpha^d_{Q,a}| \le 1$.

6. $\big|\alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]\big|
= \big|\alpha^p_{Q,a}\alpha^d_{Q,a}(\Delta x^a_Q)^T\Delta s^a_Q - \alpha^p_a\alpha^d_{Q,a}(\Delta x^a_Q)^T\widetilde{\Delta s}^a_Q + \alpha^p_a\alpha^d_{Q,a}(\Delta x^a_Q)^T\widetilde{\Delta s}^a_Q - \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big|
= \big|\alpha^d_{Q,a}(\Delta x^a_Q)^T(\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q) + \alpha^d_{Q,a}(\widetilde{\Delta s}^a_Q)^T(\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q)\big|
\le |\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\| + |\alpha^d_{Q,a}|\,\|\widetilde{\Delta s}^a_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\|.$
It suffices to bound $\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\|$ and $\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\|$. The bound for $\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\|$ is given by:
$\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\| = \|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\Delta s^a_Q + \alpha^p_a\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\|
\le \|(\alpha^p_{Q,a} - \alpha^p_a)\Delta s^a_Q\| + \|\alpha^p_a(\Delta s^a_Q - \widetilde{\Delta s}^a_Q)\|
\le |\alpha^p_{Q,a} - \alpha^p_a|\,\|\Delta s^a_Q\| + |\alpha^p_a|\,\|\Delta s^a_Q - \widetilde{\Delta s}^a_Q\|
\le c_o\,\|z - z^*\|\,\|\Delta z^a\| + \beta\,\|z - z^*\|\,\|\Delta z^a\| = (c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|.$  (4.34)
The fourth line follows from (4.30), (4.23), and the fact that $|\alpha^p_a| \le 1$. A similar analysis shows that
$\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\| \le (c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|.$  (4.35)
Therefore,
$\big|\alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q - (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big]\big| \le |\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\alpha^p_{Q,a}\Delta s^a_Q - \alpha^p_a\widetilde{\Delta s}^a_Q\| + |\alpha^d_{Q,a}|\,\|\widetilde{\Delta s}^a_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\|
\le \|\Delta z^a\|\,(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\| + (1 + \beta\,\|z - z^*\|)\,\|\Delta z^a\|\,(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|
= (2 + \beta\,\|z - z^*\|)(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|^2.$

7. $|\alpha^d_a|\,\|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| = |\alpha^d_a|\,\|x_{\bar Q} - x^*_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| \le \|z - z^*\|\,\|\Delta z^a\|,$  (4.36)
since $|\alpha^d_a| \le 1$ and $x^*_{\bar Q} = 0$.

8. Since $\Delta x^a_{\bar Q} = -x_{\bar Q} - S_{\bar Q}^{-1}X_{\bar Q}\Delta s^a_{\bar Q}$, we have
$|\alpha^p_a|\,|s_{\bar Q}^T\Delta x^a_{\bar Q}| = |\alpha^p_a|\,\big|s_{\bar Q}^T\big(-x_{\bar Q} - S_{\bar Q}^{-1}X_{\bar Q}\Delta s^a_{\bar Q}\big)\big| \le \big|{-x_{\bar Q}^T s_{\bar Q} - x_{\bar Q}^T\Delta s^a_{\bar Q}}\big| \le |x_{\bar Q}^T s_{\bar Q}| + |x_{\bar Q}^T\Delta s^a_{\bar Q}| \le n\mu + \|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| \le n\mu + \|z - z^*\|\,\|\Delta z^a\|,$  (4.37)
with $|\alpha^p_a| \le 1$.

9. $|\alpha^p_a\alpha^d_a|\,\big|(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big| = |\alpha^p_a\alpha^d_a|\,\big|\big(-x_{\bar Q} - S_{\bar Q}^{-1}X_{\bar Q}\Delta s^a_{\bar Q}\big)^T\Delta s^a_{\bar Q}\big| = |\alpha^p_a\alpha^d_a|\,\big|{-x_{\bar Q}^T\Delta s^a_{\bar Q}} - (\Delta s^a_{\bar Q})^T X_{\bar Q}S_{\bar Q}^{-1}\Delta s^a_{\bar Q}\big|
\le |x_{\bar Q}^T\Delta s^a_{\bar Q}| + \big|(\Delta s^a_{\bar Q})^T X_{\bar Q}S_{\bar Q}^{-1}\Delta s^a_{\bar Q}\big|
\le \|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| + \|\Delta s^a_{\bar Q}\|\,\|X_{\bar Q}\|\,\|S_{\bar Q}^{-1}\|\,\|\Delta s^a_{\bar Q}\|
\le \|z - z^*\|\,\|\Delta z^a\| + \|\Delta z^a\|\,\|z - z^*\|\,\varphi_1\,\|\Delta z^a\| = (1 + \varphi_1\,\|\Delta z^a\|)\,\|z - z^*\|\,\|\Delta z^a\|,$  (4.38)
where $\|S_{\bar Q}^{-1}\| \le \varphi_1$ follows from (3.24).
Combining the bounds in 1–9, we have
$|\mu_a - \mu_a^Q| \le \frac{1}{n}\big[\,C_1\beta\,\|z - z^*\|\,\|\Delta z^a\| + c_oC_1(1 + \beta\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\| + C_2\beta\,\|z - z^*\|\,\|\Delta z^a\| + c_oC_2(1 + \beta\|z - z^*\|)\,\|z - z^*\|\,\|\Delta z^a\| + 2c_o\,\|z - z^*\|\,\|\Delta z^a\|^2 + (2 + \beta\|z - z^*\|)(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|^2 + \|z - z^*\|\,\|\Delta z^a\| + n\mu + \|z - z^*\|\,\|\Delta z^a\| + (1 + \varphi_1\|\Delta z^a\|)\,\|z - z^*\|\,\|\Delta z^a\|\,\big].$
We can simplify the last expression by factoring out the quantity $\|z - z^*\|\,\|\Delta z^a\|$ from some terms and grouping other terms. This gives
$|\mu_a - \mu_a^Q| \le \mu + \frac{1}{n}\big[C_1\beta + c_oC_1(1 + \beta\|z - z^*\|) + C_2\beta + c_oC_2(1 + \beta\|z - z^*\|) + 2 + (1 + \varphi_1\|\Delta z^a\|)\big]\,\|z - z^*\|\,\|\Delta z^a\| + \frac{1}{n}\big[2c_o + (2 + \beta\|z - z^*\|)(c_o + \beta)\big]\,\|z - z^*\|\,\|\Delta z^a\|^2$
$= \mu + \frac{1}{n}\big[(C_1 + C_2)\beta + (C_1 + C_2)c_o(1 + \beta\|z - z^*\|) + 3 + \varphi_1\|\Delta z^a\|\big]\,\|z - z^*\|\,\|\Delta z^a\| + \frac{1}{n}\big[2c_o + (2 + \beta\|z - z^*\|)(c_o + \beta)\big]\,\|z - z^*\|\,\|\Delta z^a\|^2$
$= \mu + \frac{1}{n}\big[(C_1 + C_2)(c_o + \beta) + 3 + \varphi_1\|\Delta z^a\| + (C_1 + C_2)c_o\beta\,\|z - z^*\|\big]\,\|z - z^*\|\,\|\Delta z^a\| + \|z - z^*\|\cdot\mathcal{O}(\|\Delta z^a\|^2).$
Let
$V_1 = (C_1 + C_2)(c_o + \beta) + 3 + \varphi_1\,\|\Delta z^a\|$  (4.39)
and
$V_2 = (C_1 + C_2)\,\beta.$  (4.40)
Then
$|\mu_a - \mu_a^Q| \le \mu + \left(\frac{V_1 + c_oV_2\,\|z - z^*\|}{n}\right)\|z - z^*\|\,\|\Delta z^a\| + \|z - z^*\|\cdot\mathcal{O}(\|\Delta z^a\|^2).$  (4.41)
The absolute value of the remaining factor in (4.25) can be bounded by
$\left|(1/\mu^2)\big[(\mu_a)^2 + \mu_a\mu_a^Q + (\mu_a^Q)^2\big]\right| \le \left|\frac{\mu_a + \mu_a^Q}{\mu}\right|^2.$  (4.42)
In the analysis that follows, we concentrate on bounding the square root of the expression on the right of (4.42). We have
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| = \frac{1}{n\mu}\big[(x + \alpha^p_a\Delta x^a)^T(s + \alpha^d_a\Delta s^a) + (x + \alpha^p_{Q,a}\widetilde{\Delta x}^a)^T(s + \alpha^d_{Q,a}\widetilde{\Delta s}^a)\big]$
$= \frac{1}{n\mu}\big[2x^Ts + \alpha^d_ax^T\Delta s^a + \alpha^p_as^T\Delta x^a + \alpha^p_a\alpha^d_a(\Delta x^a)^T\Delta s^a + \alpha^d_{Q,a}x^T\widetilde{\Delta s}^a + \alpha^p_{Q,a}s^T\widetilde{\Delta x}^a + \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a)^T\widetilde{\Delta s}^a\big].$
By partitioning the right-hand side of the last expression into $Q$ and $\bar Q$, we can eliminate the terms involving $\widetilde{\Delta x}^a_{\bar Q}$ and $\widetilde{\Delta s}^a_{\bar Q}$, since $\widetilde{\Delta x}^a_{\bar Q} = \widetilde{\Delta s}^a_{\bar Q} = 0$. This gives
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| = 2 + \frac{1}{n\mu}\big[\alpha^d_ax_Q^T\Delta s^a_Q + \alpha^p_as_Q^T\Delta x^a_Q + \alpha^p_a\alpha^d_a(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^d_ax_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_as_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q} + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q + \alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^p_{Q,a}\alpha^d_{Q,a}(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big].$
The first term on the right-hand side of the last expression follows since $\mu = (x^Ts)/n$. By adding in, subtracting out, and grouping certain terms, we have
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| = 2 + \frac{1}{n\mu}\big[\,x_Q^T(\alpha^d_a\Delta s^a_Q - \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q) + s_Q^T(\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q) + 2(\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q) + (\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a})(\Delta x^a_Q)^T\Delta s^a_Q + \alpha^p_{Q,a}\alpha^d_{Q,a}\big[(\Delta x^a_Q)^T\Delta s^a_Q + (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big] + \alpha^d_ax_{\bar Q}^T\Delta s^a_{\bar Q} + \alpha^p_as_{\bar Q}^T\Delta x^a_{\bar Q} + \alpha^p_a\alpha^d_a(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\,\big].$
Therefore,
$\left|\frac{\mu_a + \mu_a^Q}{\mu}\right| \le 2 + \left|\frac{1}{n\mu}\right|\big[\,\|x_Q\|\,\|\alpha^d_a\Delta s^a_Q - \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q\| + \|s_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\| + 2|\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q| + |\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\| + |\alpha^p_{Q,a}\alpha^d_{Q,a}|\,\big|(\Delta x^a_Q)^T\Delta s^a_Q + (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big| + |\alpha^d_a|\,\|x_{\bar Q}\|\,\|\Delta s^a_{\bar Q}\| + |\alpha^p_a|\,|s_{\bar Q}^T\Delta x^a_{\bar Q}| + |\alpha^p_a\alpha^d_a|\,\big|(\Delta x^a_{\bar Q})^T\Delta s^a_{\bar Q}\big|\,\big].$
We can bound the eight terms on the right-hand side of the above inequality by using some of the previous results:

I. $\|x_Q\|\,\|\alpha^d_a\Delta s^a_Q - \alpha^d_{Q,a}\widetilde{\Delta s}^a_Q\| \le C_1(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|$, by assumption (A7) and (4.34) with $\alpha^p_a$ and $\alpha^p_{Q,a}$ replaced by $\alpha^d_a$ and $\alpha^d_{Q,a}$, respectively.

II. $\|s_Q\|\,\|\alpha^p_a\Delta x^a_Q - \alpha^p_{Q,a}\widetilde{\Delta x}^a_Q\| \le C_2(c_o + \beta)\,\|z - z^*\|\,\|\Delta z^a\|$, by assumption (A7) and (4.35).

III. $2|\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q|
= 2|\alpha^p_{Q,a}s_Q^T\widetilde{\Delta x}^a_Q - \alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^d_{Q,a}x_Q^T\widetilde{\Delta s}^a_Q - \alpha^d_{Q,a}x_Q^T\Delta s^a_Q + \alpha^d_{Q,a}x_Q^T\Delta s^a_Q|
= 2|\alpha^p_{Q,a}s_Q^T(\widetilde{\Delta x}^a_Q - \Delta x^a_Q) + \alpha^d_{Q,a}x_Q^T(\widetilde{\Delta s}^a_Q - \Delta s^a_Q) + \alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^d_{Q,a}x_Q^T\Delta s^a_Q|
\le 2\big[|\alpha^p_{Q,a}|\,\|s_Q\|\,\|\widetilde{\Delta x}^a_Q - \Delta x^a_Q\| + |\alpha^d_{Q,a}|\,\|x_Q\|\,\|\widetilde{\Delta s}^a_Q - \Delta s^a_Q\| + |\alpha^p_{Q,a}s_Q^T\Delta x^a_Q + \alpha^d_{Q,a}x_Q^T\Delta s^a_Q|\big]
\le 2\big[C_2\beta\,\|z - z^*\|\,\|\Delta z^a\| + C_1\beta\,\|z - z^*\|\,\|\Delta z^a\| + |\max\{\alpha^p_{Q,a}, \alpha^d_{Q,a}\}|\,|s_Q^T\Delta x^a_Q + x_Q^T\Delta s^a_Q|\big]
\le 2\big[(C_1 + C_2)\beta\,\|z - z^*\|\,\|\Delta z^a\| + n\mu\big].$
The fourth line follows from (4.23). The fifth line follows since $|\max\{\alpha^p_{Q,a}, \alpha^d_{Q,a}\}| \le 1$ and $|s_Q^T\Delta x^a_Q + x_Q^T\Delta s^a_Q| = |-x_Q^Ts_Q| \le n\mu$ by (2.4).

IV. $|\alpha^p_a\alpha^d_a - \alpha^p_{Q,a}\alpha^d_{Q,a}|\,\|\Delta x^a_Q\|\,\|\Delta s^a_Q\| \le 2c_o\,\|z - z^*\|\,\|\Delta z^a\|^2$, by (4.33).

V. $|\alpha^p_{Q,a}\alpha^d_{Q,a}|\,\big|(\Delta x^a_Q)^T\Delta s^a_Q + (\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q\big| \le |\alpha^p_{Q,a}|\,|\alpha^d_{Q,a}|\,\big(|(\Delta x^a_Q)^T\Delta s^a_Q| + |(\widetilde{\Delta x}^a_Q)^T\widetilde{\Delta s}^a_Q|\big)
\le \|\Delta x^a_Q\|\,\|\Delta s^a_Q\| + \|\widetilde{\Delta x}^a_Q\|\,\|\widetilde{\Delta s}^a_Q\|
\le \|\Delta z^a\|^2 + (1 + \beta\,\|z - z^*\|)^2\,\|\Delta z^a\|^2 = \big[1 + (1 + \beta\,\|z - z^*\|)^2\big]\,\|\Delta z^a\|^2.$
The second line follows since $|\alpha^p_{Q,a}| \le 1$ and $|\alpha^d_{Q,a}| \le 1$.
The third line follows from (4.32), since the bound on $\big\|\widetilde{\Delta x}^a_Q\big\|$ is the same as the bound on $\big\|\widetilde{\Delta s}^a_Q\big\|$. The bounds for the remaining terms (VI-VIII) follow from (4.36)-(4.38). Combining the bounds I-VIII in the inequality above, we have
\[
\begin{aligned}
\left|\big(\mu^a+\mu^a_Q\big)/\mu\right| \le 2+\left|\frac{1}{n\mu}\right|\Big[&C_1(c_o+\omega)\|z-z^*\|\,\|\Delta z^a\|+C_2(c_o+\omega)\|z-z^*\|\,\|\Delta z^a\|\\
&+2\big[(C_1+C_2)\,\omega\|z-z^*\|\,\|\Delta z^a\|+n\mu\big]+2c_o\,\|z-z^*\|\,\|\Delta z^a\|^2\\
&+\big[1+(1+\omega\|z-z^*\|)^2\big]\|\Delta z^a\|^2+\|z-z^*\|\,\|\Delta z^a\|+n\mu\\
&+\|z-z^*\|\,\|\Delta z^a\|+(1+\omega_1\|\Delta z^a\|)\|z-z^*\|\,\|\Delta z^a\|\Big].
\end{aligned}
\]
By factoring out $\|z-z^*\|\,\|\Delta z^a\|$ from select terms, grouping other terms, and applying (4.39) and (4.40), we have
\[
\begin{aligned}
\left|\big(\mu^a+\mu^a_Q\big)/\mu\right| &\le 2+\left|\frac{1}{n\mu}\right|\Big(\big[2+(C_1+C_2)(c_o+\omega)+(1+\omega_1\|\Delta z^a\|)+2(C_1+C_2)\,\omega\big]\|z-z^*\|\,\|\Delta z^a\|\\
&\qquad+3n\mu+\big(2c_o\|z-z^*\|+\big[1+(1+\omega\|z-z^*\|)^2\big]\big)\|\Delta z^a\|^2\Big)\\
&= 2+\left|\frac{1}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+3+O\big(\|\Delta z^a\|^2\big)\\
&= 5+\left|\frac{1}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+O\big(\|\Delta z^a\|^2\big).
\end{aligned}
\]
Squaring the last result gives
\[
\left|\big(\mu^a+\mu^a_Q\big)/\mu\right|^2 \le 25+\left|\frac{10}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+O\big(\|\Delta z^a\|^2\big). \tag{4.43}
\]
Combining the bounds in (4.41) and (4.43), we have the final result. Since $\sigma\mu = (\mu^a)^3/\mu^2$ and $\sigma_Q\mu = (\mu^a_Q)^3/\mu^2$, the factorization $a^3-b^3 = (a-b)(a^2+ab+b^2)$, together with $a^2+ab+b^2\le(a+b)^2$ for $a,b\ge 0$, gives
\[
\begin{aligned}
\big|(\sigma-\sigma_Q)\mu\big| &= \big|\mu^a-\mu^a_Q\big|\,\left|\frac{(\mu^a)^2+\mu^a\mu^a_Q+(\mu^a_Q)^2}{\mu^2}\right| \le \big|\mu^a-\mu^a_Q\big|\,\left|\frac{\mu^a+\mu^a_Q}{\mu}\right|^2\\
&\le \left[\mu+\left(\frac{V_1+c_oV_2\|z-z^*\|}{n}\right)\|z-z^*\|\,\|\Delta z^a\|+\|z-z^*\|\,O\big(\|\Delta z^a\|^2\big)\right]\\
&\qquad\times\left[25+\left|\frac{10}{n\mu}\right|(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|+O\big(\|\Delta z^a\|^2\big)\right]\\
&= 25\mu+\frac{25}{n}\big(V_1+c_oV_2\|z-z^*\|\big)\|z-z^*\|\,\|\Delta z^a\|+\frac{10}{n}(V_1+2V_2)\|z-z^*\|\,\|\Delta z^a\|\\
&\qquad+\|z-z^*\|\,O\big(\|\Delta z^a\|^2\big)+\mu\,O\big(\|\Delta z^a\|^2\big)\\
&\le 25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\max\{\|z-z^*\|,\mu\}\,O\big(\|\Delta z^a\|^2\big)\\
&\le 25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big),
\end{aligned}
\tag{4.44}
\]
where $\psi = \max\{\|z-z^*\|,\mu\}$ and $V_3 = \frac{1}{n}\big[25\big(V_1+c_oV_2\|z-z^*\|\big)+10(V_1+2V_2)\big]$. $\square$

Lemma 4.2 : Suppose assumption (A8) and the assumptions of Lemma 4.1 hold. Then
\[
|\sigma\mu|\le 25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big).
\]
Proof:
\[
\begin{aligned}
|\sigma\mu| &= |\sigma\mu-\sigma_Q\mu+\sigma_Q\mu| \le \big|(\sigma-\sigma_Q)\mu\big|+|\sigma_Q\mu|\\
&\le 25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big)+\gamma\mu\|\Delta z^a\|\\
&\le 25\mu+(V_3+\gamma)\max\{\|z-z^*\|,\mu\}\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big)\\
&= 25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O\big(\|\Delta z^a\|^2\big). \qquad\square
\end{aligned}
\]

The next two lemmas give bounds on the error components $\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}$ and $\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q$.

Lemma 4.3 : Suppose (3.24) holds (i.e., $\big\|S^{-1}_{\bar Q}\big\|\le\omega_1$ where $\omega_1>0$). Then
\[
\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|\le(1+\omega_1\|\Delta z^a\|)\,\|z-z^*\|\,\|\Delta z^a\|.
\]
Proof: The $\Delta X^a_{\bar Q}$ component of the standard affine-scaling direction is given by $\Delta X^a_{\bar Q} = -X_{\bar Q}-S^{-1}_{\bar Q}X_{\bar Q}\Delta S^a_{\bar Q}$. Therefore, we have
\[
\begin{aligned}
\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\| &= \big\|\big(-X_{\bar Q}-S^{-1}_{\bar Q}X_{\bar Q}\Delta S^a_{\bar Q}\big)\Delta S^a_{\bar Q}e_{\bar Q}\big\|\\
&\le \big\|X_{\bar Q}\big\|\,\big\|\Delta s^a_{\bar Q}\big\|+\big\|S^{-1}_{\bar Q}\big\|\,\big\|X_{\bar Q}\big\|\,\big\|\Delta S^a_{\bar Q}\big\|\,\big\|\Delta s^a_{\bar Q}\big\|\\
&\le \|z-z^*\|\,\|\Delta z^a\|+\omega_1\|z-z^*\|\,\|\Delta z^a\|^2\\
&= (1+\omega_1\|\Delta z^a\|)\,\|z-z^*\|\,\|\Delta z^a\|.
\end{aligned}
\]
The third line follows since $\big\|X_{\bar Q}\big\| = \big\|X_{\bar Q}-X^*_{\bar Q}\big\|\le\|z-z^*\|$ and $\big\|S^{-1}_{\bar Q}\big\|\le\omega_1$. $\square$
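The identities used in this proof are easy to check numerically. The following MATLAB sketch, with small hypothetical random data and in the feasible case $r_p = r_d = 0$, computes the affine-scaling direction from the normal equations and verifies that $A\Delta x^a = 0$ and $S\Delta x^a+X\Delta s^a = -XSe$, from which the componentwise form $\Delta x^a = -x-S^{-1}X\Delta s^a$ follows.

    % Numerical check of the affine-scaling identities above (a sketch;
    % m, n, and the data are hypothetical, not from the thesis test sets).
    m = 5; n = 40; rng(1);
    A = randn(m, n);
    x = rand(n, 1) + 0.5;                      % current iterate, x > 0
    s = rand(n, 1) + 0.5;                      % current slack, s > 0
    d2 = x ./ s;                               % diagonal of S^{-1}X
    N = A * spdiags(d2, 0, n, n) * A';         % A S^{-1} X A^T
    dy = N \ (A * x);                          % Delta y^a
    ds = -A' * dy;                             % Delta s^a
    dx = -x - d2 .* ds;                        % Delta x^a = -x - S^{-1}X Delta s^a
    disp(norm(A * dx));                        % ~1e-14: Delta x^a lies in null(A)
    disp(norm(s .* dx + x .* ds + x .* s));    % ~1e-14: S dx + X ds = -XSe

Both residuals vanish to rounding error; the expression for $\Delta X^a_{\bar Q}$ used above is just the $\bar Q$ block of this identity.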
Lemma 4.4 : Suppose (4.23) and (4.32) hold. Then
\[
\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|\le(2+\omega\|z-z^*\|)\,\omega\|z-z^*\|\,\|\Delta z^a\|^2.
\]
Proof: The analysis to bound $\big\|\widetilde{\Delta x}^a_Q\big\|$ is similar to the analysis for the bound on $\big\|\widetilde{\Delta s}^a_Q\big\|$, which is shown in (4.32). Therefore,
\[
\begin{aligned}
\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\| &= \big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\Delta S^a_Q+\widetilde{\Delta X}^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|\\
&= \big\|\big(\Delta X^a_Q-\widetilde{\Delta X}^a_Q\big)\Delta s^a_Q+\widetilde{\Delta X}^a_Q\big(\Delta s^a_Q-\widetilde{\Delta s}^a_Q\big)\big\|\\
&\le \big\|\Delta X^a_Q-\widetilde{\Delta X}^a_Q\big\|\,\big\|\Delta s^a_Q\big\|+\big\|\widetilde{\Delta X}^a_Q\big\|\,\big\|\Delta s^a_Q-\widetilde{\Delta s}^a_Q\big\|\\
&\le \omega\|z-z^*\|\,\|\Delta z^a\|^2+(1+\omega\|z-z^*\|)\,\|\Delta z^a\|\,\omega\|z-z^*\|\,\|\Delta z^a\|\\
&= (2+\omega\|z-z^*\|)\,\omega\|z-z^*\|\,\|\Delta z^a\|^2,
\end{aligned}
\]
where the fourth line follows from (4.23) and (4.32). $\square$

Before we state the theorem to show (4.24), we present a lemma to show that the standard centering-corrector direction is bounded by the standard affine-scaling direction.

Lemma 4.5 : Suppose (3.24), Assumptions (A7)-(A8), and the assumptions of Lemma 4.2 hold. Let $\big\|(A_QS_Q^{-1}X_QA_Q^T)^{-1}\big\|\le C_3$ where $0<C_3<\infty$. Then there exists $\gamma_3>0$ such that $\big\|\Delta z^{cc}_Q\big\|\le\gamma_3\max\{\|\Delta z^a\|,\mu\}$.

Theorem : Suppose (3.24), Assumptions (A7)-(A8), and the assumptions of Lemma 4.2 hold, and let $\Omega\subseteq\{z = (x,y,s) : x>0,\ s>0\}$. Suppose $J_a(A_Q,x_Q,s_Q)$ is nonsingular for all $Q$. Let $\Delta z^a$ be the Newton direction at $z$. Then there exists $\bar\omega>0$ such that $\big\|\widetilde{\Delta z}^{cc}_Q-\Delta z^{cc}_Q\big\|\le\bar\omega\,\varphi(z)$ for all $z\in\Omega$, with $\varphi(z)\to 0$ as the optimal solution is approached.

Proof: Let $z = (x,y,s)$ where $x>0$ and $s>0$. Then $\widetilde{\Delta y}^{cc}$ and $\widetilde{\Delta x}^{cc}_Q$ are given by
\[
\begin{bmatrix} 0 & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix} = \begin{bmatrix} 0 \\ \widetilde\Psi_Q \end{bmatrix}, \tag{4.45}
\]
where $\widetilde\Psi_Q = \sigma_Q\mu e_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Qe_Q$. The components of the centering-corrector direction, $\Delta y^{cc}$ and $\Delta x^{cc}$, satisfy
\[
\begin{bmatrix} 0 & A \\ -XA^T & S \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc} \end{bmatrix} = \begin{bmatrix} 0 \\ \Psi \end{bmatrix},
\]
where $\Psi = \sigma\mu e-\Delta X^a\Delta S^ae$. Partitioning the previous expression in terms of $Q$ and $\bar Q$ gives
\[
\begin{bmatrix} 0 & A_Q & A_{\bar Q} \\ -X_QA_Q^T & S_Q & 0 \\ -X_{\bar Q}A_{\bar Q}^T & 0 & S_{\bar Q} \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \\ \Delta x^{cc}_{\bar Q} \end{bmatrix} = \begin{bmatrix} 0 \\ \Psi_Q \\ \Psi_{\bar Q} \end{bmatrix}.
\]
Premultiplying the third row by $-A_{\bar Q}S_{\bar Q}^{-1}$ and adding to the first row, we can eliminate $\Delta x^{cc}_{\bar Q}$ to obtain the following system of equations,
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix} = \begin{bmatrix} -A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \Psi_Q \end{bmatrix}.
\]
Adding appropriate terms to both sides of the last expression, we have
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix} = \begin{bmatrix} 0 \\ \widetilde\Psi_Q \end{bmatrix}. \tag{4.46}
\]
The right-hand side of (4.46) is precisely the right-hand side of (4.45). Let
\[
J_a(A_Q,x_Q,s_Q) = \begin{bmatrix} 0 & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}.
\]
Equating the left-hand sides of (4.45) and (4.46) gives
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix} = J_a(A_Q,x_Q,s_Q)\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix}.
\]
Notice that
\[
\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & A_Q \\ -X_QA_Q^T & S_Q \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix} = J_a(A_Q,x_Q,s_Q)\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}.
\]
Now solving for $\widetilde{\Delta z}^{cc}_Q-\Delta z^{cc}_Q$, we have
\[
\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix}-\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix} = J_a(A_Q,x_Q,s_Q)^{-1}\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}+J_a(A_Q,x_Q,s_Q)^{-1}\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}.
\]
Taking norms of both sides, we have
\[
\left\|\begin{bmatrix} \widetilde{\Delta y}^{cc} \\ \widetilde{\Delta x}^{cc}_Q \end{bmatrix}-\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\| \le \big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\left\|\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\| \tag{4.47}
\]
\[
\qquad+\big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|. \tag{4.48}
\]
The bound on (4.47) is given by
\[
\big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\left\|\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\| \le \gamma_2\|z-z^*\|\,\big\|\Delta z^{cc}_Q\big\| \le \gamma_2\|z-z^*\|\,\gamma_3\max\{\|\Delta z^a\|,\mu\}. \tag{4.49}
\]
The first inequality follows from (3.25) and the last inequality follows from Lemma 4.5. We will now show a bound for the norm of the second vector in (4.48). Let $M = A_{\bar Q}S_{\bar Q}^{-1}$; then
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 = \left\|\begin{bmatrix} M\big(\sigma\mu e_{\bar Q}-\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big) \\ (\sigma_Q-\sigma)\mu e_Q-\big(\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q-\Delta X^a_Q\Delta S^a_Q\big)e_Q \end{bmatrix}\right\|^2.
\]
Expanding the right-hand side of this norm, we have
\[
\begin{aligned}
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 &\le \big\|M\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|^2+(\sigma\mu)^2\big\|Me_{\bar Q}\big\|^2+\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|^2\\
&\quad+\big[(\sigma-\sigma_Q)\mu\big]^2\|e_Q\|^2+2|\sigma\mu|\,\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|\,\big\|Me_{\bar Q}\big\|\\
&\quad+2\big|(\sigma-\sigma_Q)\mu\big|\,\|e_Q\|\,\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|\\
&\le \|M\|^2\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|^2+(n-k)\|M\|^2(\sigma\mu)^2+\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|^2\\
&\quad+k\big[(\sigma-\sigma_Q)\mu\big]^2+2\sqrt{n-k}\,\|M\|\,|\sigma\mu|\,\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|\\
&\quad+2\sqrt{k}\,\big|(\sigma-\sigma_Q)\mu\big|\,\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|.
\end{aligned}
\]
The six terms on the right-hand side of the last inequality can be bounded by previous results within this section. Let $\psi = \max\{\|z-z^*\|,\mu\}$. The bounds are as follows:

i.
\[
\|M\|^2\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\|^2 \le \|M\|^2(1+\omega_1\|\Delta z^a\|)^2\|z-z^*\|^2\|\Delta z^a\|^2 \le \psi^2\,O\big(\|\Delta z^a\|^2\big).
\]
The first inequality follows from Lemma 4.3.

ii.
\[
\begin{aligned}
(n-k)\|M\|^2(\sigma\mu)^2 &\le (n-k)\|M\|^2\big[25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]^2\\
&= (n-k)\|M\|^2\big[625\mu^2+(V_3+\gamma)^2\psi^2\|\Delta z^a\|^2+50\mu(V_3+\gamma)\,\psi\,\|\Delta z^a\|+50\mu\psi\,O(\|\Delta z^a\|^2)+\cdots\big]\\
&\le (n-k)\|M\|^2\big[625\psi^2+\psi^2\,O(\|\Delta z^a\|)+\psi^2\,O(\|\Delta z^a\|^2)\big]\\
&\le \psi^2\big[625(n-k)\|M\|^2+O(\|\Delta z^a\|)\big].
\end{aligned}
\]
The bound on $\sigma\mu$ follows from Lemma 4.2.

iii.
\[
\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\|^2 \le (2+\omega\|z-z^*\|)^2\omega^2\|z-z^*\|^2\|\Delta z^a\|^4 \le \psi^2\,O\big(\|\Delta z^a\|^4\big).
\]
The first inequality follows from Lemma 4.4.

iv.
\[
\begin{aligned}
k\big[(\sigma-\sigma_Q)\mu\big]^2 &\le k\big[25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]^2\\
&\le k\big[625\mu^2+V_3^2\psi^2\|\Delta z^a\|^2+\psi^2\,O(\|\Delta z^a\|^4)+50\mu V_3\psi\|\Delta z^a\|+50\mu\psi\,O(\|\Delta z^a\|^2)+2V_3\psi^2\,O(\|\Delta z^a\|^3)\big]\\
&\le k\big[625\mu^2+\psi^2\,O(\|\Delta z^a\|)+\psi^2\,O(\|\Delta z^a\|^2)\big]\\
&\le \psi^2\big[625k+O(\|\Delta z^a\|)\big].
\end{aligned}
\]
The first inequality follows from Lemma 4.1.

v.
\[
\begin{aligned}
2\sqrt{n-k}\,\|M\|\,|\sigma\mu|\,\big\|\Delta X^a_{\bar Q}\Delta S^a_{\bar Q}e_{\bar Q}\big\| &\le 2\sqrt{n-k}\,\|M\|\big[25\mu+(V_3+\gamma)\,\psi\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]\\
&\qquad\times\big[(1+\omega_1\|\Delta z^a\|)\,\|z-z^*\|\,\|\Delta z^a\|\big]\\
&\le \psi^2\,O\big(\|\Delta z^a\|\big).
\end{aligned}
\]
This follows from Lemmas 4.2 and 4.3.
vi.
\[
\begin{aligned}
2\sqrt{k}\,\big|(\sigma-\sigma_Q)\mu\big|\,\big\|\big(\Delta X^a_Q\Delta S^a_Q-\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Q\big)e_Q\big\| &\le 2\sqrt{k}\big[25\mu+V_3\|z-z^*\|\,\|\Delta z^a\|+\psi\,O(\|\Delta z^a\|^2)\big]\\
&\qquad\times\big[(2+\omega\|z-z^*\|)\,\omega\|z-z^*\|\,\|\Delta z^a\|^2\big]\\
&\le \psi^2\,O(\|\Delta z^a\|^2)+\psi^2\,O(\|\Delta z^a\|^3)+\psi^2\,O(\|\Delta z^a\|^4)\\
&\le \psi^2\,O(\|\Delta z^a\|^2).
\end{aligned}
\]
This follows from Lemmas 4.1 and 4.4.

Combining i-vi, we have
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 \le \psi^2\,O(\|\Delta z^a\|^2)+\psi^2\big[625(n-k)\|M\|^2+O(\|\Delta z^a\|)\big]+\psi^2\,O(\|\Delta z^a\|^4)+\psi^2\big[625k+O(\|\Delta z^a\|)\big]+\psi^2\,O(\|\Delta z^a\|)+\psi^2\,O(\|\Delta z^a\|^2).
\]
Factoring out $\psi^2$ and combining like terms gives
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|^2 \le \psi^2\big[625\big((n-k)\|M\|^2+k\big)+O(\|\Delta z^a\|)+O(\|\Delta z^a\|^2)+O(\|\Delta z^a\|^4)\big] \le \psi^2\big[625\big((n-k)\|M\|^2+k\big)+O(\|\Delta z^a\|)\big].
\]
Taking the square root of both sides, we have
\[
\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\| \le \psi\sqrt{\omega_3+O(\|\Delta z^a\|)}, \tag{4.50}
\]
where $\omega_3 = 625\big[(n-k)\|M\|^2+k\big]$.

Combining (4.47), (4.48), (4.49), and (4.50), we have
\[
\begin{aligned}
\big\|\widetilde{\Delta z}^{cc}_Q-\Delta z^{cc}_Q\big\| &\le \big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}X_{\bar Q}A_{\bar Q}^T & 0 \\ 0 & 0 \end{bmatrix}\right\|\left\|\begin{bmatrix} \Delta y^{cc} \\ \Delta x^{cc}_Q \end{bmatrix}\right\|+\big\|J_a(A_Q,x_Q,s_Q)^{-1}\big\|\left\|\begin{bmatrix} A_{\bar Q}S_{\bar Q}^{-1}\Psi_{\bar Q} \\ \widetilde\Psi_Q-\Psi_Q \end{bmatrix}\right\|\\
&\le \gamma_2\|z-z^*\|\,\gamma_3\max\{\|\Delta z^a\|,\mu\}+\psi\,\omega_0\sqrt{\omega_3+O(\|\Delta z^a\|)}\\
&\le \psi\,O\big(\max\{\|\Delta z^a\|,\mu\}\big)+\psi\,\omega_0\sqrt{\omega_3+O(\|\Delta z^a\|)}\\
&\le \bar\omega\,\psi,
\end{aligned}
\]
where
\[
\bar\omega = \omega_0\max\Big\{O\big(\max\{\|\Delta z^a\|,\mu\}\big),\ \sqrt{\omega_3+O(\|\Delta z^a\|)}\Big\} = \omega_0\sqrt{\omega_3+O(\|\Delta z^a\|)}>0,
\]
since $O(\max\{\|\Delta z^a\|,\mu\})\to 0$ and $\omega_3>0$. Furthermore, let $\varphi(z) = \psi$. Then $\varphi(z) = \max\{\|z-z^*\|,\mu\}\to 0$ as the optimal solution is approached. $\square$
Global and local quadratic convergence follow from Winternitz et al. [28] with five specific changes to the redMPC algorithm. The differences between the redMPC algorithm and its modified version are summarized in Table 4.1.

The first modification is maintaining dual feasibility at each iteration. This implies $s^i = c-A^Ty^i>0$ and $r^i_d = 0$ for all $i$. Consequently, the algorithm for updating $(x^+,s^+)$ and $r_d$ has no effect on this dual-feasible algorithm and can be discarded. Furthermore, equations (4.8), (4.12), (4.9), and (4.16) must be computed with the full data to satisfy the above requirement on the slack variables. As a result, equation (4.19) must be eliminated from the algorithm. The second difference is the formula for the centering parameter, shown in Table 4.1.

Table 4.1: Differences between the redMPC algorithm and its modified version.

Feasibility.
  redMPC: accepts a dual-infeasible initial point and strives to achieve dual feasibility.
  Modified redMPC: requires a dual-feasible point at each iteration.

Terms computed with Q.
  redMPC: $A_QD_Q^2A_Q^T$, $\widetilde{\Delta x}^a_Q$, $\widetilde{\Delta s}^a_Q$, $\widetilde{\Delta x}^{cc}_Q$, $\widetilde{\Delta s}^{cc}_Q$, $x_Q$, $s_Q$, $(r_d)_Q$.
  Modified redMPC: $A_QD_Q^2A_Q^T$, $\widetilde{\Delta x}^a_Q$, $x_Q$.

Centering parameter.
  redMPC: $\sigma_Q = \big(\mu^a_Q/\mu_Q\big)^3$, where $\mu^a_Q = (x_Q+\bar\alpha^{p,a}_Q\widetilde{\Delta x}^a_Q)^T(s_Q+\bar\alpha^{d,a}_Q\widetilde{\Delta s}^a_Q)/n$.
  Modified redMPC: $\sigma_Q = (1-\alpha_{Q,a})^3$.

MPC direction.
  redMPC: $\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q+\widetilde{\Delta x}^{cc}_Q$, $\widetilde{\Delta y} = \widetilde{\Delta y}^a+\widetilde{\Delta y}^{cc}$, $\widetilde{\Delta s}_Q = \widetilde{\Delta s}^a_Q+\widetilde{\Delta s}^{cc}_Q$.
  Modified redMPC: $\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q+\gamma\widetilde{\Delta x}^{cc}_Q$, $\widetilde{\Delta y} = \widetilde{\Delta y}^a+\gamma\widetilde{\Delta y}^{cc}$, $\widetilde{\Delta s} = \widetilde{\Delta s}^a+\gamma\widetilde{\Delta s}^{cc}$, with $0<\gamma\le 1$.

The third modification is the mixing parameter $\gamma\in(0,1]$, chosen so that (1) the dual objective increases at each iteration ($b^T\widetilde{\Delta y}>0$ implies $b^Ty^+>b^Ty$), (2) if $\widetilde{\Delta y}^a$ is "small" then $\widetilde{\Delta y}$ and $\sigma_Q\mu_Q$ are also "small", and (3) $\bar\alpha^d_Q\ge\theta\bar\alpha^{d,a}_Q$, where $\theta\in(0,1)$. See Winternitz et al. [28] for details.

The fourth modification involves incorporating lower bounds into the predictor-corrector step. The equations in (4.17) can be replaced with
\[
\alpha^p_Q = \max\big\{\theta\bar\alpha^{p,a}_Q,\ \bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|\big\} \tag{4.51}
\]
and
\[
\alpha^d_Q = \max\big\{\theta\bar\alpha^{d,a}_Q,\ \bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|\big\}, \tag{4.52}
\]
respectively, where the primal ($\bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|$) and dual ($\bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|$) lower bounds allow local quadratic convergence. The fifth change to the redMPC algorithm is in the update scheme for the primal variables. The $\widetilde{\Delta x}^a_Q$ and $\widetilde{\Delta x}^{cc}_Q$ components of the normal equations remain the same, while the update in (4.20)-(4.22) is replaced by
\[
\big(\hat x_Q,\,y^+,\,s^+\big) = \big(x_Q+\alpha^p_Q\widetilde{\Delta x}_Q,\ y+\alpha^d_Q\widetilde{\Delta y},\ s+\alpha^d_Q\widetilde{\Delta s}\big),
\]
where $\alpha^p_Q$ and $\alpha^d_Q$ are given in (4.51) and (4.52), respectively. Let $x>0$, $x_j\le x_{\max}$ for all $j\in N$, and $k = |Q|$. The primal update, $x^+$, is defined as follows (a MATLAB rendering is sketched after the definitions below): for all $j\in Q$,
\[
x^+_j = \min\big\{\max\big(\min\big\{\|\widetilde{\Delta y}^a\|^\lambda+\|\hat{\Delta x}^a_-\|^\nu,\ x_j\big\},\ \hat x_j\big),\ x_{\max}\big\}, \tag{4.53}
\]
where
\[
\hat{\Delta x}^a_j = \begin{cases} x_j+\widetilde{\Delta x}^a_j, & j\in Q,\\ 0, & j\in\bar Q,\end{cases} \qquad \big(\hat{\Delta x}^a_-\big)_j = \min\big\{\hat{\Delta x}^a_j,\,0\big\}.
\]
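In MATLAB, the safeguarded update (4.53) is a short chain of elementwise min/max operations. In the sketch below, the names dya, dxaQ, xQ, xhatQ, lam, nu, and xmax are assumed to hold $\widetilde{\Delta y}^a$, $\widetilde{\Delta x}^a_Q$, $x_Q$, $\hat x_Q$, $\lambda$, $\nu$, and $x_{\max}$; they are illustrative, not names from the experimental code.

    % Safeguarded primal update (4.53) for the components j in Q -- a sketch.
    dxhat = xQ + dxaQ;                             % hat-Delta-x^a on Q (zero on Qbar)
    lb = norm(dya)^lam + norm(min(dxhat, 0))^nu;   % scalar lower-bound term
    xplusQ = min(max(min(lb, xQ), xhatQ), xmax);   % elementwise min/max over Q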
Using (4.53), we can express $\mu^+_Q$ as
\[
\mu^+_Q = \frac{\big(x^+_Q\big)^Ts^+_Q}{k}.
\]
Thus, for all $j\in\bar Q$, we have
\[
\hat x_j = \frac{\mu^+_Q}{s^+_j}, \qquad x^+_j = \min\{\hat x_j,\,x_{\max}\}.
\]
The dual-feasible redMPC algorithm is presented below:

------------------------------------------------------------
The Dual-Feasible redMPC Algorithm

Input: $(x,y,s)$ with $x>0$ and $s = c-A^Ty>0$; $x_{\max}>0$ such that $x_j\le x_{\max}$ for all $j\in N$; $\lambda\ge 2$, $\nu\ge 2$, $\mathrm{ubnd}\ge 3m$, and $0<\theta<1$.

Select the most promising dual constraints (the set $Q$ is determined by the algorithm in Section 2.2.2, Selection of Q); set $k = |Q|$ and $\mu_Q = x_Q^Ts_Q/k$.

Compute the affine-scaling direction:
\[
\widetilde{\Delta y}^a = \big(A_QS_Q^{-1}X_QA_Q^T\big)^{-1}b,\qquad \widetilde{\Delta s}^a = -A^T\widetilde{\Delta y}^a,\qquad \widetilde{\Delta x}^a_Q = -x_Q-S_Q^{-1}X_Q\widetilde{\Delta s}^a_Q.
\]

Compute the affine step:
\[
\bar\alpha^{p,a}_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta x}^a_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta x}^a_Q)_j<0}\big[-(x_Q)_j/\big(\widetilde{\Delta x}^a_Q\big)_j\big], & \text{otherwise},\end{cases}
\qquad
\bar\alpha^{d,a}_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta s}^a_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta s}^a_Q)_j<0}\big[-(s_Q)_j/\big(\widetilde{\Delta s}^a_Q\big)_j\big], & \text{otherwise},\end{cases}
\]
\[
\alpha_{Q,a} = \min\big\{\bar\alpha^{p,a}_Q,\ \bar\alpha^{d,a}_Q\big\}.
\]

Compute the centering parameter: $\sigma_Q = (1-\alpha_{Q,a})^3$.

Compute the centering-corrector direction:
\[
\Psi_Q = -S_Q^{-1}\widetilde{\Delta X}^a_Q\widetilde{\Delta S}^a_Qe_Q+\sigma_Q\mu_QS_Q^{-1}e_Q,
\]
\[
\widetilde{\Delta y}^{cc} = -\big(A_QS_Q^{-1}X_QA_Q^T\big)^{-1}A_Q\Psi_Q,\qquad \widetilde{\Delta s}^{cc} = -A^T\widetilde{\Delta y}^{cc},\qquad \widetilde{\Delta x}^{cc}_Q = \Psi_Q-S_Q^{-1}X_Q\widetilde{\Delta s}^{cc}_Q.
\]

Compute the predictor-corrector direction:
\[
\widetilde{\Delta x}_Q = \widetilde{\Delta x}^a_Q+\gamma\widetilde{\Delta x}^{cc}_Q,\qquad \widetilde{\Delta y} = \widetilde{\Delta y}^a+\gamma\widetilde{\Delta y}^{cc},\qquad \widetilde{\Delta s} = \widetilde{\Delta s}^a+\gamma\widetilde{\Delta s}^{cc},
\]
where $\gamma\in(0,1]$ ensures that (1)-(3) hold.

Compute the predictor-corrector step:
\[
\bar\alpha^p_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta x}_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta x}_Q)_j<0}\big[-(x_Q)_j/\big(\widetilde{\Delta x}_Q\big)_j\big], & \text{otherwise},\end{cases}
\qquad
\bar\alpha^d_Q = \begin{cases} 1, & \text{if } \big(\widetilde{\Delta s}_Q\big)_j\ge 0\ \forall j,\\ \min_{(\widetilde{\Delta s}_Q)_j<0}\big[-(s_Q)_j/\big(\widetilde{\Delta s}_Q\big)_j\big], & \text{otherwise},\end{cases}
\]
\[
\alpha^p_Q = \max\big\{\theta\bar\alpha^{p,a}_Q,\ \bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|\big\},\qquad \alpha^d_Q = \max\big\{\theta\bar\alpha^{d,a}_Q,\ \bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|\big\}.
\]

Update the variables:
\[
\hat x_Q = x_Q+\alpha^p_Q\widetilde{\Delta x}_Q,\qquad y^+ = y+\alpha^d_Q\widetilde{\Delta y},\qquad s^+ = s+\alpha^d_Q\widetilde{\Delta s}.
\]
For all $j\in Q$,
\[
x^+_j = \min\big\{\max\big(\min\big\{\|\widetilde{\Delta y}^a\|^\lambda+\|\hat{\Delta x}^a_-\|^\nu,\ x_j\big\},\ \hat x_j\big),\ x_{\max}\big\},
\]
where $\hat{\Delta x}^a_j = x_j+\widetilde{\Delta x}^a_j$ for $j\in Q$ and $0$ for $j\in\bar Q$, and $(\hat{\Delta x}^a_-)_j = \min\{\hat{\Delta x}^a_j,\,0\}$. Set $\mu^+_Q = (x^+_Q)^Ts^+_Q/k$. For all $j\in\bar Q$, $\hat x_j = \mu^+_Q/s^+_j$ and $x^+_j = \min\{\hat x_j,\,x_{\max}\}$.
------------------------------------------------------------

The key to the global convergence analysis is the availability of a dual-feasible point at every iteration, the definition of the mixing parameter $\gamma$, and the lower-bound condition on the primal updates. A local quadratic convergence analysis follows provided the above conditions hold and the bounds $\bar\alpha^p_Q-\|\widetilde{\Delta y}^a\|$ and $\bar\alpha^d_Q-\|\widetilde{\Delta y}^a\|$ are imposed on the predictor-corrector step in equation (4.17). We provided a brief outline of the convergence proof for the modified redPDAS algorithm in Chapter 3. The proof for the modified redMPC algorithm follows a similar format. We refer the reader to Winternitz et al. [28] for specific details.

4.5 Summary

In this chapter, we introduced the redMPC algorithm and stated how specific changes to it yield a modified algorithm with global and local q-quadratic convergence results. With further research, we hope to prove similar results for the redMPC algorithm itself. In the next chapter, we test the performance of the redPDAS and redMPC algorithms against their counterparts without constraint reduction.

Chapter 5

Numerical Experiments

5.1 Overview

Algorithms redPDAS and redMPC were implemented in MATLAB (v.6.5, R13) and run on a laptop with an Intel(R) Pentium(R) M 1.60GHz processor and 512 MB of RAM.
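As a rough illustration of where the work goes, the following MATLAB sketch carries out the core linear algebra of one iteration of the dual-feasible redMPC algorithm above: forming the reduced normal matrix $A_QS_Q^{-1}X_QA_Q^T$ (the $\Theta(m^2k)$ step), the affine-scaling and centering-corrector directions, and the ratio-test step lengths. It is a simplified sketch, not the experimental code; the working set Q and the mixing parameter (gam below) are taken as given, and a Cholesky factorization is assumed to exist.

    % Core computations of one dual-feasible redMPC iteration -- a sketch.
    AQ = A(:, Q); xQ = x(Q); sQ = s(Q); k = numel(Q);
    d2 = xQ ./ sQ;                                 % diagonal of S_Q^{-1} X_Q
    NQ = AQ * spdiags(d2, 0, k, k) * AQ';          % A_Q S_Q^{-1} X_Q A_Q^T (m x m)
    R = chol(NQ); solveN = @(v) R \ (R' \ v);      % one factorization per iteration
    % Affine-scaling direction.
    dya = solveN(b);
    dsa = -A' * dya;                               % full-length dual direction
    dxaQ = -xQ - d2 .* dsa(Q);
    % Ratio test: largest step in (0,1] keeping the iterate positive.
    step = @(v, d) min([1; -v(d < 0) ./ d(d < 0)]);
    apa = step(xQ, dxaQ); ada = step(sQ, dsa(Q));
    sigmaQ = (1 - min(apa, ada))^3;                % centering parameter
    muQ = (xQ' * sQ) / k;
    % Centering-corrector direction.
    PsiQ = (sigmaQ * muQ - dxaQ .* dsa(Q)) ./ sQ;
    dycc = -solveN(AQ * PsiQ);
    dscc = -A' * dycc;
    dxccQ = PsiQ - d2 .* dscc(Q);
    % Predictor-corrector direction with mixing parameter gam in (0,1].
    dxQ = dxaQ + gam * dxccQ;
    dy  = dya + gam * dycc;
    ds  = dsa + gam * dscc;

Only the k selected columns enter the normal matrix; everything else is vector arithmetic, which is why the formation cost drops from $\Theta(m^2n)$ to $\Theta(m^2k)$.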
Each algorithm was tested against the general version of the algorithm (see Sections 3.2 and 4.2) with reduced normal equations,
\[
\Delta y^a = \big(A_QS_Q^{-1}X_QA_Q^T\big)^{-1}\big[b-AS^{-1}Xr_d\big],\qquad \Delta s^a = -r_d-A^T\Delta y^a,\qquad \Delta x^a = -x-S^{-1}X\Delta s^a.
\]
For simplicity, the general version of the algorithms with this slight modification will be referred to as PDAS and MPC. Algorithm redMPC was also tested against a modified version of the reduced MPC algorithm (ipas35) in Tits et al. [24]. In our numerical experiments, we simply refer to this algorithm as ipas35. The redMPC and ipas35 algorithms require the same initial point condition, $(x,s)>0$, which can easily be accommodated using the algorithm of Section 2.2.2. Unfortunately, the redPDAS algorithm could not be tested with the same starting point as the reduced PDAS (rPDAS) from [24], since redPDAS requires $s>0$ at each iteration while rPDAS requires $s = c-A^Ty$ but not $s>0$ at each iteration. Although the combination $s = c-A^Ty>0$ is required for optimality, an initial point which satisfies both conditions simultaneously is difficult to achieve. Therefore, the redPDAS algorithm was tested only against the PDAS algorithm, using the initial point algorithm of Section 2.2.2.

The parameters for redPDAS and redMPC were chosen as $\beta = .99$ and $\mathrm{ubnd} = \hat k$, where $\hat k$ was selected from 25 linearly spaced values (rounded to the nearest integer) ranging from $3m$ to $n$. The value of $\hat k$ is fixed; however, $k$ varies from iteration to iteration such that $3m\le k\le\hat k\le n$. The stopping criterion was adapted from [17] and is based on the error in the duality gap, $\big|c^Tx-b^Ty\big|/\max\big(c^Tx,1\big)$.

5.2 redPDAS Experiments

Figure 5.1 shows the average time (in seconds) to solve 50 randomly generated problems from test problem TAW4 (with m = 50, n = 20000, and ubnd fixed to n) over varying values of the constant C used in the selection of Q.

Figure 5.1: Performance of the redPDAS algorithm against the PDAS algorithm using test problem TAW4 with m = 50, n = 20000, and ubnd fixed to n. The average time (in seconds) to solve 50 randomly generated problems is shown over varying values of C.

When $C>10^{-5}$, the time for the redPDAS algorithm to solve an LP nearly triples relative to the PDAS algorithm. This is most likely caused by an insufficient number of dual constraints in the working set at later iterations. The subroutine for determining Q guarantees that the set contains at least m indices at each iteration and that its size is between lbnd = min(ubnd, 3m) and ubnd = $\hat k$, where $\hat k$ is an element of 25 linearly spaced values (rounded to the nearest integer) ranging from 3m to n. However, as the algorithm approaches the optimal solution, m of the $x_j/s_j$ ratios tend to infinity, since $s_j\to 0$ for all j in the optimal set. If C is large enough, the set of ratios $x_j/s_j$ greater than $C\cdot\max_j(x_j/s_j)$ (i.e., $j\in Q$) may be too small at early iterations. If too few (or zero) ratios are chosen, the redPDAS algorithm defines Q to contain the indices j from $n-\mathrm{lbnd}+1$ to $n$ of the sorted ratios (the lbnd largest), where lbnd = min(3m, ubnd). Since ubnd = n, m = 50, and n = 20000, the number of dual constraints in the working set may be as small as 150 (out of 20000) in early iterations. This can cause the algorithm to take a large number of iterations before reaching the optimal solution. According to Figure 5.1, the redPDAS algorithm performs best against the PDAS algorithm when C is set to $10^{-8}$.

In the experiments that follow, the redPDAS and PDAS algorithms solve 25 randomly generated problems from each test set (TAW, RAND, and SCSD) over various values of $3m<\hat k\le n$ with C fixed at $10^{-8}$. A comparison of the total CPU time, the CPU time to form and solve the normal matrix, and the mean number of iterations is shown, each plotted against various $\hat k/n$ ratios. In many cases, the graphs are missing data values for small $\hat k/n$. These missing values are caused by one of two scenarios: (i) the algorithm fails to terminate in a "reasonable" amount of time (the maximum time allotted to solve an LP is set to $10^6$ seconds), or (ii) MATLAB generates NaN (not-a-number) values for the solution as a result of the normal matrix becoming numerically singular.
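The parameter grid and stopping test just described amount to only a few lines of MATLAB; a sketch, with epsTol a hypothetical name for the duality-gap tolerance:

    % Experimental parameter grid and stopping test -- a sketch.
    khat_grid = unique(round(linspace(3*m, n, 25)));   % candidate ubnd values
    gapErr = abs(c'*x - b'*y) / max(c'*x, 1);          % relative duality-gap error
    stop = (gapErr <= epsTol);                         % stopping criterion (cf. [17])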
We begin by examining the experiments based on the first set of test problems (TAW1-TAW5).

Figure 5.2: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.3: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW2 (random [normal] constraints) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.4: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.5: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW4 (RandomLP) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.6: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from TAW5 (SIPND) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

For the first test set (TAW), the PDAS algorithm displayed (in general) an increase in CPU time and a constant number of iterations to solve an LP over increasing values of the ratio $\hat k/n$. On the other hand, the redPDAS algorithm remained (in general) constant in CPU time, while its number of iterations fluctuated and often exceeded that of the general algorithm. The time difference between the algorithms became increasingly large as $\hat k/n$ increased, with the redPDAS algorithm outperforming PDAS for every test problem with $\hat k/n>.2$. We now examine the experiments based on the second set of test problems (RAND1-RAND5).
Figure 5.7: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND1 with m = 5, n = 100, and ubnd = $\hat k$ where $15\le\hat k\le 100$.

Figure 5.8: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND2 with m = 12, n = 500, and ubnd = $\hat k$ where $36\le\hat k\le 500$.

Figure 5.9: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND3 with m = 24, n = 780, and ubnd = $\hat k$ where $72\le\hat k\le 780$.

Figure 5.10: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND4 with m = 58, n = 1004, and ubnd = $\hat k$ where $174\le\hat k\le 1004$.

Figure 5.11: Comparison of the redPDAS algorithm versus the PDAS algorithm using 25 randomly generated test problems from RAND5 with m = 75, n = 2016, and ubnd = $\hat k$ where $225\le\hat k\le 2016$.

The performance of the algorithms on the second test set was quite different. The graphs show missing data values for $\hat k/n<.15$. For the remaining $\hat k/n$ values, the PDAS algorithm was generally constant in CPU time and number of iterations. The redPDAS algorithm displayed very erratic behavior on almost all of the test sets in this category. Experiments using data sets RAND4 and RAND5 show more favorable results for the redPDAS algorithm than those using data sets RAND1-RAND3, possibly because $m\ll n$ in those problems. Finally, we examine the experiments based on the Netlib test problems (SCSD1, SCSD6, and SCSD8).

Figure 5.12: Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD1 with m = 77, n = 760, and ubnd = $\hat k$ where $231\le\hat k\le 760$.

Figure 5.13: Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD6 with m = 147, n = 1350, and ubnd = $\hat k$ where $441\le\hat k\le 1350$.
Figure 5.14: Comparison of the redPDAS algorithm versus the PDAS algorithm using the Netlib test problem SCSD8 with m = 397, n = 2750, and ubnd = $\hat k$ where $1191\le\hat k\le 2750$.

For the third test set (Netlib problems), a large portion of the data is missing for the early $\hat k/n$ values. The PDAS algorithm remains consistent in its performance on CPU time and number of iterations on the SCSD problems. In addition, it outperforms the redPDAS algorithm for some $\hat k/n$ values between 0.3 and 0.7. However, the performance of the redPDAS and PDAS algorithms is almost identical in CPU time and number of iterations as $\hat k/n\to 1$. In the next section, we examine experiments based on Mehrotra's Predictor-Corrector method.

5.3 redMPC Experiments

The redMPC experiments test the performance of the redMPC algorithm against the reduced MPC algorithm in Tits et al. [24] (ipas35) and a general MPC algorithm with reduced normal equations (MPC). Figure 5.15 shows the average time (in seconds) to solve 50 randomly generated problems over varying values of C using the test problem TAW4 with m = 50, n = 20000, and $|Q|\in[3m,\mathrm{ubnd}]$ where ubnd is fixed to n.

Figure 5.15: Performance of the redMPC algorithm against the MPC algorithm using test problem TAW4 with m = 50, n = 20000, and $|Q|\in[3m,\mathrm{ubnd}]$ where ubnd is fixed to n. The average time (in seconds) to solve 50 randomly generated problems from TAW4 is shown over varying values of C ranging from $10^{-16}$ to $10^{-1}$.

The redMPC algorithm outperformed the MPC algorithm for all values of C between $10^{-16}$ and $10^{-1}$. In the experiments that follow, the redMPC, MPC, and ipas35 algorithms solve 25 randomly generated problems from each test set (TAW, RAND, and SCSD) over various values of $3m<\hat k\le n$ with C fixed at $10^{-4}$. A comparison of the total CPU time, the CPU time to form and solve the normal matrix, and the mean number of iterations is shown, each plotted against various $\hat k/n$ ratios. We begin by examining the experiments based on the first set of test problems (TAW1-TAW5).

Figure 5.16: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW1 (constraints tangent to the unit sphere) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.17: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW2 (random [normal] constraints) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.
Figure 5.18: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW3 (Random as in WT-18July2003) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.19: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW4 (RandomLP) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

Figure 5.20: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from TAW5 (SIPND) with m = 50, n = 20000, and ubnd = $\hat k$ where $150\le\hat k\le 20000$.

For problem sets TAW2-TAW4, the MPC and ipas35 algorithms displayed (in general) an increase in CPU time and a constant number of iterations to solve an LP over increasing values of the ratio $\hat k/n$. The redMPC algorithm, however, remained (in general) constant in CPU time, while its number of iterations fluctuated and exceeded that of the competition. In problem sets TAW1 and TAW5, the MPC and ipas35 algorithms show an increase in CPU time as $\hat k/n$ increases from .1 to .8, then a small decrease in time for $\hat k/n$ values between .8 and 1. The number of iterations to solve an LP fluctuates for all of the algorithms. In all of the problem sets (TAW1-TAW5), the time difference between the MPC and ipas35 algorithms and the redMPC algorithm became increasingly large as $\hat k/n$ increased, with the redMPC algorithm outperforming the other algorithms in every test set for all $\hat k/n>.2$. We now examine the MPC experiments based on the second set of test problems (RAND1-RAND5).

Figure 5.21: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND1 with m = 5, n = 100, and ubnd = $\hat k$ where $15\le\hat k\le 100$.

Figure 5.22: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND2 with m = 12, n = 500, and ubnd = $\hat k$ where $36\le\hat k\le 500$.
Figure 5.23: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND3 with m = 24, n = 780, and ubnd = $\hat k$ where $72\le\hat k\le 780$.

Figure 5.24: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND4 with m = 58, n = 1004, and ubnd = $\hat k$ where $174\le\hat k\le 1004$.

Figure 5.25: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using 25 randomly generated test problems from RAND5 with m = 75, n = 2016, and ubnd = $\hat k$ where $225\le\hat k\le 2016$.

The performance of the algorithms on the second test set was quite different from the performance on the TAW test set. The graphs show missing data values for $\hat k/n<.15$. For the remaining $\hat k/n$ values, the performance of the algorithms varied. Unlike the redPDAS algorithm, the redMPC algorithm generally solved an LP in less time than the MPC and ipas35 algorithms. In fact, as the problem size increased, the gap in total CPU time between the MPC and ipas35 algorithms and the redMPC algorithm increased. With the exception of the RAND1 problem set, the same trend occurred for the CPU time to form and solve the normal matrix. In all of the problem sets, the algorithms generally solved an LP in the same number of iterations for $\hat k/n$ close to 1. For all other $\hat k/n$ values, the number of iterations to solve an LP varied for each algorithm. Lastly, we examine the experiments based on the Netlib test problems (SCSD1, SCSD6, and SCSD8).

Figure 5.26: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD1 with m = 77, n = 760, and ubnd = $\hat k$ where $231\le\hat k\le 760$.

Figure 5.27: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD6 with m = 147, n = 1350, and ubnd = $\hat k$ where $441\le\hat k\le 1350$.
Figure 5.28: Comparison of the redMPC algorithm versus the MPC and ipas35 algorithms using the Netlib test problem SCSD8 with m = 397, n = 2750, and ubnd = $\hat k$ where $1191\le\hat k\le 2750$.

For the third test set (Netlib problems), a large portion of the data is missing for the early $\hat k/n$ values. The redMPC and MPC algorithms outperform ipas35 in CPU time. As $\hat k/n$ increases, the time to solve an LP with ipas35 increases, whereas the time to solve an LP with the other algorithms is small and fixed. The number of iterations to solve test problem SCSD8 for the redMPC algorithm is somewhat larger than for the other algorithms. For test problems SCSD1 and SCSD6, the number of iterations is almost identical as $\hat k/n\to 1$.

5.4 Discussion

The experiments in the previous sections demonstrate that the redPDAS and redMPC algorithms, in general, outperform their general counterparts in terms of CPU time. The redMPC and MPC algorithms also perform well against ipas35. Furthermore, with the exception of a few cases (i.e., the small values of $\hat k/n$ where the algorithm(s) did not display information), the algorithms in this chapter outperformed MATLAB's LP solver, LINPROG, in total CPU time. In this section, we offer reasons for, or speculation about, a few of the outcomes.

Figures 5.1 and 5.15 show the average time (in seconds) to solve 50 randomly generated problems over varying values of C. These experiments were conducted using the test problem TAW4 with m = 50, n = 20000, and ubnd fixed to n. It was shown that the "reduced" algorithms performed exceptionally well against their general counterparts. We also performed these experiments on other test problems with $m\ll n$. Although there was some fluctuation in the best value of C, it remained within 1/100 of the value we found in each case. Therefore, we felt confident that the "reduced" algorithms would perform well on other test problems with $m\ll n$ based on our specific choice of C, determined by one of the problem sets used in this thesis, TAW4.

The redPDAS and redMPC algorithms (in general) displayed small CPU times and a large number of iterations over the various $\hat k/n$ values. The small CPU times can be attributed to the initial point calculation, the formation of "reduced" matrices and vectors, and temporary matrices/vectors defined for $A_Q$, $D_Q$, and the dual residual, $(r_d)_Q$. On the other hand, the large number of iterations is the result of the update strategy used for the dual residual and solution. Since a small number of constraints ($\ll n$) is chosen at each iteration, the algorithms have the potential to cycle through a large number of solutions before reaching the optimal solution.

MATLAB's LP interior-point solver, LIPSOL, is capable of solving a wide range of problems including large-scale LPs; see [33], [34]. On our test problems (i.e., problems with $m\ll n$), the "reduced" algorithms outperformed LIPSOL in CPU time. In experimentation, we discovered that by first converting the sparse matrix A to a full matrix and then using a QR factorization of A to solve for the initial point, the time to form the initial point is significantly reduced. Prior to this conversion, approximately 80% of the CPU time was devoted to calculating the initial point, causing the time to solve with the "reduced" algorithms to exceed the time for LIPSOL to solve the same LP.
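A sketch of that initial-point computation: one economy-size QR factorization of the densified $A^T$ yields both the least-norm $\tilde x$ and the least-squares $\tilde y$ without forming $AA^T$. The positivity shift below follows Mehrotra's heuristic [17]; the exact shift used in our code may differ in detail.

    % Mehrotra-style starting point via a QR factorization of A' -- a sketch
    % (A is m x n with full row rank).
    [Qf, Rf] = qr(full(A'), 0);          % economy QR: A' = Qf*Rf, Rf is m x m
    xt = Qf * (Rf' \ b);                 % least-norm solution of A*x = b
    yt = Rf \ (Qf' * c);                 % least-squares solution of A'*y ~ c
    st = c - A' * yt;
    dx = max(-1.5 * min(xt), 0);         % shift amounts into the positive orthant
    ds = max(-1.5 * min(st), 0);
    x0 = xt + dx; s0 = st + ds; y0 = yt; % componentwise positive starting point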
All of the algorithms (including LIPSOL) factor the normal matrix using MATLAB's Cholesky-Infinity factorization (the cholinc function). This is necessary to preserve full rank and calculate efficient search directions.

5.5 Summary

In this chapter, we have demonstrated that algorithms redPDAS and redMPC are more effective than their general counterparts in solving LPs with $m\ll n$. This is clearly shown by the results displayed using the TAW problem sets (with m = 50 and n = 20000). In the second and third test sets, the difference between the m and n values was not as large. As a result, the general algorithms performed just as well and in some cases even better than the "reduced" algorithms. The ipas35 algorithm was least effective in solving LPs compared to the redMPC and MPC algorithms.

Chapter 6

Conclusions and Further Study

We have presented the redPDAS algorithm and the redMPC algorithm, new column generation algorithms for solving linear programming problems (LPs). Experiments conducted on these algorithms demonstrate that they are, in general, more effective for solving LPs with $m\ll n$ than their general counterparts (PDAS and MPC). Although there is no known convergence proof at this time for the redPDAS and redMPC algorithms, we believe our column generation strategy will prove to be beneficial in solving other optimization problems as well.

In this dissertation, we demonstrated that by reducing the cost of forming the $m\times m$ matrix $AD^2A^T$, an essential component in computing the search direction for large-scale LPs, we can reduce the amount of computation per iteration and significantly reduce the time to solve an LP problem. This was illustrated by the results of the experiments conducted using the random problem sets from TAW. The time difference between the "reduced" and general algorithms became increasingly large as $k/n$ increased, with the "reduced" algorithms outperforming their general counterparts on every test problem from this set with $k/n>.2$. Although the total number of iterations to solve the "reduced" LP often exceeded that of the general problem, the computational cost per iteration of the "reduced" problem was significantly lower due to its size. The "reduced" algorithms were not as effective in reducing solution time for the other problem sets. In these problem sets, the difference between m and n was not significantly large. As a result, the general algorithms performed just as well and in some cases even better than the "reduced" algorithms.

Based on the results of the experiments conducted using the random problem sets with $m\ll n$, we believe our column generation scheme can be applied to other algorithms with similar success. One algorithm we have begun to investigate is Potra's algorithm [22]. Potra's algorithm is of particular interest because it allows for a primal-dual infeasible starting point and is proven to have polynomial complexity under certain conditions. Potra uses a three-step method to improve feasibility, optimality, and centrality of the iterates. He combines the two steps of Mizuno-Todd-Ye [18] with a third to achieve feasibility and optimality at the same rate. We developed a "reduced" version of Potra's algorithm called redPC, presented in Appendix B. The redPC algorithm also uses this three-step method; however, feasibility and optimality may not necessarily be achieved at the same rate.
Despite this, we aim to show global convergence and investigate the complexity of our algorithm.

In addition to further investigation of the redPC algorithm, there are computational issues that can be researched in all of the "reduced" algorithms. One topic of research is the method by which the set Q is chosen. We chose Q based on the $k\in[m,n]$ largest $x_j/s_j$ ratios at each iteration. Tits et al. [24], however, select a fixed-size index set Q associated with the most "promising" dual constraints based on the smallest slack values, $s_j$, at each iteration. Dantzig and Ye [6] start with m constraints and consider building up the constraint set based on the index set associated with all $s_j<2\varepsilon$ at each iteration. There are a number of ways to select the index set Q, and determining whether one technique saves time and computational effort over the others is worth investigating. Another topic of investigation is the update strategy used within our "reduced" algorithms. Our update strategy is based on the amount of progress we are making toward satisfying dual feasibility. However, Potra's algorithm uses an update strategy based on improving feasibility, optimality, and centrality of the iterates at the same rate. Experimentation on the redPC algorithm may show a reduction in the total number of iterations using this update concept. One other focus for research is determining a "good" initial point. The initial point we used in the redPDAS and redMPC algorithms is the same initial point used in Mehrotra's algorithm [17]. However, there are many papers in the field of interior-point methods that concentrate on starting-point strategies (see [7] or [14]). Our algorithms perform quite well overall against standard algorithms, and we may be able to improve our results by using a better starting point.

Finally, the use of interior-point methods in column generation settings has been extended to applications in semidefinite programming problems, network design problems, and second-order cone programming problems, to name a few. We believe our strategy will be very beneficial for solving other LPs and eventually for solving more general problems.

Appendix A

Topics in Numerical Optimization

A.1 Steepest-Descent Method

Consider the n-dimensional problem
\[
\min_{x\in\mathbb{R}^n}\ \{f(x) : x\in S\} \tag{A.1}
\]
where S is the set of feasible points defined by a set of constraints (or all of $\mathbb{R}^n$). If (A.1) is optimal at the current solution x, then x solves the problem. Otherwise, assume x is feasible and let $x^+ = x+\alpha\Delta x$ be the updated solution to (A.1) with search direction $\Delta x$ and step length $\alpha>0$. Using only the first two terms of the Taylor series, we can approximate $f(x^+)$ by
\[
f(x^+) = f(x+\alpha\Delta x)\approx f(x)+\alpha\,\Delta x^T\nabla f(x),
\]
where $\nabla f(x)$ is the gradient of f at x. Therefore, for small $\alpha$, $f(x+\alpha\Delta x)<f(x)$ whenever $\Delta x^T\nabla f(x)<0$; the steepest-descent choice $\Delta x = -\nabla f(x)$ guarantees this. A step length $\alpha>0$ is then computed to determine $x^+$ so that $x^+\in S$.

In the case of a linear program
\[
\min_{x\in\mathbb{R}^n}\ \big\{c^Tx : Ax = b,\ x\ge 0\big\}, \tag{A.2}
\]
where A is an $m\times n$ matrix of full row rank, the steepest-descent method computes $\Delta x = -c$ as the search direction if the current solution is not optimal. Assuming the current solution x is feasible, the updated solution $x^+$ is determined so as to maintain feasibility (i.e., $x^+\in S = \{x : Ax = b,\ x\ge 0\}$). Therefore we require
\[
Ax^+ = Ax+\alpha A\Delta x = b \tag{A.3}
\]
and
\[
x^+ = x+\alpha\Delta x>0. \tag{A.4}
\]
The equations in (A.3) imply that we need $A\Delta x = 0$, since $x\in S$ and $\alpha>0$. That is, $\Delta x$ must lie in the null space of A.
By premultiplying ?x = ?c by the n?n matrix PA = I ?AT parenleftbigAATparenrightbig?1A we have a projected steepest descent direction (i.e. ?x = ?PAc) which simultane- ously decreases the objective function value and satisfies A?x = 0. The matrix PA has the effect of projecting any nonzero vector in Rfracturn onto the null space of A. Further discussion of this topic is given in the next section. The inequalitiesin (A.4) are satisfied by computing an appropriate step length ?> 0 along the search direction ?x= ?PAc. A.2 Orthogonal Projection Matrix The n?n matrix PA =I ?AT parenleftbigAATparenrightbig?1A, 135 which was presented in the previous section, is also termed an orthogonal projection matrix. Its name comes from the fact that premultiplying any n-dimensional vector x byPA produces an orthogonal projection of x onto the null space of A. See Figure A.1. Figure A.1: A n-dimensional vectorx is projected onto the null space of A [denoted N(A)] by the orthogonal projection matrix P = I ?AT(AAT)?1A. The orthogonal projection matrix is symmetric(i.e. PTA = PA) and idempotent (i.e. P2A = PA). In addition, it can be formed from any m?n matrix of full row rank. For example, the matrix PAD = I?DAT parenleftbigAD2ATparenrightbig?1AD, is an orthogonal projection matrix into the null space of AD where D is a diagonal matrix. 136 Appendix B The redPC Algorithm Potra?s three-step method involves solving three linear systems to improve feasibility, optimality, and centrality of the iterates. Our method follows Potra?s method by solving three ?reduced? linear systems inan attempt to reachthe optimal solution for the primal-dual pair. In solving the first linear system, we aim to improve optimality while keeping feasibility the same. In the second linear system, we strive to improve feasibility while keeping optimality the same. Finally, in the third linear system, we aim to ?centralize? the iterates for improved movement within the next iteration. Our ?reduced? version is similar to the redPDAS and redMPC algorithms in that we consider only those columns (components) of a matrix (vector) associated with Q. B.1 Background Let rp =Ax?b and rd = ATy+s?c be the primal and dual residuals, respec- tively. The searchdirectioncomputedtoimproveoptimalityor decrease?isgivenby ? ?? ?? ?? ? A 0 0 0 AT I S 0 X ? ?? ?? ?? ? ? ?? ?? ?? ? ?xa ?ya ?sa ? ?? ?? ?? ? = ? ? ?? ?? ?? ? 0 0 XSe ? ?? ?? ?? ? . (B.1) 137 The expressions for the components of ?za = (?xa,?ya,?sa) are given by ?ya = (AS?1XAT)?1Ax, ?sa = ?AT?ya, ?xa = ?x?S?1X?sa. This is precisely the affine-scaling direction in (3.9) with rp = rd = 0. On the other hand, the search direction computed to improve feasibility or decrease the residuals is given by ? ?? ?? ?? ? A 0 0 0 AT I S 0 X ? ?? ?? ?? ? ? ?? ?? ?? ? ?xa ?ya ?sa ? ?? ?? ?? ? = ? ? ?? ?? ?? ? rp rd 0 ? ?? ?? ?? ? . (B.2) This is the affine-scaling direction in (3.9) with XSe = 0. Solving for ?za = parenleftbig?xa,?ya,?saparenrightbig, we have ?ya = ?(AS?1XAT)?1(rp +AS?1Xrd), ?sa = ?rd ?AT?ya, ?xa = ?S?1X?sa. The sum of ?za and ?za is the affine-scaling direction. Let z = (x,y,s). Step lengths ?? and ?? are chosen so that the point ?z = z+ ???za + ???za ?N? where N? = braceleftBig (tildewidex,tildewides) :tildewidex>0,tildewides> 0, vextenddoublevextenddouble vextenddoubletildewideXtildewides?tildewide?e vextenddoublevextenddouble vextenddouble??tildewide? bracerightBig 138 with 0 epsilon1,bardblrpbardbl>epsilon1, or bardblrdbardbl>epsilon1 Compute affine-scaling directions: I. 
Compute ?za = (?xa,?sa,?ya): ?ya = parenleftbigAS?1XATparenrightbig?1Ax, ?sa = ?AT?ya, ?xa = ?x?S?1X?sa. 140 II. Compute ?za =parenleftbig?xa,?sa,?yaparenrightbig : ?ya = ?(AS?1XAT)?1(rp +AS?1Xrd), ?sa = ?rd ?AT?ya, ?xa = ?S?1X?sa. Compute affine steps: Set ? = (?x a)T ?sa + (?sa)T ?xa n? , ? = parenleftbig?xaparenrightbigT ?sa n? , and compute ?? = minparenleftbigbraceleftbig?? (0,1) : ??R(?) ?parenleftbig?2?2 ?bardblfbardbl2parenrightbigbracerightbig?{1}parenrightbig, ?? = parenleftbigg1+??? 1???? parenrightbigg ??. Compute ?z ?N?: Compute ?z = z+ ???za+ ???za. If ?? = 1, then ?z ?F? (Optimal solution set). Terminate. Compute centering-corrector direction: ?ycc = (AtildewideS?1 tildewideXAT)?1A parenleftBig tildewidex?tildewide?tildewideS?1e parenrightBig , ?scc = ?AT?ycc, ?xcc = ?tildewidex+ tildewideS?1 parenleftBig tildewide?e? tildewideX?scc parenrightBig . Update solution Compute z+ = ?z+ ?zcc. 141 end(while) ???????????????- B.3 The redPC Algorithm The redPC algorithm is simply a ?reduced? version of Potra?s algorithm. This algorithm is similar to the redPDAS and redMPC algorithms in that we consider only those columns (components) of a matrix (vector) associated with Q. We let ?rp = AQxQ ?b be the primal residual based on Q and define ??? to be a parameter such that 0 ? ??? < ??? 1. The redPC is presented below: ???????????????- Input: (x,y,s) withx = ?e? bardblA?bbardbl? e, s= ?e? bardblcbardbl? e, andy = 0;mepsilon1,bardblrpbardbl>epsilon1, or bardblrdbardbl>epsilon1 Select the most promising dual constraints. Compute affine-scaling directions: 142 I. Compute ?zaQ =parenleftbig?xaQ,?saQ,?yaparenrightbig ?ya = parenleftbigAQS?1Q XQATQparenrightbig?1AQxQ, ?saQ = ?ATQ?ya, ?xaQ = ?xQ ?S?1Q XQ?saQ. II. Compute ?zaQ =parenleftbig?xaQ,?saQ,?yaparenrightbig ?ya = ?parenleftbigAQS?1Q XQATQparenrightbig?1bracketleftbig?rp +AQS?1Q XQ(rd)Qbracketrightbig, ?saQ = ?(rd)Q ?ATQ?ya, ?xaQ = ?S?1Q XQ?saQ. Compute affine steps: Set ? = parenleftbig?xa Q parenrightbigT ?sa Q + parenleftbig?sa Q parenrightbigT ?xa Q n? , ? = parenleftbig?xa Q parenrightbigT ?sa Q n? , ? = parenleftbig?xa Q parenrightbigT ?sa Q n? , ? = (1? ?? ?)xT? Q?s a ?Q n? , ? = x T QsQ n? , and compute ?? = minparenleftbigbraceleftbig?? (0,1) : ?? ?S(?) ??4parenleftbig?2?2 ?bardblfbardbl2parenrightbigbracerightbig?{1}parenrightbig, ??? = ???, ?? = parenleftbigg1 +? + [??(1??)?] ?? ????? parenrightbigg ??. 143 Compute ?z: Set ?xa?Q = 0, ?xa?Q = ?????? x?Q, ?sa?Q = ?AT?Q?ya, ?sa?Q = (rd)?Q ?AT?Q?ya and compute ?z = z + ???za + ???za. If ?? = 1, then ?z ? F? (Optimal solution set). Terminate. Compute centering-corrector direction: ?ycc = (AQtildewideS?1Q tildewideXQATQ)?1AQ parenleftBig tildewidexQ ?tildewide?tildewideS?1Q eQ parenrightBig , ?sccQ = ?ATQ?ycc, ?xccQ = ?tildewidexQ + tildewideS?1Q parenleftBig tildewide?eQ ? tildewideXQ?sccQ parenrightBig . Update solution Set ?xcc?Q = 0, ?scc?Q = 0 and compute z+ = ?z+ ?zcc. end(while) ???????????????- In Potra?s algorithm, the affine-scaling directionsare computedin two separate steps to improve feasibility and optimality. Using the separate components of the affine-scaling direction, Potra derives and incorporates the affine steps, ??and ??, into an intermediate solution (?z = z+ ???za + ???za) to determine optimality. Although the details surrounding the affine steps have been eliminated in this thesis, they can be easily followedin Potra [22]. If the solution is not optimal, the centering-corrector direction is computed to produce a new solution. 
There are signficant differences between Potra?s algorithm and the redPC al- gorithm. The main difference is the ?reduced? size of the matrices and vectors based on the set Q. Reducing the size of the data, however, drastically increased 144 the amount of mathematical computation and detail in deriving the affine steps for the intermediate solution. Although the redPC algorithm can be shown to exhibit global convergence, we have been unsuccessful in proving polynomial complexity. Since the components of the directions in ?Q can be defined in different ways based on the choice of ???, variants of the redPC algorithm will be considered in future research. 145 BIBLIOGRAPHY [1] M. Achache, H. Roumili, and A. Keraghel. A numerical study of an infeasible primal-dual path-following algorithm for linear programming. Applied Mathe- matics and Computation, 186:1472?1479, 2007. [2] Various Authors. Netlib lp test problems. http://www-fp.mcs.anl.gov/OTC/ Guide/TestProblems/LPtest/, 1985. Retrieved July 2004. [3] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory and Algorithms 2nd Ed. John Wiley and Sons, Hobeken, NJ, 1993. [4] DimitrisBertsimasand John N. Tsitsiklis. Introduction to Linear Optimization. Athena Scientific, Belmont, MA, 1997. [5] George B. Dantzig. Linear Programming and Extensions. Princeton University Press, Princeton, NJ, 1963. [6] George B. Dantzig and Yinyu Ye. A build-up interior-point method for linear programming: Affinescalingform. Working Paper, Department of Management Science, University of Iowa, 1991. [7] Michael Gertz, Jorge Nocedal, and Annick Sartenaer. A starting-point strategy for nonlinear interior methods. Applied Math Letters, 17:945?952, 2004. [8] P.C. Gilmore and R.E. Gomory. A linear programming approach to the cutting stock problem. Operations Research, 9:849?859, 1961. [9] J.-Louis Goffin, Z.-Quan Luo, and Y. Ye. On the Complexity of a Column Generation Algorithm for Convex or Quasiconvex Feasibility Problems. Large Scale Optimization: State of the Art, W. W. Hager, D. W. Hearn and P.M. Pardalos, Editors, Kluwer Academic Publishers B.V, 1994. [10] G. H. Golub and C. F. Van Loan. Matrix Computations. John Hopkins Uni- versity Press, Baltimore, MD, 1989. [11] D.den Hertog, Cornelis Roos, and T?amas Terlaky. A build-up variant of the logarithmic barrier method for lp. Operations Research Letters, 12:181?186, 1992. [12] D.den Hertog, Cornelis Roos, and Tamas Terlaky. Adding and deleting con- straints in the logarithmic barrier method for lp. Advances in Optimization and Approximation, pages 166?185, 1994. [13] Wikipedia Foundation Inc. Matrix norm. http://en.wikipedia.org/wiki/ Matrix\_norm, 2002. Retrieved May 2007. 146 [14] Elizabeth John and E. Alper Yildirim. Implementation of warm-start strate- gies in interior-point methods for linear programming in fixed dimension. http://www.optimization-online.org/DB_FILE/2006/05/1389.pdf, 2006. Retrieved April 2008. [15] Narendra Karmarkar. A new polynomial-time algorithm for linear program- ming. Combinatorica, 4:373?395, 1984. [16] Masakazu Kojima, Nimrod Megiddo, and Shinji Mizuno. Polynomiality of infeasible-interior-point algorithms for linear programming. Mathematical Pro- gramming, 67:109?119, 1994. [17] Sanjay Mehrotra. On the implementation of a primal-dual interior point method. SIAM Journal on Optimization, 2:575?601, 1992. [18] S. Mizuno, Michael J. Todd, and Yinyu Ye. On adaptive-step primal-dual interior-point algorithms for linear programming. 
Mathematical Operations Re- search, 18:964?981, 1993. [19] Renato D.C. Monteiro, Ilan Adler, and Mauricio G.C. Resende. A polynomial- time primal-dual affine scaling algorithm for linear and convex quadratic pro- gramming and its power series extension. SIAM Journal on Optimization, 2(1):7?20, 1990. [20] Stephen G. Nash and Ariela Sofer. Linear and Nonlinear Programming. McGraw-Hill, New York, NY, 1996. [21] E.R. Panier, Andr?e L. Tits, and J.N. Herskovits. A qp-free globally convergent, globally superlinearlyconvergent algorithm for inequalityconstrainedoptimiza- tion. SIAM Journal on Control and Optimization, 26(4):788?811, 1988. [22] Florian A. Potra. An infeasible-interior-point predictor-corrector algorithm for linear programming. SIAM Journal on Optimization, 6:19?32, 1996. [23] Andr?e L. Tits. An interior point method for linear programming, with an active set flavor. technicalreport TR-99-47, Institute for SystemsResearch, University of Maryland, College Park, MD, 1999. [24] Andr?e L. Tits, P.-A. Absil, and William P. Woessner. Constraint reduction for linear programs with many inequality constraints. SIAM Journal on Optimiza- tion, 17:119?146, 2006. [25] Andr?e L. Tits and J. L. Zhou. A simple, quadratically convergent algorithm for linear and convex quadratic programming. In W.W. Hager, D.W. Hearn, and P.M. Pardalos, editors, Large Scale Optimization: State of the Art, pages 411 ? 427. Kluwer Academic Publishers, 1994. [26] Robert J. Vanderbei. Linear Programming: Foundations and Extensions. Kluwer Academic Publishers, Boston, MA, 1996. 147 [27] David S. Watkins. Fundamentals of Matrix Computations. Wiley, New York, NY, 2002. [28] Luke B. Winternitz, Stacey O. Nicholls, Andr?e L. Tits, and Dianne P. O?Leary. A constraint-reduced variant of the Mehrotra predictor-corrector algo- rithm. http://www.optimization-online.org/ARCHIVE\_DIGEST/2007-07. html, 2007. Retrieved October 2007. [29] Steven J. Wright. Primal-Dual Interior-Point Methods. SIAM, Philadelphia, PA, 1997. [30] Yinyu Ye. A build-down scheme for linear programming. Mathematical Pro- gramming, 46:61?72, 1990. [31] Yinyu Ye. A potential reduction algorithm allowing column generation. SIAM Journal on Optimization, 2(1):7?20, 1992. [32] Yinyu Ye. Complexity analysis of the analytic center cutting plane method that uses mulitple cuts. Mathematical Programming, 78:85?104, 1997. [33] Yin Zhang. Solving large-scale linear programs by interior-point methods under the MATLAB environment. technical report TR96-01, Department of Math- ematics and Statistics, University of Maryland Baltimore County, Baltimore, MD, 1996. [34] Yin Zhang. User?s guide to LIPSOL linear-programming interior point solvers v0.4. Optimization Methods and Software, 11 & 12:385?396, 1999. 148