A polynomial-time interior-point method for conic optimization, with inexact barrier evaluations*

Simon P. Schurr†    Dianne P. O'Leary‡    André L. Tits§

University of Maryland Technical Report CS-TR-4912 & UMIACS-TR-2008-10

April 2008

*This work was supported by NSF grant DMI-0422931 and DoE grant DE-FG02-04ER25655. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the National Science Foundation or those of the US Department of Energy.
†Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1; email: spschurr@uwaterloo.ca.
‡Computer Science Department and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742; email: oleary@cs.umd.edu.
§Department of Electrical and Computer Engineering and Institute for Systems Research, University of Maryland, College Park, MD 20742; email: andre@umd.edu. Correspondence should be addressed to this author.

Abstract

We consider a primal-dual short-step interior-point method for conic convex optimization problems for which exact evaluation of the gradient and Hessian of the primal and dual barrier functions is either impossible or prohibitively expensive. As our main contribution, we show that if approximate gradients and Hessians of the primal barrier function can be computed, and the relative errors in such quantities are not too large, then the method has polynomial worst-case iteration complexity. (In particular, polynomial iteration complexity ensues when the gradient and Hessian are evaluated exactly.) In addition, the algorithm requires no evaluation, or even approximate evaluation, of quantities related to the barrier function for the dual cone, even for problems in which the underlying cone is not self-dual.

Keywords. Convex cone, conic optimization, self-concordant barrier function, interior-point method, inexact algorithm, polynomial complexity.

1 Introduction

Interior-point methods (IPMs) are the algorithms of choice for solving many convex optimization problems, including semidefinite programs (SDPs) and second-order cone programs (SOCPs) (e.g., [NN94, BTN01, BV04]). The bulk of the work at each iteration of an IPM consists of the evaluation of the first and second derivatives of a suitably selected barrier function and the solution of the linear system of equations formed by Newton's method. The most efficient algorithms require barrier functions for both the primal and dual problems.

The success of IPMs on SDPs and SOCPs has not been matched for more general classes of conic convex optimization problems. The primary hindrance is the difficulty of explicitly evaluating the derivatives of the associated barrier functions, particularly those for the dual problem. The construction of the universal barrier function [NN94] provides a definition for these derivatives but does not necessarily prescribe a polynomial-time evaluation procedure.

Even for well-studied problem classes, evaluation of the derivatives can be quite expensive. For example, in SDP, where the variables are symmetric positive semidefinite matrices, derivatives of the barrier function are usually evaluated by computing inverses or Cholesky decompositions of matrices.
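To make this cost concrete, the following is a minimal sketch (ours, for illustration only; the formulas $F'(X) = -X^{-1}$ and $F''(X)[H] = X^{-1}HX^{-1}$ for the standard SDP barrier $F(X) = -\log\det X$ are classical) of what exact derivative evaluation entails in the SDP case. The $O(n^3)$ dense factorization is precisely the step one might be tempted to replace by a cheaper, approximate one.

```python
import numpy as np

def sdp_barrier_derivatives(X):
    """Exact derivatives of F(X) = -log det X: the gradient is -inv(X)
    and the Hessian acts on a direction H as inv(X) @ H @ inv(X).
    The dense Cholesky factorization costs O(n^3)."""
    L = np.linalg.cholesky(X)        # raises LinAlgError unless X is PD
    L_inv = np.linalg.inv(L)
    X_inv = L_inv.T @ L_inv          # X^{-1} = L^{-T} L^{-1}
    return -X_inv, (lambda H: X_inv @ H @ X_inv)
```

Replacing the factorization by an approximate or stale one yields inexact values of the gradient and Hessian, which is the situation analyzed in this paper.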
Since these linear algebra tasks can be expensive when the matrices are large or dense, it might be preferable to instead compute approximate inverses or approximate Cholesky decompositions, and hence an approximate gradient and Hessian of the barrier function, if doing so does not greatly increase the number of IPM iterations.

Therefore, in this work, we consider applying a simple primal-dual short-step IPM to conic convex optimization problems for which exact evaluation of the gradient and Hessian is either impossible or prohibitively expensive. Note that this algorithm does not require evaluation of the gradient and Hessian of the dual barrier function. (This property is shared by an algorithm recently proposed by Nesterov [Nes06].) As our main contribution, we show that if estimates of the gradient and Hessian of the primal barrier function are known, and their relative errors are not too large, then the algorithm has the desirable property of polynomial worst-case iteration complexity.

There is an extensive body of literature on inexact IPMs for linear optimization problems, and a smaller body of work on inexact methods for nonlinear convex optimization problems. We note the paper [Tun01], in which it was shown that a simple primal-dual potential reduction method for linear optimization problems can in principle be applied to conic optimization problems. It is shown there how the assumptions made on the availability of a self-concordant barrier function and its derivatives are crucial to the convergence and complexity properties of the method. The inverse Hessian of an appropriate self-concordant barrier function is approximated by a matrix that may be updated according to a quasi-Newton scheme. We take a different approach to such approximations.

In the IPM literature, the term "inexact algorithm" typically refers to algorithms in which approximate right-hand sides of the linear system are computed. In some schemes approximate coefficient matrices are also used. The inexact algorithm we consider here can be interpreted in this way, although our primary concern is to estimate the gradient and Hessian of a suitable self-concordant barrier function, rather than the coefficient matrix and right-hand side of the linear system per se. In, e.g., [MJ99, FJM99, Kor00], infeasible-point methods for linear optimization are presented in which the linear system is solved approximately. In [LMO06] an inexact primal-dual path-following method is proposed for convex quadratic optimization problems. In [BPR99] an inexact algorithm is presented for monotone horizontal linear complementarity problems. (This class of complementarity problems includes as special cases linear optimization problems and convex quadratic optimization problems.) Inexact methods for other classes of nonlinear convex problems have also been studied. In [ZT04] an inexact primal-dual method is presented for SDP. We also mention the paper [TTT06], in which an inexact primal-dual path-following algorithm is proposed for a class of convex quadratic SDPs; only the equations resulting from the linearization of the complementarity condition are perturbed, and a polynomial worst-case complexity result is proven. Another source of inexactness is the use of an iterative method such as the preconditioned conjugate gradient method to solve the linear system; this has been proposed (e.g., in [NFK98, KR91, Meh92, CS93, GMS+86, WO00]) and analyzed (e.g., in [BCL99, BGN00, GM88]) for certain classes of convex optimization problems.
This paper is organized as follows. We first present some relevant properties of barrier functions in Section 2. In Section 3 we consider a primal-dual short-step algorithm that uses inexact values of the primal barrier gradient and Hessian. We show in Section 4 that if the gradient and Hessian estimates are accurate enough, then the algorithm converges in a polynomial number of iterations. Finally, Section 5 is devoted to a few concluding remarks. Much of the work reported here is derived from the dissertation [Sch06]. Additional details, including results obtained by assuming the errors in the gradient and Hessian approximations are dependent, which can occur in practice, may be found there.

2 Notation and Preliminaries

A cone $K$ is a nonempty set such that $\alpha x \in K$ for all scalars $\alpha \ge 0$ and all $x \in K$. Hence $K$ always includes the origin. A cone that is closed, convex, solid (i.e., has nonempty interior), and pointed (i.e., contains no lines) is said to be full. (Some authors call such a cone proper or regular.) Given an inner product $\langle \cdot,\cdot\rangle$ on $\mathbb{R}^n$, the dual of a set $S$ is given by $S^* = \{y : \langle x,y\rangle \ge 0\ \forall x \in S\}$. It can be easily verified that the dual of any set is a closed convex cone, so we may refer to $S^*$ as the dual cone of $S$. Denote by $\|\cdot\|$ the vector norm induced by $\langle\cdot,\cdot\rangle$, and given a linear mapping $A$, let $\|A\|$ be the operator norm induced by this vector norm: $\|A\| = \sup_{x \ne 0} \|Ax\|/\|x\|$. Note that for any inner product, $\|A\|$ is the largest singular value of $A$, i.e., $\|A\|$ is the spectral norm. The adjoint of $A$ is denoted by $A^*$, and its inverse by $A^{-*}$. It will be clear from the context whether $*$ denotes a dual cone or an adjoint map.

Given a set $S \subseteq \mathbb{R}^n$ and a smooth function $F : S \to \mathbb{R}$, let $F^{(k)}(x)$ denote the $k$th derivative of $F$ at $x \in \mathrm{int}(S)$. Following standard practice in the optimization literature, given directions $h_1, \dots, h_j$, let $F^{(k)}(x)[h_1, \dots, h_j]$ denote the result of applying $F^{(k)}(x)$ successively to $h_1, \dots, h_j$. For example,

  $F''(x)[h_1,h_2] = (F''(x)h_1)h_2 = \langle F''(x)h_1, h_2\rangle \in \mathbb{R}$,   (2.1)

and

  $F'''(x)[h_1,h_2] = (F'''(x)h_1)h_2 \in L(\mathbb{R}^n)$,   (2.2)

where $L(\mathbb{R}^n)$ is the space of linear functionals over $\mathbb{R}^n$.

In the remainder of this section, we establish properties of barrier functions needed in our analysis.

2.1 Self-concordant barrier functions

Let $S \subseteq \mathbb{R}^n$ be a closed convex set with nonempty interior. The function $F : \mathrm{int}(S) \to \mathbb{R}$ is said to be a $\vartheta$-self-concordant barrier function (or simply a self-concordant barrier) for $S$ if it is closed (i.e., its epigraph is a closed set), is three times continuously differentiable, and satisfies

  $|F'''(x)[h,h,h]| \le 2\,(F''(x)[h,h])^{3/2} \qquad \forall x \in \mathrm{int}(S),\ h \in \mathbb{R}^n$,   (2.3)

and

  $\vartheta = \sup_{x \in \mathrm{int}(S)} \lambda(F,x)^2 < \infty$, where $\lambda(F,x) = \inf_t \{t : |F'(x)[h]| \le t\,(F''(x)[h,h])^{1/2}\ \forall h \in \mathbb{R}^n\}$.

(The quantity $\lambda(F,x)$ is often referred to as the Newton decrement.) If furthermore $F''(x)$ is positive definite for all $x \in \mathrm{int}(S)$, then $F$ is said to be nondegenerate. Note that property (2.3) implies the convexity of $F$, and that convexity and closedness of $F$ imply that $F$ satisfies the barrier property, i.e., for every sequence $\{x_i\} \subset \mathrm{int}(S)$ converging to a boundary point of $S$, $F(x_i) \to \infty$. (See [Nes04, Theorem 4.1.4] for details.) It is known that a self-concordant barrier for a full cone is necessarily nondegenerate [Nes04, Theorem 4.1.3], but for the sake of clarity we will sometimes refer to such functions as nondegenerate self-concordant barriers. The quantity $\vartheta$ is known as the complexity parameter of $F$. If $S \subseteq \mathbb{R}^n$ is a closed convex set with nonempty interior, and $F$ is a $\vartheta$-self-concordant barrier for $S$, then $\vartheta \ge 1$ [NN94, Corollary 2.3.3].
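As a concrete illustration (a standard one-dimensional example, not specific to this paper), take $S = \mathbb{R}_+$ and $F(x) = -\log x$. Then $F'(x) = -1/x$, $F''(x) = 1/x^2$, and $F'''(x) = -2/x^3$, so

  $|F'''(x)[h,h,h]| = \frac{2|h|^3}{x^3} = 2\,(F''(x)[h,h])^{3/2}$,

and (2.3) holds with equality. Moreover $|F'(x)[h]| = |h|/x = (F''(x)[h,h])^{1/2}$, so $\lambda(F,x) = 1$ for every $x > 0$, and $F$ is a nondegenerate self-concordant barrier for $\mathbb{R}_+$ with $\vartheta = 1$.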
Let $F$ be a nondegenerate self-concordant barrier for $S \subseteq \mathbb{R}^n$, and let $x \in \mathrm{int}(S)$. Then $F''(x)$ and its inverse induce the following "local" norms on $\mathbb{R}^n$:

  $\|h\|_{x,F} = \langle h, F''(x)h\rangle^{1/2}, \qquad \|h\|^*_{x,F} = \langle h, F''(x)^{-1}h\rangle^{1/2}, \qquad h \in \mathbb{R}^n.$   (2.4)

These norms are dual to each other. The following result, which uses and builds upon [NN94, Theorem 2.1.1], characterizes the local behavior of the Hessian of a self-concordant barrier.

Lemma 2.1. Let $F$ be a nondegenerate self-concordant barrier for the closed convex set $S$ having nonempty interior.
(a) If $x, y \in \mathrm{int}(S)$ are such that $r := \|x - y\|_{x,F} < 1$, then for all $h$,

  $(1-r)\|h\|_{x,F} \le \|h\|_{y,F} \le \frac{1}{1-r}\|h\|_{x,F},$   (2.5)

  $(1-r)\|h\|^*_{x,F} \le \|h\|^*_{y,F} \le \frac{1}{1-r}\|h\|^*_{x,F}.$   (2.6)

(b) For every $x \in \mathrm{int}(S)$, $\|x - y\|_{x,F} < 1$ implies that $y \in \mathrm{int}(S)$.

Proof. The inequalities in (2.5) and the result in (b) are from [NN94, Theorem 2.1.1]. The inequalities in (2.6) follow from those in (2.5); see, e.g., [Sch06, Lemma 3.2.9].

Although Lemma 2.1 holds for self-concordant barrier functions on general convex sets, we will be most interested in barriers on full cones, so subsequent results will be phrased with this in mind. Given a full cone $K$ and a function $F : \mathrm{int}(K) \to \mathbb{R}$, the conjugate function $F_* : \mathrm{int}(K^*) \to \mathbb{R}$ of $F$ is given by

  $F_*(s) := \sup_{x \in \mathrm{int}(K)} \{-\langle x,s\rangle - F(x)\}, \qquad s \in \mathrm{int}(K^*).$   (2.7)

(Strictly speaking, in the classical definition of a conjugate function, the domain of $F_*$ is not restricted to $\mathrm{int}(K^*)$. We include such a restriction here so that $F_*$ is finite, which is the only case of interest to us. The definition used here is found in, e.g., [NT97, NT98, Ren01]; it is slightly different from that found elsewhere in the literature, including [NN94], the difference being the minus sign in front of the $\langle x,s\rangle$ term, which turns the domain of $F_*$ from $\mathrm{int}(-K^*)$ into $\mathrm{int}(K^*)$.)

Since $F_*(s)$ is the pointwise supremum of linear functions of $s$, $F_*$ is convex on $\mathrm{int}(K^*)$. It was shown in, e.g., [Ren01, Theorem 3.3.1], that if $F$ is a nondegenerate self-concordant barrier for $K$, then $F_*$ is a nondegenerate self-concordant barrier for $K^*$. In light of this, the next result follows directly from Lemma 2.1.

Lemma 2.2. Let $F$ be a nondegenerate self-concordant barrier for the full cone $K$.
(a) If $t, s \in \mathrm{int}(K^*)$ are such that $r := \|t - s\|_{t,F_*} < 1$, then for all $h$,

  $(1-r)\|h\|_{t,F_*} \le \|h\|_{s,F_*} \le \frac{1}{1-r}\|h\|_{t,F_*}.$

(b) For every $s \in \mathrm{int}(K^*)$, $\|s - t\|_{s,F_*} < 1$ implies that $t \in \mathrm{int}(K^*)$.

The following technical results will be used in the analysis of the interior-point method to be presented later.

Lemma 2.3. Let $F$ be a nondegenerate self-concordant barrier for the full cone $K$, and let $x \in \mathrm{int}(K)$. Then for all $h_1$ and $h_2$,

  $\|F'''(x)[h_1,h_2]\|^*_{x,F} \le 2\|h_1\|_{x,F}\|h_2\|_{x,F}.$

Proof. Since $F$ is a nondegenerate self-concordant barrier function, it follows from [NN94, Proposition 9.1.1] that

  $|F'''(x)[h_1,h_2,h_3]| \le 2\|h_1\|_{x,F}\|h_2\|_{x,F}\|h_3\|_{x,F}$

for all $x \in \mathrm{int}(K)$ and all $h_1, h_2, h_3$. (In fact, only property (2.3) in the definition of a self-concordant barrier function is used to prove this result.) Now by definition of a dual norm,

  $\|F'''(x)[h_1,h_2]\|^*_{x,F} = \max_{y:\|y\|_{x,F}=1} \langle y, F'''(x)[h_1,h_2]\rangle = \max_{y:\|y\|_{x,F}=1} F'''(x)[h_1,h_2,y] \le \max_{y:\|y\|_{x,F}=1} 2\|h_1\|_{x,F}\|h_2\|_{x,F}\|y\|_{x,F} = 2\|h_1\|_{x,F}\|h_2\|_{x,F}.$

Corollary 2.4. Let $F$ be a nondegenerate self-concordant barrier for the full cone $K$, let $x \in \mathrm{int}(K)$ and $h$ be such that $\beta := \|h\|_{x,F} < 1$, and let $\alpha \in [0,1]$. Then

  $\|F'''(x+\alpha h)[h,h]\|^*_{x+h,F} \le \frac{2\beta^2}{(1-\beta)(1-\alpha\beta)}.$
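To see these objects in the simplest case (again the standard one-dimensional example), for $F(x) = -\log x$ on $\mathbb{R}_+$ we have $\|h\|_{x,F} = |h|/x$ and $\|h\|^*_{x,F} = x|h|$. Lemma 2.1(b) then says that $|y-x|/x < 1$, i.e., $y \in (0,2x)$, implies $y \in \mathrm{int}(\mathbb{R}_+)$: the unit "Dikin ball" around $x$ is here the interval $(0,2x)$. Likewise, if $r = |y-x|/x < 1$, then $\|h\|_{y,F}/\|h\|_{x,F} = x/y \in [\frac{1}{1+r}, \frac{1}{1-r}]$, consistent with (2.5) since $1-r \le \frac{1}{1+r}$.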
Proof. Since $\alpha\beta < 1$, it follows from Lemma 2.1(b) that $x + \alpha h \in \mathrm{int}(K)$, and then from (2.5) that

  $\|h\|_{x+\alpha h,F} \le \frac{\beta}{1-\alpha\beta}.$   (2.8)

Moreover, $\|(1-\alpha)h\|_{x+\alpha h,F} \le (1-\alpha)\beta/(1-\alpha\beta) < 1$, so (2.6) followed by Lemma 2.3 and then (2.8) yields

  $\|F'''(x+\alpha h)[h,h]\|^*_{x+h,F} \le \frac{1}{1-\|(1-\alpha)h\|_{x+\alpha h,F}}\,\|F'''(x+\alpha h)[h,h]\|^*_{x+\alpha h,F} \le \frac{1}{1-(1-\alpha)\|h\|_{x+\alpha h,F}}\,2\|h\|^2_{x+\alpha h,F} \le \frac{1}{1-\frac{(1-\alpha)\beta}{1-\alpha\beta}}\,2\left(\frac{\beta}{1-\alpha\beta}\right)^2 = \frac{2\beta^2}{(1-\beta)(1-\alpha\beta)}.$

2.2 Logarithmically homogeneous barrier functions

An important class of barrier functions for convex cones is that of logarithmically homogeneous barriers. (The terminology reflects the fact that the exponential of such a function is homogeneous.) These barrier functions were first defined in [NN94, Definition 2.3.2] and occur widely in practice. Given a full cone $K$, a convex twice continuously differentiable function $F : \mathrm{int}(K) \to \mathbb{R}$ is said to be a $\vartheta$-logarithmically homogeneous barrier for $K$ if $\vartheta \ge 1$, $F$ satisfies the barrier property for $K$, and the logarithmic-homogeneity relation

  $F(tx) = F(x) - \vartheta\log(t) \qquad \forall x \in \mathrm{int}(K),\ t > 0$   (2.9)

holds. The following well-known properties of logarithmically homogeneous barriers are the same as or similar to those in [NN94, Proposition 2.3.4], so we omit proofs.

Lemma 2.5. Let $K$ be a full cone, and $F : \mathrm{int}(K) \to \mathbb{R}$ be a $\vartheta$-logarithmically homogeneous barrier for $K$. For all $x \in \mathrm{int}(K)$ and $t > 0$:
(a) $F'(tx) = \frac{1}{t}F'(x)$ and $F''(tx) = \frac{1}{t^2}F''(x)$;
(b) $\langle F'(x), x\rangle = -\vartheta$;
(c) $F''(x)x = -F'(x)$;
(d) $\langle F''(x)x, x\rangle = \vartheta$;
(e) If in addition $F''(x)$ is positive definite for all $x \in \mathrm{int}(K)$, then $\|F'(x)\|^*_{x,F} = \vartheta^{1/2}$.

It was noted in [NN94, Corollary 2.3.2] that if $F$ is a $\vartheta$-logarithmically homogeneous barrier for the full cone $K$, and $F$ is three times differentiable and satisfies (2.3), then $F$ is a $\vartheta$-self-concordant barrier for $K$. Such an $F$ is called a $\vartheta$-logarithmically homogeneous (nondegenerate) self-concordant barrier for $K$. The following result relates the norm induced by the Hessian of $F$ to that induced by the Hessian of $F_*$. Here and throughout, we write $F''_*(\cdot)$ as shorthand for $(F_*(\cdot))''$.

Lemma 2.6. For some $\vartheta$, let $F$ be a $\vartheta$-logarithmically homogeneous self-concordant barrier for the full cone $K$, and let $x \in \mathrm{int}(K)$. Then for all $\tau > 0$ and all $h$,

  $\|h\|_{-\tau F'(x),F_*} = \frac{1}{\tau}\|h\|^*_{x,F}.$

Proof. It follows from, e.g., [Ren01, Theorem 3.3.1], that $F_*$ is a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K^*$. Using Lemma 2.5(a), we obtain that for all $x \in \mathrm{int}(K)$ and $\tau > 0$,

  $F''_*(-\tau F'(x)) = \frac{1}{\tau^2}F''_*(-F'(x)) = \frac{1}{\tau^2}F''(x)^{-1},$

where the last equality follows from the relation $F''_*(-F'(x)) = F''(x)^{-1}$; see, e.g., [Ren01, Theorem 3.3.4]. (This relation holds for all $x \in \mathrm{int}(K)$, and uses the fact that $F'(x) \in -\mathrm{int}(K^*)$.) So for all $h$,

  $\|h\|_{-\tau F'(x),F_*} = \langle h, \tfrac{1}{\tau^2}F''(x)^{-1}h\rangle^{1/2} = \frac{1}{\tau}\langle h, F''(x)^{-1}h\rangle^{1/2} = \frac{1}{\tau}\|h\|^*_{x,F}.$
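As an illustration (standard, and used again below), the barrier $F(x) = -\sum_{i=1}^n \log x_i$ for $K = \mathbb{R}^n_+$ satisfies $F(tx) = F(x) - n\log t$, so it is $n$-logarithmically homogeneous. Writing $X = \mathrm{diag}(x)$ and $e = (1,\dots,1)^T$, we have $F'(x) = -X^{-1}e$ and $F''(x) = X^{-2}$, and the properties of Lemma 2.5 are immediate: $\langle F'(x),x\rangle = -n$, $F''(x)x = X^{-1}e = -F'(x)$, and $\langle F''(x)x,x\rangle = n$. The conjugate (2.7) can be computed in closed form: the supremum of $-\langle x,s\rangle + \sum_i \log x_i$ is attained at $x_i = 1/s_i$, giving $F_*(s) = -\sum_i \log s_i - n$, again an $n$-logarithmically homogeneous barrier, this time for $K^* = \mathbb{R}^n_+$.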
To our knowledge, the following key relation between the norms induced by the Hessians of $F$ and $F_*$ is new. It shows that in a certain region (which we will later identify as a neighborhood of the central path), the dual norm induced by the Hessian of $F$ is comparable to the norm induced by the Hessian of $F_*$.

Lemma 2.7. For some $\vartheta$, let $F$ be a $\vartheta$-logarithmically homogeneous self-concordant barrier for the full cone $K$. Let $\theta \in (0,1)$, and suppose that $x \in \mathrm{int}(K)$ and $s \in \mathrm{int}(K^*)$ satisfy

  $\|s + \mu F'(x)\|^*_{x,F} \le \theta\mu,$   (2.10)

where $\mu = \langle x,s\rangle/\vartheta$. Then for any $h$,

  $\frac{1-\theta}{\mu}\|h\|^*_{x,F} \le \|h\|_{s,F_*} \le \frac{1}{(1-\theta)\mu}\|h\|^*_{x,F}.$

Proof. As noted previously, $F'(x) \in -\mathrm{int}(K^*)$ for all $x \in \mathrm{int}(K)$. Moreover, $\mu > 0$. So $t := -\mu F'(x) \in \mathrm{int}(K^*)$. Since $x \in \mathrm{int}(K)$ and $\mu > 0$, we may invoke Lemma 2.6. Following this, we use (2.10) to obtain

  $r := \|t - s\|_{t,F_*} = \|\mu F'(x) + s\|_{-\mu F'(x),F_*} = \frac{1}{\mu}\|\mu F'(x) + s\|^*_{x,F} \le \theta < 1.$

Now applying Lemma 2.2(a) and using $r \le \theta$, we obtain for any $h$,

  $(1-\theta)\|h\|_{-\mu F'(x),F_*} \le \|h\|_{s,F_*} \le \frac{1}{1-\theta}\|h\|_{-\mu F'(x),F_*}.$   (2.11)

Applying Lemma 2.6 again gives the required result.

3 An inexact primal-dual interior-point method for conic optimization

In this section we define our problem, give an overview of the short-step primal-dual IPM, and then present an algorithm that does not require exact evaluation of the primal or dual barrier functions or their derivatives.

3.1 Conic optimization problem

Any primal-dual pair of convex optimization problems can be written in the following conic form:

  $v_P = \inf_x\ \{\langle c,x\rangle : Ax = b,\ x \in K\},$   (3.1)
  $v_D = \sup_{w,s}\ \{\langle b,w\rangle : A^*w + s = c,\ s \in K^*\},$   (3.2)

where $A : \mathbb{R}^n \to \mathbb{R}^m$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$, and $K \subseteq \mathbb{R}^n$ is a closed convex cone. A point $x$ is said to be strongly feasible for (3.1) if $Ax = b$ and $x \in \mathrm{int}(K)$, and a pair $(w,s)$ is said to be strongly feasible for (3.2) if $A^*w + s = c$ and $s \in \mathrm{int}(K^*)$. In the case that $K$ is the nonnegative orthant, (3.1) and (3.2) are called linear optimization problems. If $K$ is the second-order cone $\{x : x_n \ge \|x_{1:n-1}\|_2\}$ (also known as the Lorentz cone or ice-cream cone), then (3.1) and (3.2) are called second-order cone optimization problems. If $K$ is the full cone of positive semidefinite matrices, considered as a subset of the space of symmetric matrices (the space of symmetric matrices of order $n$ is, of course, isomorphic to $\mathbb{R}^{n(n+1)/2}$), then (3.1) and (3.2) are called semidefinite programs.

Assumption 3.1.
(a) $m > 0$ and $A$ is onto;
(b) $K$ is a full cone;
(c) The optimization problems (3.1) and (3.2) each satisfy the generalized Slater constraint qualification; i.e., there exists a strongly feasible $x$ for (3.1) and a strongly feasible pair $(w,s)$ for (3.2).

It follows from Assumption 3.1(b) that $K^*$ is also a full cone. Assumption 3.1(c) can be made without loss of generality, since if it fails to hold, one can embed (3.1)–(3.2) in a higher-dimensional conic problem for which the assumption does hold; see, e.g., [NTY99]. Under Assumption 3.1, the optimal primal and dual solution sets are nonempty and bounded (see [Nes97, Theorem 1]), and the duality gap is zero, i.e., $v_P = v_D$; see, e.g., [Roc97, Corollary 30.5.2].

Consider the barrier problem associated with (3.1), viz.,

  $v_P(\mu) = \inf_x\ \{\langle c,x\rangle + \mu F(x) : Ax = b\},$   (3.3)

where $\mu > 0$ is called the barrier parameter, and the function $F : \mathrm{int}(K) \to \mathbb{R}$ satisfies the following assumption.

Assumption 3.2. The function $F$ is a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K$.

Such an $F$ is known to exist for every full cone $K$ [NN94, Theorem 2.5.1]. As mentioned previously, the conjugate function $F_*$ of such an $F$ is a $\vartheta$-logarithmically homogeneous self-concordant barrier for $K^*$, and hence is a suitable barrier for (3.2). The resulting dual barrier problem is

  $v_D(\mu) = \sup_{w,s}\ \{\langle b,w\rangle - \mu F_*(s) : A^*w + s = c\}.$   (3.4)

It was shown in [Nes97, Lemma 1] that under the generalized Slater constraint qualification, (3.3) and (3.4) have a unique minimizer for each $\mu > 0$.
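For orientation, consider the linear-optimization case $K = \mathbb{R}^n_+$ with $F(x) = -\sum_i \log x_i$ (a standard computation, not specific to this paper). The barrier problem (3.3) reads $\inf_x\{\langle c,x\rangle - \mu\sum_i \log x_i : Ax = b\}$, and its optimality condition is $c - \mu X^{-1}e = A^*w$ for some $w$. Setting $s := c - A^*w$ gives $s = \mu X^{-1}e$, i.e., $x_is_i = \mu$ for all $i$: the classical LP central path. This is exactly the condition $\mu F'(x) + s = 0$ appearing in the system (3.5) of the next subsection.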
3.2 Primal-dual interior-point methods for conic problems

Of special importance in primal-dual interior-point algorithms is the set of triples $(x,w,s)$ such that for some $\mu > 0$, $x$ is a minimizer of the primal barrier problem and $(w,s)$ is a minimizer of the dual barrier problem. This set is called the primal-dual central path. To exploit the fact that the primal-dual central path is a curve culminating at a point in the primal-dual optimal solution set, we use an iterative algorithm whose iterates stay close to the primal-dual central path while also converging to the primal-dual optimal solution set. It is well known that under Assumption 3.1 the primal-dual central path consists of the triples satisfying the following system:

  $Ax = b$
  $A^*w + s = c$
  $\mu F'(x) + s = 0$
  $x \in \mathrm{int}(K),\quad s \in \mathrm{int}(K^*).$   (3.5)

For any $(x,w,s)$ on the primal-dual central path, the pointwise duality gap is

  $\langle c,x\rangle - \langle b,w\rangle = \langle x,s\rangle = \langle x, -\mu F'(x)\rangle = -\mu\langle x, F'(x)\rangle = \mu\vartheta,$   (3.6)

where the last equality follows from Lemma 2.5(b). Therefore, to follow the primal-dual central path to the optimal primal-dual solution set, we decrease the positive duality measure

  $\mu = \frac{\langle x,s\rangle}{\vartheta}$   (3.7)

of the triple $(x,w,s)$ to zero, at which point, under Assumption 3.1, $x$ solves (3.1) and $(w,s)$ solves (3.2). Applying Newton's method to the system of equations in (3.5), we obtain the linear system

  $\begin{pmatrix} A & 0 & 0 \\ 0 & A^* & I \\ \mu F''(x) & 0 & I \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta w \\ \Delta s \end{pmatrix} = \begin{pmatrix} b - Ax \\ c - A^*w - s \\ -\sigma\mu F'(x) - s \end{pmatrix},$   (3.8)

with $\sigma = 1$. (Strictly speaking, $A$, $A^*$, and $F''(x)$, and hence the block matrix on the left-hand side of (3.8), are linear mappings rather than matrices, but the meaning of (3.8) is clear.)

It is well known that when $F$ is a logarithmically homogeneous barrier for $K$, the primal-dual central path is unchanged when the condition $\mu F'(x) + s = 0$ in (3.5) is replaced by $\mu F'_*(s) + x = 0$. Such a replacement gives rise to a "dual linearization" analogous to the "primal linearization" in (3.8). In general these two approaches yield different Newton directions. The advantage of using (3.8) is that evaluation of $F_*$ and its derivatives is not required. On the other hand, the direction obtained from (3.8) is biased in favor of solving the primal barrier problem.

Since we intend for the algorithm stated below to produce iterates staying close to the primal-dual central path while making progress towards the optimal primal-dual solution set, it is necessary to quantify what it means for a point to lie close to the central path. Given $\theta \in (0,1)$, define the $N(\theta)$ neighborhood of the primal-dual central path by

  $N(\theta) := \left\{ (x,w,s) : (x,w,s) \text{ is strongly feasible for (3.1)–(3.2), and } \|s + \mu F'(x)\|^*_{x,F} \le \theta\mu, \text{ where } \mu = \frac{\langle x,s\rangle}{\vartheta} \right\}.$   (3.9)

The neighborhood $N(\theta)$ defined in (3.9) was used in [NT98, Section 6] for optimization over the class of self-scaled cones. In the case that $K$ is the nonnegative orthant and $F(x) = -\sum_i \log(x_i)$ is the standard logarithmic barrier function, we have $\|s + \mu F'(x)\|^*_{x,F} = \|Xs - \mu e\|_2$, so $N(\theta)$ is the familiar $N_2(\theta)$ neighborhood used in linear optimization; see, e.g., [Wri97, p. 9]. Note that points in the set $N(\theta)$ satisfy all conditions in (3.5) with the possible exception of the system of equations $s + \mu F'(x) = 0$, whose residual is small (as measured in the $\|\cdot\|^*_{x,F}$ norm).
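As an illustration of (3.9) in the linear-optimization case, the following sketch (ours; it relies only on the reduction to $\|Xs - \mu e\|_2 \le \theta\mu$ just noted) tests the proximity condition:

```python
import numpy as np

def in_neighborhood(x, s, theta):
    """Proximity part of the N(theta) test (3.9) for K = R^n_+,
    F(x) = -sum(log x): it reduces to ||X s - mu e||_2 <= theta * mu.
    Strong feasibility (Ax = b, A*w + s = c) is assumed checked
    separately."""
    if np.any(x <= 0) or np.any(s <= 0):
        return False
    mu = (x @ s) / x.size            # duality measure (3.7); vartheta = n
    return np.linalg.norm(x * s - mu) <= theta * mu
```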
We will use a primal-dual short-step algorithm to solve (3.1) and (3.2). Short-step algorithms date back to an important paper of Renegar [Ren88], in which a polynomial-time primal algorithm for linear optimization was given. The name "short step" arises from the fact that this class of algorithms generates at each iteration Newton steps that are "short" enough to be feasible without the need for a line search. This is an advantage in conic optimization, since line searches may be expensive and difficult for many classes of cones $K$. The major downside is that such Newton steps are usually too conservative; in practice larger steps are often possible, leading to a faster reduction in the duality measure, and hence a smaller number of iterations.

A standard short-step primal-dual feasible-point algorithm uses (3.8) to compute a sequence of steps for a sequence of $\mu$ values converging to zero. The algorithm uses two parameters. One is $\theta \in (0,1)$, which stipulates the width of the neighborhood $N(\theta)$ inside which the iterates are constrained to lie; the other is a centering parameter $\sigma \in (0,1)$ (see, e.g., [Wri97, p. 8]). We terminate the iteration upon finding an $\epsilon$-optimal solution of (3.1)–(3.2), which is a feasible primal-dual point whose duality measure is no greater than $\epsilon$. We show in Section 4 that by choosing $\theta$ and $\sigma$ appropriately, it can be ensured that all iterates of the algorithm indeed stay in the neighborhood $N(\theta)$, even if $F'(x)$ and $F''(x)$ in (3.8) and (3.9) are not computed exactly, provided close enough approximations are used. Moreover, the duality measure decreases geometrically at each iteration, ensuring convergence to an optimal solution in a reasonable number of iterations.

3.3 An inexact interior-point method

The interior-point method we propose uses inexact barrier function evaluations; i.e., for a given $x \in \mathrm{int}(K)$, the quantities $F'(x)$ and $F''(x)$ in (3.8) are computed inexactly. Denote these estimates by $F_1(x)$ and $F_2(x)$ respectively. Our analysis is independent of the estimation algorithm and depends only on the magnitude of the approximation errors. The core of the algorithm is the following update procedure (cf. (3.8)).

Iteration Short step
Input: $\theta, \sigma \in (0,1)$ and a triple $(x,w,s) \in N(\theta)$.
(1) Let $\mu = \langle x,s\rangle/\vartheta$ and solve the linear system

  $\begin{pmatrix} A & 0 & 0 \\ 0 & A^* & I \\ \mu F_2(x) & 0 & I \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta w \\ \Delta s \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -\sigma\mu F_1(x) - s \end{pmatrix},$   (3.10)

where $F_2(x)$ is self-adjoint and positive definite.
(2) Set $x^+ = x + \Delta x$, $w^+ = w + \Delta w$, and $s^+ = s + \Delta s$.
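The following is a dense-matrix sketch of Iteration Short step for the linear-optimization case $K = \mathbb{R}^n_+$ (the function name, the dense assembly, and the oracle interface are ours; an efficient implementation would instead eliminate $\Delta x$ and $\Delta s$ and solve a smaller system):

```python
import numpy as np

def short_step_lp(A, x, w, s, sigma, F1, F2):
    """One pass of Iteration Short step for K = R^n_+ (illustrative).
    F1(x) approximates F'(x) = -1/x and F2(x) approximates
    F''(x) = diag(1/x**2); (x, w, s) is assumed strongly feasible."""
    m, n = A.shape
    mu = (x @ s) / n                     # vartheta = n for -sum(log x)
    # Block system (3.10), variables ordered (dx, dw, ds).
    K = np.zeros((m + 2 * n, m + 2 * n))
    K[:m, :n] = A                        # A dx = 0
    K[m:m + n, n:n + m] = A.T            # A* dw + ds = 0
    K[m:m + n, n + m:] = np.eye(n)
    K[m + n:, :n] = mu * F2(x)           # mu F2(x) dx + ds = rhs
    K[m + n:, n + m:] = np.eye(n)
    rhs = np.concatenate([np.zeros(m + n), -sigma * mu * F1(x) - s])
    d = np.linalg.solve(K, rhs)
    return x + d[:n], w + d[n:n + m], s + d[n + m:]

# Exact oracles would be F1 = lambda x: -1.0 / x and
# F2 = lambda x: np.diag(1.0 / x**2); inexact ones may deviate within
# the error bounds (4.1)-(4.2) introduced in Section 4.
```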
Remark 3.3. Iteration Short step requires the availability of an initial triple $(x,w,s)$ in $N(\theta)$. This requirement is inherited from other short-step methods. One method of finding such a point is to repeatedly perform Iteration Short step with $\sigma = 1$, but with $\mu$ kept constant at, say, its initial value. In other words, we seek to move directly towards a point on the central path. However, to our knowledge, even given a strictly feasible initial triple and using exact computation of the gradient and Hessian of the barrier function, no algorithm has been proposed that reaches an appropriate neighborhood of the central path within a polynomial iteration bound independent of the nearness of the initial point to the boundary of the cone. Accordingly, this question requires further study beyond the scope of this paper.

A noteworthy feature of Iteration Short step is that its implementation does not require evaluating, or even estimating, numerical values of the dual barrier function $F_*$ or its derivatives. For suitable values of $\theta$ and $\sigma$, and with $F_1(x)$ and $F_2(x)$ taken to be $F'(x)$ and $F''(x)$ respectively, it is known (at least in the SDP case) to produce an algorithm with polynomial iteration complexity; see, e.g., [BTN01, Section 6.5]. As we show below, such a property is preserved when one uses inexact evaluations of the derivatives of $F$, which can yield a substantial computational advantage.

We now study the effect of errors in our estimates of $F'(x)$ and $F''(x)$ on the convergence of the interior-point method.

4 Convergence analysis of the inexact algorithm

In this section we prove that the inexact algorithm produces a strongly feasible set of iterates and converges in a polynomial number of iterations.

4.1 Feasibility and convergence of the iterates

Denote the errors in the known estimates of the gradient and Hessian of $F(x)$ by

  $E_1(x) = F_1(x) - F'(x), \qquad E_2(x) = F_2(x) - F''(x).$

By definition $E_2(x)$ is a self-adjoint linear mapping. Throughout this section, we will assume that the errors $E_1(x)$ and $E_2(x)$ are "small enough". Specifically, for some $\epsilon_1$ and $\epsilon_2$,

  $\frac{\|E_1(x)\|^*_{x,F}}{\|F'(x)\|^*_{x,F}} = \frac{\|E_1(x)\|^*_{x,F}}{\vartheta^{1/2}} \le \epsilon_1 < 1 \qquad \forall x \in \mathrm{int}(K),$   (4.1)

  $\|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\| \le \epsilon_2 < 1 \qquad \forall x \in \mathrm{int}(K).$   (4.2)

The quantity $\|E_1(x)\|^*_{x,F}/\|F'(x)\|^*_{x,F}$ is the relative error in $F_1(x)$, measured in the $\|\cdot\|^*_{x,F}$ norm, and $\|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\|$ is the absolute error in $F_2(x)$ relative to the Hessian of $F$. It is also an upper bound on the relative error in $F_2(x)$, measured in the norm $\|\cdot\|$:

  $\frac{\|E_2(x)\|}{\|F''(x)\|} = \frac{\|F''(x)^{1/2}\,F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\,F''(x)^{1/2}\|}{\|F''(x)\|} \le \frac{\|F''(x)^{1/2}\|\,\|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\|\,\|F''(x)^{1/2}\|}{\|F''(x)\|} = \frac{\|F''(x)^{1/2}\|^2}{\|F''(x)\|}\,\|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\| = \|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\|.$

The last equality follows from the relation $\|B^2\| = \lambda_{\max}(B^2) = \lambda_{\max}(B)^2 = \|B\|^2$ for a positive semidefinite self-adjoint operator $B$, where $\lambda_{\max}(\cdot)$ denotes the largest eigenvalue.

We now show that under the assumption in (4.2), the eigenvalues of $F_2(x)$ are close to those of $F''(x)$. We will use $\preceq$ to denote the partial ordering with respect to the cone of positive semidefinite matrices; viz., for symmetric matrices $M_1$ and $M_2$, $M_1 \preceq M_2$ if and only if $M_2 - M_1$ is positive semidefinite.

Lemma 4.1. Suppose that $F''$ and its estimate $F_2$ satisfy (4.2). Then

  $(1-\epsilon_2)F''(x) \preceq F_2(x) \preceq (1+\epsilon_2)F''(x) \qquad \forall x \in \mathrm{int}(K).$

Moreover, $F_2(x)$ is positive definite for all $x \in \mathrm{int}(K)$.

Proof. The assumption in (4.2) and the definition of a norm induced by an inner product imply that for all $x \in \mathrm{int}(K)$, each eigenvalue of the self-adjoint operator $F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}$ is bounded in magnitude by $\epsilon_2$, so

  $-\epsilon_2 I \preceq F''(x)^{-1/2}E_2(x)F''(x)^{-1/2} \preceq \epsilon_2 I.$

Multiplying this matrix inequality on the left and right by the positive definite matrix $F''(x)^{1/2}$ preserves the partial ordering $\preceq$:

  $-\epsilon_2 F''(x) \preceq E_2(x) \preceq \epsilon_2 F''(x).$

Adding $F''(x)$ to each quantity in this matrix inequality gives the required result. Since $F''(x)$ is positive definite and $\epsilon_2 < 1$, it follows that $F_2(x)$ is positive definite as well.

Since $F_2(x)$ is positive definite for all $x \in \mathrm{int}(K)$, it follows that $F_2(x)^{-1}$ is well defined. Due to this and $A$ being onto (Assumption 3.1), (3.10) has a unique solution.
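In the LP case these error measures take a simple computable form, since $\|v\|^*_{x,F} = \|Xv\|_2$, $F''(x)^{-1/2} = X = \mathrm{diag}(x)$, and $\vartheta = n$; the following sketch (ours) evaluates the left-hand sides of (4.1) and (4.2):

```python
import numpy as np

def gradient_rel_error(x, F1_val):
    """Left-hand side of (4.1): ||X E1||_2 / sqrt(n), with
    E1 = F1(x) - F'(x) and F'(x) = -1/x."""
    E1 = F1_val - (-1.0 / x)
    return np.linalg.norm(x * E1) / np.sqrt(x.size)

def hessian_rel_error(x, F2_val):
    """Left-hand side of (4.2): the spectral norm of X E2 X, with
    E2 = F2(x) - F''(x) and F''(x) = diag(1/x**2)."""
    E2 = F2_val - np.diag(1.0 / x**2)
    X = np.diag(x)
    return np.linalg.norm(X @ E2 @ X, 2)
```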
The outline for the remainder of this section is as follows. After giving some technical results, we prove that for a primal-dual iterate $(x,w,s)$ lying inside a neighborhood $N(\theta)$ of the central path, the Newton steps $\Delta x$ and $\Delta s$ in (3.10) are bounded by a constant and a constant times the duality measure $\mu$, respectively, in the appropriate norms (Lemma 4.3 and Corollary 4.4). We then show that if these constants are appropriately bounded, then a full Newton step produces a strongly feasible primal-dual point (Lemma 4.5). Next we derive lower and upper bounds on the rate of decrease of the duality measure at each iteration (Lemma 4.6). The lower bound is used to show that all iterates stay in the $N(\theta)$ neighborhood of the central path (Lemmas 4.7, 4.8, and 4.9). Thus Iteration Short step can be repeated ad infinitum, and the resulting algorithm belongs to the class of "path-following" methods. It also belongs to the class of feasible-point algorithms: all iterates are (strongly) feasible. Finally, using the upper bound on the rate of decrease of the duality measure, we show that the algorithm has polynomial iteration complexity.

Since $F''(x)$ and $F_2(x)$ are positive definite on $\mathrm{int}(K)$, the positive definite square roots of these two matrices are well defined. Define

  $D(x) = F''(x)^{1/2}F_2(x)^{-1/2} = F''(x)^{1/2}\left(F''(x) + E_2(x)\right)^{-1/2}$   (4.3)

for all $x \in \mathrm{int}(K)$. In this section we will use the following implication of the Cauchy–Schwarz inequality: given $x \in \mathrm{int}(K)$ and $y, z \in \mathbb{R}^n$, $|\langle y,z\rangle| \le \|y\|^*_{x,F}\,\|z\|_{x,F}$.

The following technical results will be used in our analysis.

Lemma 4.2. Suppose that $F_1$ and $F_2$ satisfy (4.1) and (4.2), and let $x \in \mathrm{int}(K)$ and $z \in \mathbb{R}^n$. Let $D(x)$ be defined as in (4.3). Then

  $\|D(x)\|^2 \le \frac{1}{1-\epsilon_2},$   (4.4a)
  $\|D(x)^{-1}\|^2 \le 1+\epsilon_2,$   (4.4b)
  $|\langle F'(x), z\rangle| \le \vartheta^{1/2}\|z\|_{x,F},$   (4.4c)
  $|\langle E_1(x), z\rangle| \le \epsilon_1\vartheta^{1/2}\|z\|_{x,F},$   (4.4d)
  $|\langle E_1(x), x\rangle| \le \epsilon_1\vartheta,$   (4.4e)
  $|\langle x, E_2(x)z\rangle| \le \epsilon_2\vartheta^{1/2}\|z\|_{x,F}.$   (4.4f)

Proof. First, since $\|D(x)\|$ is the square root of the largest eigenvalue of $D(x)D(x)^*$, we have

  $\|D(x)\|^2 = \|D(x)D(x)^*\| = \|F''(x)^{1/2}(F''(x) + E_2(x))^{-1}F''(x)^{1/2}\| = \|(I + F''(x)^{-1/2}E_2(x)F''(x)^{-1/2})^{-1}\| \le \frac{1}{1-\|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\|} \le \frac{1}{1-\epsilon_2},$

where the inequalities follow from (4.2). This proves (4.4a). Next, the invertible mapping $D$ satisfies

  $\|D(x)^{-1}\|^2 = \|D(x)^{-*}D(x)^{-1}\| = \|F''(x)^{-1/2}(F''(x) + E_2(x))F''(x)^{-1/2}\| = \|I + F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\| \le 1 + \|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\| \le 1 + \epsilon_2,$

where the inequalities again follow from (4.2), proving (4.4b). Further,

  $|\langle F'(x), z\rangle| \le \|F'(x)\|^*_{x,F}\,\|z\|_{x,F} = \vartheta^{1/2}\|z\|_{x,F},$

where we have used Lemma 2.5(e), proving (4.4c). Now, using (4.1), we have

  $|\langle E_1(x), z\rangle| \le \|E_1(x)\|^*_{x,F}\,\|z\|_{x,F} \le \epsilon_1\vartheta^{1/2}\|z\|_{x,F},$

proving (4.4d). The inequality in (4.4e) follows from (4.4d) with $z = x$, where Lemma 2.5(d) has also been used (note $\|x\|_{x,F} = \langle F''(x)x, x\rangle^{1/2} = \vartheta^{1/2}$). Finally, using Lemma 2.5(d) and (4.2), we have

  $|\langle x, E_2(x)z\rangle| \le \|E_2(x)x\|^*_{x,F}\,\|z\|_{x,F} = \|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\,F''(x)^{1/2}x\|\,\|z\|_{x,F} \le \|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\|\,\|F''(x)^{1/2}x\|\,\|z\|_{x,F} \le \epsilon_2\vartheta^{1/2}\|z\|_{x,F},$

proving (4.4f).

Define the following three constants, which depend on the IPM parameters $\theta$ and $\sigma$, the maximum errors $\epsilon_1$ and $\epsilon_2$, and the complexity parameter $\vartheta \ge 1$:
  $\gamma_0 := \frac{\theta + (1-\sigma+\sigma\epsilon_1)\vartheta^{1/2}}{(1-\epsilon_2)^{1/2}}, \qquad \gamma_1 := \frac{\gamma_0}{(1-\epsilon_2)^{1/2}}, \qquad \gamma_2 := \max\left\{\gamma_1,\ \frac{\gamma_0(1+\epsilon_2)^{1/2}}{1-\theta}\right\}.$   (4.5)

Lemma 4.3. Suppose that $F_1$ and $F_2$ satisfy (4.1) and (4.2). Let $\theta, \sigma \in (0,1)$, and suppose that $(x,w,s) \in N(\theta)$ for some $w$, and that $x, s, \Delta x$, and $\Delta s$ satisfy (3.10) for some $\Delta w$. Then

  $\mu^2\|F_2(x)^{1/2}\Delta x\|^2 + \|F_2(x)^{-1/2}\Delta s\|^2 \le \mu^2\gamma_0^2.$

Proof. Premultiplying the third block equation in (3.10) by $(\mu F_2(x))^{-1/2}$ yields

  $(\mu F_2(x))^{1/2}\Delta x + (\mu F_2(x))^{-1/2}\Delta s = -(\mu F_2(x))^{-1/2}(\sigma\mu F_1(x) + s).$   (4.6)

It is seen from (3.10) that $\Delta x$ lies in the nullspace of $A$, while $\Delta s$ lies in the range of $A^*$. Therefore $\Delta x$ is orthogonal to $\Delta s$. Taking the square of the norm of each side of (4.6) and using this orthogonality, we obtain

  $\|(\mu F_2(x))^{1/2}\Delta x\|^2 + \|(\mu F_2(x))^{-1/2}\Delta s\|^2 = \|(\mu F_2(x))^{-1/2}(\sigma\mu F_1(x) + s)\|^2.$

Multiplying this equation by $\mu$ yields

  $\mu^2\|F_2(x)^{1/2}\Delta x\|^2 + \|F_2(x)^{-1/2}\Delta s\|^2 = \|F_2(x)^{-1/2}(\sigma\mu F_1(x) + s)\|^2.$   (4.7)

Let us now bound the right-hand side of (4.7). In the following, (4.8a) follows from (4.3), and (4.8b) follows from (4.4a). The inequality in (4.8c) follows from $(x,w,s) \in N(\theta)$, (4.1), and Lemma 2.5(e):

  $\|F_2(x)^{-1/2}(\sigma\mu F_1(x) + s)\| = \|D(x)^*F''(x)^{-1/2}(\sigma\mu F_1(x) + s)\|$   (4.8a)
  $\le \frac{1}{(1-\epsilon_2)^{1/2}}\,\|\sigma\mu F_1(x) + s\|^*_{x,F}$   (4.8b)
  $= \frac{1}{(1-\epsilon_2)^{1/2}}\,\|(s + \mu F'(x)) + \sigma\mu E_1(x) - \mu(1-\sigma)F'(x)\|^*_{x,F}$
  $\le \frac{1}{(1-\epsilon_2)^{1/2}}\left(\|s + \mu F'(x)\|^*_{x,F} + \sigma\mu\|E_1(x)\|^*_{x,F} + \mu(1-\sigma)\|F'(x)\|^*_{x,F}\right)$
  $\le \frac{1}{(1-\epsilon_2)^{1/2}}\left(\theta\mu + \sigma\mu\epsilon_1\vartheta^{1/2} + \mu(1-\sigma)\vartheta^{1/2}\right)$   (4.8c)
  $= \mu\gamma_0.$

Combining this with (4.7) yields the required result.

Corollary 4.4. Suppose that $F_1$ and $F_2$ satisfy (4.1) and (4.2). Let $\theta, \sigma \in (0,1)$, and suppose that $(x,w,s) \in N(\theta)$ for some $w$, and that $x, s, \Delta x$, and $\Delta s$ satisfy (3.10) for some $\Delta w$. Then

  $\|F_2(x)^{1/2}\Delta x\| \le \gamma_0, \qquad \|F_2(x)^{-1/2}\Delta s\| \le \mu\gamma_0, \qquad \|\Delta x\|_{x,F} \le \gamma_1, \qquad \|\Delta s\|^*_{x,F} \le (1+\epsilon_2)^{1/2}\mu\gamma_0.$

Proof. The first two bounds follow immediately from Lemma 4.3. Further, using (4.3) and (4.4a), we have

  $\|\Delta x\|_{x,F} = \|D(x)F_2(x)^{1/2}\Delta x\| \le \|D(x)\|\,\|F_2(x)^{1/2}\Delta x\| \le (1-\epsilon_2)^{-1/2}\gamma_0 = \gamma_1.$

This proves the third inequality. To prove the last, similarly use (4.3) and (4.4b) to obtain

  $\|\Delta s\|^*_{x,F} = \|D(x)^{-*}F_2(x)^{-1/2}\Delta s\| \le (1+\epsilon_2)^{1/2}\|F_2(x)^{-1/2}\Delta s\| \le (1+\epsilon_2)^{1/2}\mu\gamma_0.$

Iteration Short step takes a full primal-dual Newton step from a triple $(x,w,s) \in N(\theta)$. In the next series of lemmas we show, under a condition on the parameters $\theta, \sigma, \epsilon_1$, and $\epsilon_2$, not only that such a step is strongly feasible, justifying step (2) in Iteration Short step, but further, that the new iterate remains in the $N(\theta)$ neighborhood of the central path, so Iteration Short step can be repeated ad infinitum. We also characterize how the duality measure changes after taking a full Newton step. The following condition, in which the dependence of $\gamma_2$ on $\theta, \sigma, \epsilon_1, \epsilon_2$, and $\vartheta$ is emphasized, will play a key role:

  $\gamma_2(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta) < 1.$   (4.9)

Lemma 4.5. Suppose that $F_1$ and $F_2$ satisfy (4.1) and (4.2). Let $\theta, \sigma, \epsilon_1$, and $\epsilon_2$ be such that (4.9) holds, and let $(x,w,s) \in N(\theta)$. Then the point $(x^+, w^+, s^+)$ generated by Iteration Short step is strongly feasible.

Proof.
The equality constraints in (3.1) and (3.2) are satisfied by $(x^+, w^+, s^+)$, since they are satisfied by $(x,w,s)$ (by virtue of this triple lying in $N(\theta)$), and any step from $(x,w,s)$ in the direction $(\Delta x, \Delta w, \Delta s)$ will satisfy these constraints due to the first two block equations in (3.10). We now show that $x^+ \in \mathrm{int}(K)$ and $s^+ \in \mathrm{int}(K^*)$. Since $F$ is a nondegenerate self-concordant barrier (Assumption 3.2), Lemma 2.1(b) is applicable: if $\|\Delta x\|_{x,F} < 1$, then $x + \Delta x \in \mathrm{int}(K)$. By Corollary 4.4, a sufficient condition for $\|\Delta x\|_{x,F} < 1$ is $\gamma_1 < 1$, and this holds since $\gamma_2 < 1$. Similarly, in light of Lemma 2.2(b), if $\|\Delta s\|_{s,F_*} < 1$, then $s + \Delta s \in \mathrm{int}(K^*)$. It remains to show that $\|\Delta s\|_{s,F_*} < 1$ under the assumption (4.9). Since $(x,w,s) \in N(\theta)$, we may apply Lemma 2.7 with $h = \Delta s$ to obtain

  $\|\Delta s\|_{s,F_*} \le \frac{\|\Delta s\|^*_{x,F}}{(1-\theta)\mu} \le \frac{(1+\epsilon_2)^{1/2}\gamma_0}{1-\theta},$

where the last inequality follows from Corollary 4.4. By the definition of $\gamma_2$ in (4.5), we have $\|\Delta s\|_{s,F_*} \le \gamma_2 < 1$.

It follows from Lemma 4.5 that a full step in the Newton direction for (3.10) is strongly feasible. We now study the change in the duality measure after taking a full primal-dual Newton step.

Lemma 4.6. Suppose that $F_1$ and $F_2$ satisfy (4.1) and (4.2). Let $\theta, \sigma, \epsilon_1$, and $\epsilon_2$ be such that (4.9) holds, and let $(x,w,s) \in N(\theta)$. Then the duality measure $\mu^+$ of $(x^+, w^+, s^+)$ (see (3.7)) in Iteration Short step satisfies

  $\lambda\mu \le \mu^+ \le \bar\lambda\mu,$   (4.10)

where

  $\lambda = \sigma - \sigma\epsilon_1 - \frac{1-\sigma+\sigma\epsilon_1+\epsilon_2}{\vartheta^{1/2}}\gamma_1 - \frac{1-\epsilon_2}{\vartheta}\gamma_1^2,$   (4.11)

  $\bar\lambda = \sigma + \sigma\epsilon_1 + \frac{1-\sigma+\sigma\epsilon_1+\epsilon_2}{\vartheta^{1/2}}\tau - \frac{1-\epsilon_2}{\vartheta}\tau^2,$   (4.12)

with

  $\tau = \min\left\{\gamma_1,\ \frac{(1-\sigma+\sigma\epsilon_1+\epsilon_2)\vartheta^{1/2}}{2(1-\epsilon_2)}\right\}.$   (4.13)

Proof. Recalling that for the direction obtained from (3.10), $\Delta x$ is orthogonal to $\Delta s$, we have

  $\vartheta\mu^+ = \langle x^+, s^+\rangle = \langle x + \Delta x, s + \Delta s\rangle = \langle x, s + \Delta s\rangle + \langle \Delta x, s\rangle.$   (4.14)

From the third block equation in (3.10), we obtain

  $s + \Delta s = -\sigma\mu F_1(x) - \mu F_2(x)\Delta x,$   (4.15)

giving

  $\langle x, s + \Delta s\rangle = -\langle x, \sigma\mu F_1(x) + \mu F_2(x)\Delta x\rangle = -\sigma\mu\langle x, F'(x) + E_1(x)\rangle - \mu\langle x, (F''(x) + E_2(x))\Delta x\rangle = \sigma\mu\vartheta - \sigma\mu\langle x, E_1(x)\rangle + \mu\langle F'(x), \Delta x\rangle - \mu\langle x, E_2(x)\Delta x\rangle,$   (4.16)

where we have used Lemma 2.5(b),(c). Since $\langle \Delta x, \Delta s\rangle = 0$, it also follows from (4.15) that

  $\langle \Delta x, s\rangle = \langle \Delta x, -\mu F_2(x)\Delta x - \sigma\mu F_1(x)\rangle = -\mu\langle \Delta x, F_2(x)\Delta x\rangle - \sigma\mu\langle F'(x) + E_1(x), \Delta x\rangle.$   (4.17)

Combining (4.14), (4.16), and (4.17), we have

  $\vartheta\mu^+ = \sigma\mu\vartheta - \sigma\mu\langle x, E_1(x)\rangle + (1-\sigma)\mu\langle F'(x), \Delta x\rangle - \sigma\mu\langle E_1(x), \Delta x\rangle - \mu\langle x, E_2(x)\Delta x\rangle - \mu\langle \Delta x, F_2(x)\Delta x\rangle,$

i.e.,

  $\frac{\mu^+}{\mu} = \sigma - \frac{\sigma}{\vartheta}\langle x, E_1(x)\rangle + \frac{1-\sigma}{\vartheta}\langle F'(x), \Delta x\rangle - \frac{\sigma}{\vartheta}\langle E_1(x), \Delta x\rangle - \frac{1}{\vartheta}\langle x, E_2(x)\Delta x\rangle - \frac{1}{\vartheta}\langle \Delta x, F_2(x)\Delta x\rangle.$   (4.18)

To reduce clutter, let $t := \|\Delta x\|_{x,F}$. In view of Lemma 4.1, we have $\langle \Delta x, F_2(x)\Delta x\rangle \ge (1-\epsilon_2)t^2$. Using this inequality and appealing to Lemma 4.2, we obtain the following upper bound on $\mu^+/\mu$:

  $\frac{\mu^+}{\mu} \le \sigma + \frac{\sigma}{\vartheta}|\langle x, E_1(x)\rangle| + \frac{1-\sigma}{\vartheta}|\langle F'(x), \Delta x\rangle| + \frac{\sigma}{\vartheta}|\langle E_1(x), \Delta x\rangle| + \frac{1}{\vartheta}|\langle x, E_2(x)\Delta x\rangle| - \frac{1-\epsilon_2}{\vartheta}t^2 \le \sigma + \sigma\epsilon_1 + \frac{1-\sigma}{\vartheta^{1/2}}t + \frac{\sigma\epsilon_1}{\vartheta^{1/2}}t + \frac{\epsilon_2}{\vartheta^{1/2}}t - \frac{1-\epsilon_2}{\vartheta}t^2 =: u(t).$

It follows from Corollary 4.4 that $t \le \gamma_1$, so a uniform upper bound on $\mu^+/\mu$ is $\max\{u(t) : 0 \le t \le \gamma_1\}$. The unconstrained maximizer of $u$ is

  $t^* = \frac{(1-\sigma+\sigma\epsilon_1+\epsilon_2)\vartheta^{1/2}}{2(1-\epsilon_2)}.$

If this nonnegative point satisfies $t^* \le \gamma_1$, then the constrained maximum of $u(t)$ is $u(t^*)$; otherwise the maximum is $u(\gamma_1)$. Hence $\mu^+/\mu \le u(\min\{\gamma_1, t^*\})$, and the latter bound is $\bar\lambda$ as given in (4.12) and (4.13). This proves the second inequality in (4.10).

In view of Corollary 4.4, $\langle \Delta x, F_2(x)\Delta x\rangle \le \gamma_0^2 = (1-\epsilon_2)\gamma_1^2$, so from (4.18) we can also obtain a lower bound on $\mu^+/\mu$:
  $\frac{\mu^+}{\mu} \ge \sigma - \sigma\epsilon_1 - \frac{1-\sigma}{\vartheta^{1/2}}t - \frac{\sigma\epsilon_1}{\vartheta^{1/2}}t - \frac{\epsilon_2}{\vartheta^{1/2}}t - \frac{1-\epsilon_2}{\vartheta}\gamma_1^2 = \sigma - \sigma\epsilon_1 - \frac{1-\epsilon_2}{\vartheta}\gamma_1^2 - \left(\frac{1-\sigma+\sigma\epsilon_1+\epsilon_2}{\vartheta^{1/2}}\right)t \ge \sigma - \sigma\epsilon_1 - \frac{1-\epsilon_2}{\vartheta}\gamma_1^2 - \left(\frac{1-\sigma+\sigma\epsilon_1+\epsilon_2}{\vartheta^{1/2}}\right)\gamma_1 = \lambda,$

proving the first inequality in (4.10).

Lemma 4.7. The quantities $\lambda$ and $\bar\lambda$ defined in Lemma 4.6 satisfy $\lambda < \sigma < \bar\lambda$.

Proof. It is clear from (4.11) that $\lambda < \sigma$. The inequality $\sigma < \bar\lambda$ is easily seen after rewriting $\bar\lambda$ as

  $\bar\lambda = \sigma + \sigma\epsilon_1 + \frac{\tau(1-\epsilon_2)}{\vartheta}\left[\frac{(1-\sigma+\sigma\epsilon_1+\epsilon_2)\vartheta^{1/2}}{1-\epsilon_2} - \tau\right],$

and noting that the term in the square brackets must be positive, due to (4.13).

We will now show that condition (4.9), together with the following two conditions, where dependence on $\theta$, $\sigma$, $\epsilon_1$, $\epsilon_2$, and $\vartheta$ is again emphasized, ensures that the next iterate lies in $N(\theta)$:

  $\lambda(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta) > 0,$   (4.19)

  $\theta^+(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta)$ is well defined, and $\theta^+(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta) \le \theta,$   (4.20)

where

  $\theta^+ := \frac{\sigma\epsilon_1\vartheta^{1/2} + \epsilon_2\gamma_1}{\lambda(1-\gamma_1)} + \left(\frac{\sigma}{\lambda} - 1\right)\vartheta^{1/2} + \frac{(1-\sigma)\gamma_1}{\lambda(1-\gamma_1)} + \frac{2\sigma}{\lambda}\left(\log(1-\gamma_1) + \frac{\gamma_1}{1-\gamma_1}\right).$   (4.21)

The next lemma will prove useful in the sequel.

Lemma 4.8. If $\theta, \sigma, \epsilon_1, \epsilon_2$, and $\vartheta$ satisfy conditions (4.19) and (4.20), then $\gamma_1, \gamma_2 < 1$.

Proof. If $\theta^+$ is well defined, then clearly $\gamma_1 < 1$. Moreover, for all $\gamma_1 \in (0,1)$, $\log(1-\gamma_1) + \frac{\gamma_1}{1-\gamma_1} > 0$. It follows from Lemma 4.7 and (4.19) that $\sigma/\lambda > 1$. Hence

  $2\left(\log(1-\gamma_1) + \frac{\gamma_1}{1-\gamma_1}\right) < \frac{2\sigma}{\lambda}\left(\log(1-\gamma_1) + \frac{\gamma_1}{1-\gamma_1}\right) < \theta^+ \le \theta < \gamma_1,$   (4.22)

where the final inequality holds since, by (4.5), $\gamma_1 \ge \gamma_0 > \theta$. The inequality $2\log(1-\gamma_1) + \frac{2\gamma_1}{1-\gamma_1} < \gamma_1$ implies $\gamma_1 < 0.46$. Now from (4.5) we have

  $\gamma_2 = \max\left\{\gamma_1,\ \frac{\gamma_0(1+\epsilon_2)^{1/2}}{1-\theta}\right\} = \gamma_1\max\left\{1,\ \frac{(1-\epsilon_2^2)^{1/2}}{1-\theta}\right\} \le \frac{\gamma_1}{1-\theta} < \frac{\gamma_1}{1-\gamma_1} < \frac{0.46}{1-0.46} < 1.$

Lemma 4.9. Let $\theta, \sigma, \epsilon_1, \epsilon_2$, and $\vartheta$ satisfy conditions (4.19) and (4.20). If $(x,w,s) \in N(\theta)$, then Iteration Short step produces a triple $(x^+, w^+, s^+) \in N(\theta)$.

Proof. By Lemma 4.8, $\gamma_1, \gamma_2 < 1$, so (4.9) holds. Applying Lemma 4.5, we conclude that $(x^+, w^+, s^+)$ is strongly feasible. It remains to show that

  $\|s + \mu F'(x)\|^*_{x,F} \le \theta\mu \implies \|s^+ + \mu^+F'(x^+)\|^*_{x^+,F} \le \theta\mu^+.$   (4.23)

From the third block equation in the linear system (3.10), we have

  $s^+ + \mu^+F'(x^+) = -\sigma\mu F_1(x) - \mu F_2(x)\Delta x + \mu^+F'(x^+) = D_1 + D_2 + D_3,$   (4.24)

where

  $D_1 = -\sigma\mu E_1(x) - \mu E_2(x)\Delta x,$
  $D_2 = (\mu^+ - \sigma\mu)F'(x^+) + \mu(\sigma-1)F''(x)\Delta x,$
  $D_3 = \sigma\mu\left(F'(x^+) - F'(x) - F''(x)\Delta x\right).$

We now appropriately bound the norms of $D_1$, $D_2$, and $D_3$. First we bound $\|D_1\|^*_{x^+,F}$. Since $\|x^+ - x\|_{x,F} = \|\Delta x\|_{x,F} \le \gamma_1$ (Corollary 4.4) and $\gamma_1 < 1$, we can use (2.6) to bound $\|D_1\|^*_{x^+,F}$ in terms of $\|D_1\|^*_{x,F}$; the result is (4.25a). The inequality (4.25b) follows from the definition of $D_1$, (4.25c) follows from (4.1), and (4.25d) follows from (4.2) and Corollary 4.4:

  $\|D_1\|^*_{x^+,F} \le \frac{1}{1-\gamma_1}\|D_1\|^*_{x,F}$   (4.25a)
  $\le \frac{1}{1-\gamma_1}\left(\sigma\mu\|E_1(x)\|^*_{x,F} + \mu\|E_2(x)\Delta x\|^*_{x,F}\right)$   (4.25b)
  $\le \frac{1}{1-\gamma_1}\left(\sigma\mu\epsilon_1\vartheta^{1/2} + \mu\|F''(x)^{-1/2}E_2(x)F''(x)^{-1/2}\|\,\|F''(x)^{1/2}\Delta x\|\right)$   (4.25c)
  $\le \frac{1}{1-\gamma_1}\left(\sigma\mu\epsilon_1\vartheta^{1/2} + \mu\epsilon_2\gamma_1\right).$   (4.25d)

Now in Lemma 4.6 we established that $\lambda\mu \le \mu^+$, where $\lambda$ is given in (4.11). Moreover, by assumption, $\lambda > 0$. It follows that

  $\|D_1\|^*_{x^+,F} \le \frac{\sigma\epsilon_1\vartheta^{1/2} + \epsilon_2\gamma_1}{\lambda(1-\gamma_1)}\,\mu^+ =: d_1\mu^+.$   (4.26)
Next we bound $\|D_2\|^*_{x^+,F}$. First note that for $\lambda$ and $\bar\lambda$ given in (4.11) and (4.12),

  $\lambda + \bar\lambda = 2\sigma + (\tau - \gamma_1)\frac{1-\sigma+\sigma\epsilon_1+\epsilon_2}{\vartheta^{1/2}} - \frac{1-\epsilon_2}{\vartheta}(\tau^2 + \gamma_1^2) < 2\sigma,$

where the inequality follows from $\tau \le \gamma_1$; see (4.13). It follows that $\lambda\bar\lambda \le \left(\frac{\lambda+\bar\lambda}{2}\right)^2 < \sigma^2$. Using this, the fact that $\lambda > 0$, and the result $\sigma < \bar\lambda$ from Lemma 4.7, we have

  $\frac{\sigma}{\lambda} - 1 > \frac{\bar\lambda}{\sigma} - 1 = \frac{\bar\lambda - \sigma}{\sigma} > \frac{\bar\lambda - \sigma}{\bar\lambda} = 1 - \frac{\sigma}{\bar\lambda} > 0.$

Hence

  $\max_{\lambda\mu \le \mu^+ \le \bar\lambda\mu}\left|1 - \frac{\sigma\mu}{\mu^+}\right| = \max\left\{\frac{\sigma}{\lambda} - 1,\ 1 - \frac{\sigma}{\bar\lambda}\right\} = \frac{\sigma}{\lambda} - 1.$   (4.27)

In the following, (4.28a) follows from the definition of $D_2$, and (4.28b) follows from Lemma 2.5(e) and (2.6). Furthermore, (4.28c) follows from (4.27) and Corollary 4.4, noting that $\|F''(x)\Delta x\|^*_{x,F} = \|\Delta x\|_{x,F} \le \gamma_1$:

  $\|D_2\|^*_{x^+,F} \le |\mu^+ - \sigma\mu|\,\|F'(x^+)\|^*_{x^+,F} + \mu(1-\sigma)\|F''(x)\Delta x\|^*_{x^+,F}$   (4.28a)
  $\le \mu^+\left|1 - \frac{\sigma\mu}{\mu^+}\right|\vartheta^{1/2} + \mu(1-\sigma)\frac{\|F''(x)\Delta x\|^*_{x,F}}{1 - \|\Delta x\|_{x,F}}$   (4.28b)
  $\le \mu^+\left(\frac{\sigma}{\lambda} - 1\right)\vartheta^{1/2} + \frac{\mu^+}{\lambda}(1-\sigma)\frac{\gamma_1}{1-\gamma_1}$   (4.28c)
  $=: d_2\mu^+.$   (4.28d)

Finally, we bound $\|D_3\|^*_{x^+,F}$. In what follows, we will be working with integrals of vectors and matrices; all such integrals are meant componentwise. Applying the Fundamental Theorem of Calculus to $F'$, then to $F''$, yields

  $D_3 = \sigma\mu\left(\int_0^1 F''(x + t\Delta x)\,dt - F''(x)\right)\Delta x = \sigma\mu\int_0^1\!\!\int_0^t F'''(x + u\Delta x)[\Delta x, \Delta x]\,du\,dt,$

giving

  $\|D_3\|^*_{x^+,F} \le \sigma\mu\int_0^1\!\!\int_0^t \|F'''(x + u\Delta x)[\Delta x, \Delta x]\|^*_{x^+,F}\,du\,dt.$

Since $\gamma_1 < 1$, Corollary 2.4 can be applied, yielding

  $\|F'''(x + u\Delta x)[\Delta x, \Delta x]\|^*_{x^+,F} \le \frac{2\gamma_1^2}{(1-\gamma_1)(1-u\gamma_1)},$

and

  $\|D_3\|^*_{x^+,F} \le \sigma\mu\int_0^1\!\!\int_0^t \frac{2\gamma_1^2}{(1-\gamma_1)(1-u\gamma_1)}\,du\,dt = \sigma\mu\left(2\log(1-\gamma_1) + \frac{2\gamma_1}{1-\gamma_1}\right)$   (4.29)
  $\le \frac{2\sigma}{\lambda}\left(\log(1-\gamma_1) + \frac{\gamma_1}{1-\gamma_1}\right)\mu^+ =: d_3\mu^+.$   (4.30)

(Our bound of $2\log(1-\gamma_1) + \frac{2\gamma_1}{1-\gamma_1} = \gamma_1^2 + \frac{4}{3}\gamma_1^3 + \cdots$ on $\frac{1}{\sigma\mu}\|D_3\|^*_{x^+,F}$ in (4.29) can be improved in the case that $K$ is a self-scaled cone: in [NT97, Theorem 4.3] a bound of no greater than $\gamma_1^2$ is derived using special properties of self-scaled barriers and self-scaled cones.)

Combining the bounds in (4.26), (4.28d), and (4.30) with (4.24), we obtain

  $\|s^+ + \mu^+F'(x^+)\|^*_{x^+,F} \le \|D_1\|^*_{x^+,F} + \|D_2\|^*_{x^+,F} + \|D_3\|^*_{x^+,F} \le \mu^+\left(d_1(\vartheta) + d_2(\vartheta) + d_3(\vartheta)\right),$

where

  $d_1(\vartheta) := \frac{\sigma\epsilon_1\vartheta^{1/2} + \epsilon_2\gamma_1}{\lambda(1-\gamma_1)},$   (4.31a)
  $d_2(\vartheta) := \left(\frac{\sigma}{\lambda} - 1\right)\vartheta^{1/2} + \frac{(1-\sigma)\gamma_1}{\lambda(1-\gamma_1)},$   (4.31b)
  $d_3(\vartheta) := \frac{2\sigma}{\lambda}\left(\log(1-\gamma_1) + \frac{\gamma_1}{1-\gamma_1}\right).$   (4.31c)

From (4.21), we see that $\theta^+ = d_1(\vartheta) + d_2(\vartheta) + d_3(\vartheta)$. Since $\theta^+ \le \theta$ by (4.20), the proof is complete.

The significance of $\theta^+$ is that it specifies a neighborhood $N(\theta^+)$ of the central path such that $(x^+, w^+, s^+) \in N(\theta^+) \subseteq N(\theta)$. An important consequence of this is that Iteration Short step may be repeated. In what follows we will refer to "Algorithm Short step", which is simply an iterative algorithm in which at iteration $k$, Iteration Short step is invoked with $\mu_k$ as the duality measure of the current iterate $(x,w,s) = (x_k, w_k, s_k)$. The algorithm terminates when an $\epsilon$-optimal solution is obtained, i.e., a feasible solution with $\mu_k \le \epsilon$, for some prescribed $\epsilon$.
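In code, Algorithm Short step is just the iteration of the sketch from Section 3.3 (again for the LP case, reusing the hypothetical short_step_lp above):

```python
def algorithm_short_step(A, x, w, s, sigma, F1, F2, eps):
    """Sketch of Algorithm Short step: repeat Iteration Short step
    until the duality measure is at most eps.  The analysis above
    guarantees that each iterate stays in N(theta) and that mu shrinks
    geometrically, provided conditions (4.19)-(4.20) hold."""
    mu = (x @ s) / x.size
    while mu > eps:
        x, w, s = short_step_lp(A, x, w, s, sigma, F1, F2)
        mu = (x @ s) / x.size
    return x, w, s
```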
4.2 Polynomial complexity of an inexact IPM

For convenience, let us gather the conditions previously imposed on the algorithm parameters. The conditions (4.32a) and (4.32b) below are copied from (4.19) and (4.20). (Due to Lemma 4.8 it is unnecessary to include the condition (4.9).) We also impose a new condition (4.32c) that will ensure the duality measure decreases at a fast enough rate. Since we are interested in a complexity result that holds regardless of the complexity parameter $\vartheta$, we ask that these conditions be satisfied for all $\vartheta \ge 1$.

Algorithm parameters $\theta, \sigma \in (0,1)$ and $\epsilon_1, \epsilon_2 \in [0,1)$ are selected as functions of $\vartheta$ with the property that there exists a constant $\delta \in (0,1)$ independent of $\vartheta$ such that for every $\vartheta \ge 1$:

  $\lambda(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta) > 0,$   (4.32a)
  $\theta^+(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta)$ is well defined, and $\theta^+(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta) \le \theta,$   (4.32b)
  $\bar\lambda(\theta,\sigma,\epsilon_1,\epsilon_2,\vartheta) \le 1 - \frac{\delta}{\vartheta^{1/2}}.$   (4.32c)

We now give the main result regarding the worst-case complexity of Algorithm Short step.

Theorem 4.10. Let $F$ be a $\vartheta$-self-concordant barrier for the cone $K$ in (3.1). Let $\theta, \sigma \in (0,1)$ and $\epsilon_1, \epsilon_2 \in [0,1)$ vary with $\vartheta$ in such a way that for some constant $\delta > 0$ they satisfy conditions (4.32a)–(4.32c) for all $\vartheta \ge 1$. Suppose that the initial primal-dual iterate lies in $N(\theta)$, and that for all primal iterates $x_k$, $F_1(x_k)$ and $F_2(x_k)$ satisfy the inequalities in (4.1) and (4.2). Then for any $\epsilon > 0$, an $\epsilon$-optimal solution to (3.1)–(3.2) is obtained in a polynomial number (in $\vartheta$ and $\log(\mu_0/\epsilon)$) of iterations. Specifically, given $\epsilon > 0$, there exists a number $k_\epsilon = O(\vartheta^{1/2}\log(\mu_0/\epsilon))$ such that $k \ge k_\epsilon$ implies $\mu_k \le \epsilon$.

Proof. Since (4.32a) and (4.32b) hold, and the starting iterate lies in $N(\theta)$, it follows from Lemma 4.9 that all iterates generated by Iteration Short step lie in $N(\theta)$. In fact, for each $k$ they remain in $N(\theta)$ restricted to the set $S(\mu_k) := \{(x,w,s) : 0 \le \mu \le \mu_k\}$. Now from Lemma 4.6, the ratio of successive duality measures $\mu_{k+1}/\mu_k$ is bounded above by the constant $\bar\lambda$, which by (4.32c) is bounded away from 1. Hence $\mu_k \to 0^+$ and all limit points of the sequence $\{(x_k,w_k,s_k)\}$ lie in $N(\theta)$ restricted to the set $S(\lim_k \mu_k) = S(0)$, which is the primal-dual optimal solution set. Now the particular bound on $\bar\lambda$ in (4.32c) implies that $\mu_{k+1} \le (1 - \delta/\vartheta^{1/2})\mu_k$ for $k = 0, 1, \dots$. Our polynomial iteration bound is readily obtained from this inequality. Specifically, if $k \ge \lceil\delta^{-1}\vartheta^{1/2}\log(\mu_0/\epsilon)\rceil$, then $\mu_k \le \epsilon$.

Remark 4.11. A tradeoff exists between the distance from a given quadruple $(\theta,\sigma,\epsilon_1,\epsilon_2)$ to the boundary of the set of quadruples satisfying (4.32a)–(4.32b), and the largest possible $\delta$ in (4.32c). Given $\theta$ and $\sigma$, it is natural to want to choose $\epsilon_1$ and $\epsilon_2$ as large as possible. This reduces $\delta$, which, according to the proof of Theorem 4.10, increases the order constant in the complexity of the algorithm.

It remains to verify that Theorem 4.10 is not vacuous, i.e., that there exist some $\delta \in (0,1)$ and (possibly $\vartheta$-dependent) values $\theta, \sigma \in (0,1)$ and $\epsilon_1, \epsilon_2 \in [0,1)$ such that inequalities (4.32a)–(4.32c) hold for all $\vartheta \ge 1$. To that effect, let $\delta = 10^{-4}$, $\theta = 0.1$, $\sigma = 1 - 0.02\vartheta^{-1/2}$, $\epsilon_1 = 0.015\vartheta^{-1/2}$, and $\epsilon_2 = 0.035$. The slack in the three inequalities (4.32a)–(4.32c) is plotted in Figure 1, suggesting that the conditions do indeed hold. (That they hold for arbitrarily large $\vartheta$ can be formally verified with little effort.)

[Figure 1: Plot of the slacks in conditions (4.32a), (4.32b), and (4.32c), as functions of $\vartheta$, for $\delta = 10^{-4}$, $\theta = 0.1$, $\sigma = 1 - 0.02\vartheta^{-1/2}$, $\epsilon_1 = 0.015\vartheta^{-1/2}$, and $\epsilon_2 = 0.035$.]

Remark 4.12. The worst-case complexity bound obtained in Theorem 4.10, $O(\vartheta^{1/2}\log(\mu_0/\epsilon))$, matches the best worst-case bound currently available for convex optimization algorithms, even for linear optimization algorithms that use exact evaluations of the barrier gradient and Hessian.
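The verification just described is easy to reproduce; the following sketch (ours) evaluates the constants of (4.5), (4.11)–(4.13), and (4.21) and returns the slacks in (4.32a)–(4.32c):

```python
import numpy as np

def condition_slacks(theta, sigma, e1, e2, vt, delta):
    """Slacks in (4.32a)-(4.32c); all three should be positive."""
    g0 = (theta + (1 - sigma + sigma * e1) * np.sqrt(vt)) / np.sqrt(1 - e2)
    g1 = g0 / np.sqrt(1 - e2)                      # gamma_1 of (4.5)
    c = 1 - sigma + sigma * e1 + e2                # recurring coefficient
    lam = sigma - sigma * e1 - c * g1 / np.sqrt(vt) - (1 - e2) * g1**2 / vt
    tau = min(g1, c * np.sqrt(vt) / (2 * (1 - e2)))
    lam_bar = sigma + sigma * e1 + c * tau / np.sqrt(vt) - (1 - e2) * tau**2 / vt
    d1 = (sigma * e1 * np.sqrt(vt) + e2 * g1) / (lam * (1 - g1))
    d2 = (sigma / lam - 1) * np.sqrt(vt) + (1 - sigma) * g1 / (lam * (1 - g1))
    d3 = (2 * sigma / lam) * (np.log(1 - g1) + g1 / (1 - g1))
    return lam, theta - (d1 + d2 + d3), 1 - delta / np.sqrt(vt) - lam_bar

# For vartheta = 1: condition_slacks(0.1, 0.98, 0.015, 0.035, 1.0, 1e-4)
# returns three positive slacks, consistent with Figure 1.
```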
Corollary 4.13. Given a family $\{K^n\}$ of cones, where $K^n$ has dimension $n$, and associated barrier functions $F^n : \mathrm{int}(K^n) \to \mathbb{R}$, if the complexity parameter $\vartheta(n)$ of $F^n$ is polynomial in $n$, and if the gradient and Hessian estimates $F_1^n(\cdot)$ and $F_2^n(\cdot)$ can each be computed in a polynomial (in $n$) number of arithmetic operations, then Algorithm Short step, with (possibly $\vartheta$-dependent) values of $\theta$, $\sigma$, $\epsilon_1$, and $\epsilon_2$ chosen to satisfy inequalities (4.32a)–(4.32c), generates an $\epsilon$-optimal solution in a polynomial (in $n$) number of arithmetic operations.

As a special case, Theorem 4.10 provides sufficient conditions on the parameters $\theta$ and $\sigma$ for polynomiality of Algorithm Short step under exact evaluation of the gradient and Hessian of $F$. Letting $\delta = 10^{-4}$ again, it is readily verified that, when $\theta = 0.1$ is again selected, $\sigma$ can now be as small as $1 - 0.069\vartheta^{-1/2}$, and when the larger value $\sigma = 1 - 0.02\vartheta^{-1/2}$ is maintained, $\theta$ can be as large as $0.28$; i.e., the size of the neighborhood of the central path can be significantly increased.

5 Conclusion

A simple interior-point method for general conic optimization was proposed. It allows for inexact evaluation of the gradient and Hessian of the primal barrier function, and does not involve any evaluation of the dual barrier function or its derivatives. It was shown that as long as the relative evaluation errors remain within certain fixed, prescribed bounds, under standard assumptions, an $\epsilon$-optimal solution is obtained in a polynomial number of iterations.

References

[BCL99] Mary Ann Branch, Thomas F. Coleman, and Yuying Li, A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems, SIAM Journal on Scientific Computing 21 (1999), no. 1, 1–23.

[BGN00] R. H. Byrd, J. C. Gilbert, and J. Nocedal, A trust region method based on interior point techniques for nonlinear programming, Mathematical Programming 89 (2000), no. 1, 149–185.

[BPR99] J. F. Bonnans, C. Pola, and R. Rebaï, Perturbed path following predictor-corrector interior point algorithms, Optim. Methods Softw. 11/12 (1999), no. 1-4, 183–210. MR1777457 (2001d:90124)

[BTN01] A. Ben-Tal and A. Nemirovski, Lectures on modern convex optimization: analysis, algorithms, and engineering applications, MPS/SIAM Series on Optimization, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2001.

[BV04] S. Boyd and L. Vandenberghe, Convex optimization, Cambridge University Press, 2004.

[CS93] Tamra J. Carpenter and David F. Shanno, An interior point method for quadratic programs based on conjugate projected gradients, Computational Optimization and Applications 2 (1993), 5–28.

[FJM99] R. W. Freund, F. Jarre, and S. Mizuno, Convergence of a class of inexact interior-point algorithms for linear programs, Math. Oper. Res. 24 (1999), no. 1, 50–71. MR1679054 (2000a:90033)

[GM88] D. Goldfarb and S. Mehrotra, A relaxed version of Karmarkar's method, Mathematical Programming 40 (1988), no. 3, 289–315.

[GMS+86] Philip E. Gill, Walter Murray, Michael A. Saunders, J. A. Tomlin, and Margaret H. Wright, On projected Newton barrier methods for linear programming and an equivalence to Karmarkar's projective method, Mathematical Programming 36 (1986), 183–209.
[Kor00] J. Korzak, Convergence analysis of inexact infeasible-interior-point algorithms for solving linear programming problems, SIAM J. Optim. 11 (2000), no. 1, 133–148. MR1785643 (2002f:90161)

[KR91] N. K. Karmarkar and K. G. Ramakrishnan, Computational results of an interior point algorithm for large scale linear programming, Mathematical Programming 52 (1991), 555–586.

[LMO06] Z. Lu, R. D. C. Monteiro, and J. W. O'Neal, An iterative solver-based infeasible primal-dual path-following algorithm for convex quadratic programming, SIAM J. on Optimization 17 (2006), no. 1, 287–310.

[Meh92] Sanjay Mehrotra, Implementation of affine scaling methods: Approximate solutions of systems of linear equations using preconditioned conjugate gradient methods, ORSA Journal on Computing 4 (1992), no. 2, 103–118.

[MJ99] S. Mizuno and F. Jarre, Global and polynomial-time convergence of an infeasible-interior-point algorithm using inexact computation, Math. Program. 84 (1999), no. 1, Ser. A, 105–122. MR1687272 (2000b:90026)

[Nes97] Yu. Nesterov, Long-step strategies in interior-point primal-dual methods, Math. Programming 76 (1997), no. 1, Ser. B, 47–94.

[Nes04] Yu. Nesterov, Introductory lectures on convex optimization: a basic course, Kluwer Academic Publishers, Boston/Dordrecht/London, 2004.

[Nes06] Yu. Nesterov, Towards nonsymmetric conic optimization, Tech. Report: CORE Discussion Paper 2006/28, Center for Operations Research and Econometrics, Catholic University of Louvain, March 2006.

[NFK98] K. Nakata, K. Fujisawa, and M. Kojima, Using the conjugate gradient method in interior-point methods for semidefinite programs, Proceedings of the Institute of Statistical Mathematics 46 (1998), 297–316.

[NN94] Yu. Nesterov and A. Nemirovskii, Interior-point polynomial algorithms in convex programming, SIAM Studies in Applied Mathematics, vol. 13, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1994. MR 94m:90005

[NT97] Yu. E. Nesterov and M. J. Todd, Self-scaled barriers and interior-point methods for convex programming, Math. Oper. Res. 22 (1997), no. 1, 1–42. MR 98c:90093

[NT98] Yu. E. Nesterov and M. J. Todd, Primal-dual interior-point methods for self-scaled cones, SIAM J. Optim. 8 (1998), no. 2, 324–364. MR 99f:90056

[NTY99] Yu. Nesterov, M. J. Todd, and Y. Ye, Infeasible-start primal-dual methods and infeasibility detectors for nonlinear programming problems, Math. Program. 84 (1999), no. 2, Ser. A, 227–267. MR 2000d:90055

[Ren88] J. Renegar, A polynomial-time algorithm, based on Newton's method, for linear programming, Mathematical Programming 40 (1988), 59–93.

[Ren01] J. Renegar, A mathematical view of interior-point methods in convex optimization, MPS/SIAM Series on Optimization, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2001. MR 2002g:90002

[Roc97] R. T. Rockafellar, Convex analysis, Princeton Landmarks in Mathematics, Princeton University Press, Princeton, NJ, 1997. Reprint of the 1970 original. MR 97m:49001

[Sch06] S. P. Schurr, An inexact interior-point algorithm for conic convex optimization problems, Ph.D. thesis, University of Maryland, College Park, MD 20742, USA, 2006.

[TTT06] K.-C. Toh, R. H. Tütüncü, and M. J. Todd, Inexact primal-dual path-following algorithms for a special class of convex quadratic SDP and related problems, Tech. report, Department of Mathematics, National University of Singapore, May 2006. Submitted to the Pacific Journal of Optimization.
[Tun01] L. Tunçel, Generalization of primal-dual interior-point methods to convex optimization problems in conic form, Foundations of Computational Mathematics 1 (2001), no. 3, 229–254.

[WO00] W. Wang and D. P. O'Leary, Adaptive use of iterative methods in predictor-corrector interior point methods for linear programming, Numer. Algorithms 25 (2000), no. 1-4, 387–406. MR1827167 (2002a:90099)

[Wri97] S. J. Wright, Primal-dual interior-point methods, SIAM Publications, SIAM, Philadelphia, PA, USA, 1997.

[ZT04] G. Zhou and K.-C. Toh, Polynomiality of an inexact infeasible interior point algorithm for semidefinite programming, Math. Program. 99 (2004), no. 2, Ser. A, 261–282. MR2039040 (2005b:90090)