University of Maryland College Park
Institute for Advanced Computer Studies TR{99{24
Department of Computer Science TR{4015
An Analysis of the Rayleigh{Ritz Method for
Approximating Eigenspaces

Zhongxiao Jia
y
G. W. Stewart
z
May 1999
ABSTRACT
This paper concerns the Rayleigh{Ritz method forcomputing an approxima-
tion to an eigenspace X of a general matrix A from a subspace W that con-
tains an approximation to X. The method produces a pair (N;
~
X) that pur-
ports to approximate a pair (L;X), where X is a basis forX and AX = XL.
In this paper we consider the convergence of (N;
~
X) as the sine  of the angle
between X and W approaches zero. It is shown that under a natural hy-
pothesis|called the uniform separation condition|the Ritz pairs (N;
~
X)
converge to the eigenpair (L;X). When one is concerned with eigenvalues
and eigenvectors, one can compute certain rened Ritz vectors whose con-
vergence is guaranteed, even when the uniform separation condition is not
satised. An attractive feature of the analysis is that it does not assume
that A has distinct eigenvalues or is diagonalizable.

This report is available by anonymous ftp from thales.cs.umd.edu in the directory pub/reports
or on the web at http://www.cs.umd.edu/stewart/.
y
Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, P.R. China,
(zxjia@dlut.edu.cn). Work supported by the China State Major Key Project for Basic Researches,
the National Natural Science Foundation of China, the Foundation for Excellent Young Scholars of the
Ministry of Education and the Doctoral Point Program of the Ministry of Education, China.
z
Department of Computer Science and Institute for Advanced Computer Studies, University of Mary-
land, College Park, MD 20742, USA (stewart@cs.umd.edu). Work supported by the National Science
Foundation under Grant No. 970909-8562
An Analysis of the Rayleigh{Ritz Method for
Approximating Eigenspaces
Zhongxiao Jia

G. W. Stewart
y
Abstract
This paper concerns the Rayleigh{Ritz method for computing an approxima-
tion to an eigenspace X of a general matrix A from a subspace W that contains
an approximation to X. The method produces a pair (N;
~
X) that purports to ap-
proximate a pair (L;X), where X is a basis for X and AX = XL. In this paper
we consider the convergence of (N;
~
X) as the sine  of the angle between X and
W approaches zero. It is shown that under a natural hypothesis|called the uni-
form separation condition|the Ritz pairs (N;
~
X) converge to the eigenpair (L;X).
When one is concerned with eigenvalues and eigenvectors, one can compute certain
rened Ritz vectors whose convergence is guaranteed, even when the uniform sepa-
ration condition is not satised. An attractive feature of the analysis is that it does
not assume that A has distinct eigenvalues or is diagonalizable.
1. Introduction
Many methods for nding eigenvalues and eigenvectors of a large matrix A proceed by
generating a sequence of subspaces W
k
containing increasingly accurate approximations
to the desired eigenvectors. There are a number of methods for accomplishing this|
e.g., the Arnoldi method, the nonsymmetric Lanczos method, subspace iteration, and
the Jacobi{Davidson method (for more on these methods see [11, 12]).
A central problem in all these methods is how to extract approximations to the
desired eigenvalues and eigenvectors from the subspace W
k
. A widely used technique
for accomplishing this is called the Rayleigh{Ritz procedure (it is also an example of

Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, P.R. China,
(zxjia@dlut.edu.cn). Work supported by the China State Major Key Project for Basic Researches,
the National Natural Science Foundation of China, the Foundation for Excellent Young Scholars of the
Ministry of Education and the Doctoral Point Program of the Ministry of Education, China.
y
Department of Computer Science and Institute for Advanced Computer Studies, University of Mary-
land, College Park, MD 20742, USA (stewart@cs.umd.edu). Work supported by the National Science
Foundation under Grant No. 970909-8562
1
2 Analysis of the Rayleigh{Ritz Method
the more general Galerkin technique). In its simplest form the technique retrieves
an approximation to a simple eigenpair (;x) as follows (here we drop the iteration
subscript).
1: Compute an orthonormal basis W for W.
2: Compute B = W
H
AW.
3: Let (;z) be an eigenpair of B, where 

=
.
4: Take (; ~x) = (;Wz) as the approximate eigenpair.
(1.1)
Of course, steps 3 and 4 can be repeated to extract approximations to other eigenpairs.
The matrix B is called a Rayleigh quotient. The number  is called a Ritz value and
the vector ~x = Wz is called a Ritz vector. The informal justication for the method is
that if x 2 W then there is an eigenpair (;z) of B with x = Wz. Continuity suggests
that if x is nearly in W then there should be an eigenpair (;z) of B with  near  and
Wz near x.
When A is non-Hermitian, one of the authors (Jia [5, 6, 7, 8]) has established a priori
error bounds for Ritz values and Ritz vectors in terms of the deviation of x from W.
The results show that Ritz values converge. The Ritz vectors, on the other hand, behave
more erratically and may even fail to converge. This led the rst author to introduce
certain rened Ritz vectors for which the continuity argument is valid [4, 6, 8]. The
rened Ritz vectors have been used in some other cases [9, 10]. Unfortunately, the
results just cited were proved under the restrictive hypothesis that the eigenvalues of A
are distinct or A is diagonalizable. One of the contributions of this paper is to remove
this restriction.
When A has a cluster of very close or multiple eigenvalues, the corresponding eigen-
vectors are ill-determined, and it makes better sense to approximate the eigenspace X
spanned by the vectors instead of the vectors themselves. The Rayleigh{Ritz procedure
can be extended to do this [see (3.1) below]. A second contribution of this paper is
to provide an analysis of this extended procedure. In fact, essentially the same results
hold for eigenspaces as for eigenvectors, and we will present our results at this level of
generality.
Our approach is to derive bounds in terms of the sine  of the angle between X
and the subspace W. (In practice, of course, the size of  must be established from the
properties of the underlying algorithm thatdetermines W.) Sections 2 and 3 aredevoted
to background material and setting the stage for our analysis. In x4 we will consider
the convergence of the Ritz values of the Rayleigh quotient. In x5 we will establish the
convergence of Ritz pairs under a hypothesis that is computationally veriable. In x6
we treat the relation of residual vectors to the accuracy of an approximate eigenspace,
and in x7, we consider the convergence of the rened Ritz vectors mentioned above. The
paper concludes with a brief summary.
Analysis of the Rayleigh{Ritz Method 3
2. Preliminaries
Background material for this paper can be found in [2, 13]. The norm kk will denote
both the Euclidean vector norm and the subordinate spectral matrix norm. We will
denote the spectrum of a matrix A by the multiset (A).
A subspace X is an eigenspace (or invariant subspace) of A if AX  X. If X is a
basis for X, then there is a unique matrix L such that
AX = XL; (2.1)
and conversely if X haslinearly independent columns and satises (2.1),then the column
space of X is an eigenspace of A. In this case, we say that (L;X) is an eigenpair of A,
that the matrix X is an eigenbasis of A, and that L is its corresponding eigenblock. The
eigenpair is simple if its eigenvalues are distinct from the other eigenvalues of A. It is
orthonormal if X
H
X = I, in which case its eigenblock is given by the Rayleigh quotient
L = X
H
AX. Throughout this paper we will assume that eigenpairs are orthonormal.
We will measure the deviation of a subspace X from a subspace W as follows. Let X
be an orthonormal basis for X, W an orthonormal basis for W, and W
?
an orthonormal
basis for the orthogonal complement of W. Then we dene
sin
6
(X;W) = sin(X;W) = kW
H
?
Xk: (2.2)
This measure is a metric on any space of subspaces of xed dimension. If the dimension
of W is greater than that of X, the measure is not symmetric in its arguments; in fact,
sin
6
(W;X) = 1, although it may happen that sin
6
(X;W) < 1.
We will cast our results in terms of a function sep that in some sense measures the
distance between the spectra of two matrices. Specically, let L and M be matrices of
order ` and m, and dene
sep(L;M) = min
kPk=1
kPLBnZrMPk: (2.3)
Alternatively, let S be the Sylvester operator dened by
SP = PLBnZrMP:
Then S is a linear operator whose eigenvalues are
(S) = (L)BnZr(M);
i.e., its eigenvalues are the pairwise dierences of the eigenvalues of L and M. If
  minj(S)j> 0
4 Analysis of the Rayleigh{Ritz Method
then S is nonsingular and it follows from (2.3) that sep
BnZr1
(L;M) = kS
BnZr1
k  . Hence
sep(L;M) = kS
BnZr1
k
BnZr1
 ;
i.e., sep(L;M) is not greater than the physical separation  of the spectra of L and M.
Unfortunately, sep(L;M) can be much smaller than .
The function sep has an important advantage over |it is Lipschitz continuous.
Specically
sep(L;M)BnZrkEkBnZrkFk  sep(L+ E;M + F)  sep(L;M)+kEk+kFk: (2.4)
Finally, we note that because kk is unitarily invariant if U and V are unitary then
sep(U
H
LU;V
H
MV) = sep(L;M): (2.5)
We conclude this section with a theorem of Elsner, which will be used to establish
the convergence of Ritz values.
Theorem 2.1. Let the eigenvalues of A be 
1
;::: ;
n
and let the eigenvalues of
~
A =
A + E be
~

1
;::: ;
~

n
. Then there is a permutation j
1
;::: ;j
n
of the integers 1;::: ;n
such that
j
i
BnZr
~

j
i
j  n(kAk+k
~
Ak)
1BnZr
1
n
kEk
1
n
; i = 1;::: ;n:
3. The setting
As we indicated in the introduction, we are concerned with the approximation of a
simple eigenpair (L;X) by the Rayleigh{Ritz method applied to a subspace W. We will
consider the following generalization of the method (1.1) in the introduction.
1: Compute an orthonormal basis W for W.
2: Compute B = W
H
AW.
3: Let (N;Z) be an eigenpair of B, where (N)

=
(L).
4: Take (N;
~
X) = (N;WZ) as the approximate eigenpair.
(3.1)
For deniteness we will suppose that the dimensions of X and W are ` and p, so
that the eigenblock L is of order ` and the Rayleigh quotient B is of order p. Denoting
the column space of X by X, we will set
 = sin
6
(X;W) (3.2)
and examine the behavior of the method as  ! 0.
Analysis of the Rayleigh{Ritz Method 5
In the sequel the representation of X in the coordinate system specied by W will
play a central role. As above, let the columns of W
?
form an orthonormal basis for W
?
.
Then X can be written in the form
X = WY +W
?
Y
?
;
where
Y = W
H
X and Y
?
= W
H
?
X:
By denition kY
?
k = .
The columns of Y can be orthonormalized by setting
^
Y = YQ; where Q = (Y
H
Y )
BnZr
1
2
: (3.3)
Since Y
H
Y +Y
H
?
Y
?
= I, it follows that
kI BnZrQk = 1BnZr
1
p
1BnZr
2

=
1
2

2
and kQ
BnZr1
k =
1
p
1BnZr
2

=
1+
1
2

2
:
Note that since X BnZrW
^
Y = WY
?
+ WY(I BnZrQ)
kX BnZrW
^
Yk  +O(
2
): (3.4)
4. The convergence of Ritz values
Although we will be primarily concerned with Ritz pairs, the only eective way of
choosing a pair from a Rayleigh quotient B is to examine the eigenvalues of B. We
therefore need to know when the eigenvalues of a Rayleigh quotient converge.
It is a surprising fact that the hypothesis  ! 0 is by itself sucient to insure that
B in the algorithm (3.1) contains Ritz values that converge to the eigenvalues of L. We
will establish this result in two stages. First we will show that if  is small then (L)
is a subset of the spectrum of a matrix
~
B that is near B. We will then use Elsner's
theorem to show that B must have eigenvalues that are near those of L.
Theorem 4.1. Let B be the Rayleigh quotient in (3.1), and let
^
Y and Q be dened
by (3.3). Then there is a matrix E satisfying
kEk

p
1BnZr
2
kAk (4.1)
such that (Q
BnZr1
LQ;
^
Y) is an eigenblock of B +E.
6 Analysis of the Rayleigh{Ritz Method
Proof. From the relation AX BnZrXL = 0 we have
W
H
A(W W
?
)

W
H
W
H
?

X BnZrW
H
XL = 0:
Equivalently,
BY + W
H
AW
?
Y
?
BnZrYL = 0:
Postmultiplying by Q, we have
B
^
Y +W
H
AW
?
Y
?
QBnZr
^
YQ
BnZr1
LQ = 0: (4.2)
Set
R = B
^
Y BnZr
^
YQ
BnZr1
LQ: (4.3)
Then it follows from (4.2) that
kRk

p
1BnZr
2
kW
H
AW
?
k 

p
1BnZr
2
kAk: (4.4)
If we now dene
E = R
^
Y
H
;
then it is easy to verify that E satises (4.1) and (B +E)
^
Y =
^
YQ
BnZr1
LQ.
If we now apply Elsner's theorem, we get the following corollary.
Corollary 4.2. Let the eigenvalues ofB be
1
;::: ;
p
. Then there are integersj
1
;::: ;j
`
such that
j
i
BnZr
j
i
j  p(2kAk+kEk)
1BnZr
1
p
kEk
1
p
; i = 1;::: ;`: (4.5)
The right-hand side of (4.5) depends only on kAk and . Hence we may conclude
that as  ! 0 there are always Ritz values that converges to the eigenvalues of L. The
exponent
1
p
in (4.5) means that in the worst case the convergence of the Ritz values
can be slow. Unfortunately, if the eigenvalues of L are defective, we will indeed observe
slow convergence. And even if they are well conditioned, without additional conditions
convergence can still be slow. One such condition will emerge in the next section.
Analysis of the Rayleigh{Ritz Method 7
5. The convergence of Ritz pairs
Having determined that there are Ritz values that converge to the eigenvalues of our
distinguished eigenspace X, we now turn to the convergence of the Ritz pairs. The
chief diculty in establishing convergence is that an eigenspace can have any number of
eigenpairs even when the eigenpairs are required to be orthonormal. For if (L;X) is an
orthonormal eigenpair of A and U is unitary, then (U
H
LU;XU) is also an orthonormal
eigenpair corresponding to the same eigenspace. If we are to speak of convergence,
therefore, we must nd a way of removing the ambiguity in the pairs.
One way is to prove convergence of the Ritz spaces directly, after which we can
choose converging bases for the spaces, whose associated Rayleigh quotients will natu-
rally converge. The problem with this approach is that the convergence conditions must
be phrased in terms of the eigenblock L, which is unknown before convergence. For this
reason, we will take a less direct approach.
We will use the notation and results of Theorem 4.1 and Corollary 4.2. Let (N;Z)
be the eigenpair associated with the eigenvalues of B that converge to those of L. Let
(Z Z
?
) be unitary. From the relation BZ = ZN it follows that

Z
H
Z
H
?

B(Z Z
?
) =

N H
0 C

:
Now let E be such that (Q
BnZr1
LQ;
^
Y) is an eigenblock of B + E. The following
theorem, which shows that under appropriate conditions B + E has an eigenpair near
(N;Z), is an immediate consequence of Theorem V.2.7 in [13].
Theorem 5.1. Let
 =
2kEk
sep(N;C)BnZr2kEk
< 1 (5.1)
Then there is an orthonormal eigenpair (
~
N;
~
Z) and a complementary eigenblock
~
C of
B + E such that
tan
6
(
~
Z;Z)  
and
k
~
N BnZrNk;k
~
CBnZrCk
(1+ )kEk+ kBk
p
1BnZr
2
:
The condition (5.1) of Theorem 5.1 is not automatically satised. Since the only
assumption we have made about W is that it contains a good approximation to X, the
8 Analysis of the Rayleigh{Ritz Method
eigenvalues of C can lie almost anywhere within a circle about the origin of radius kAk.
In particular, it could happen that as  ! 0 the matrix C has a rogue eigenvalue that
converges to an eigenvalue of L so quickly that sep(N;C) is always too small for (5.1)
to be satised. From now on, we will assume that there is a constant  independent of
 such that
sep(N;C)  > 0: (5.2)
We will call this the uniform separation condition. It implies the condition (5.1), at
least for suciently small E. By (2.5) this condition is independent of the orthonormal
bases Z and Z
?
used to dene N and C. Note that in principle the uniform separa-
tion condition can be monitored computationally by computing sep(N;C) during the
Rayleigh{Ritz procedure.
To see how this theorem implies the convergence of eigenblocks, recall that B + E
has the eigenpair (Q
BnZr1
LQ;
^
Y). By construction, as  ! 0 the eigenvalues of N converge
to those of L, and hence by Elsner's theorem so do the eigenvalues of the eigenblock
~
N, which is an increasingly small perturbation of N. But by the the continuity of
sep and the assumption that the spectra of N and C are disjoint, the eigenvalues of
~
N are bounded away from those of
~
C. Hence the eigenvalues of
~
N are the same as
those of Q
BnZr1
LQ; i.e., (
~
N) = (L). But a simple eigenspace is uniquely determined
by its eigenvalues.
1
Hence for some unitary matrix U,
~
Z =
^
YU,
~
X = W
^
YU, and
~
N = U
H
(Q
BnZr1
LQ)U. Since kN BnZr
~
Nk! 0 and Q ! I, we have the following theorem.
Theorem 5.2. Under the uniform separation condition, there is a unitary matrix U
depending on , such that as  ! 0, the eigenpair (UNU
H
;ZU
H
) approaches (L;
^
Y).
Consequently, by (3.4) the Ritz pair (UNU;
~
XU
H
) approaches the eigenpair (L;X).
Thus the uniform separation condition implies the convergence of Ritz pairs, up to
unitary adjustments. In the sequel we will assume that these adjustments have been
made and simply speak of the convergence of (N;
~
X) to (L;X).
By combining the error bounds in Theorems 4.1 and 5.1 and the inequality (3.4)
we can, after some manipulation, establish the following on the asymptotic rate of
convergence of the Ritz pairs.
Corollary 5.3. If the uniform separation condition holds, then
sin
6
(X;
~
X);
kN BnZrLk
kAk


1+2
kAk
sep(L;C)

 +O(
2
): (5.3)
Thus the convergence is at a rate proportional to the rst power of . The main fac-
tor aecting the convergence constant is the reciprocal of the normalized separation
1
Although this fact seems obvious, its proof is nontrivial.
Analysis of the Rayleigh{Ritz Method 9
sep(L;C)=kAk, which by the uniform separation condition is bounded away from zero.
The linear bound for the convergence of N to L does not imply eigenvalues of N con-
verge at the same rate; however, their convergence is at worst as the
1
`
th power of ,
which is better than the rate in Corollary 4.2.
6. Residual analysis
Although the uniform separation condition insures convergence (in the sense of Theo-
rem 5.2) of the Ritz pair (N;
~
X) = (N;WZ) to the eigenpair (L;X), it does not tell
us when to stop the iteration. A widely used convergence criterion is to stop when the
residual
R = A
~
X BnZr
~
XN
is suciently small. Note that this residual is easily computable. For in the course of
computing the Rayleigh quotient B, we must compute the matrix V = AW. After the
pair (N;
~
X) has been determined we can compute R in the form VZ BnZr
~
XN.
We will be interested in the relation of the residual to the accuracy of
~
X = WZ, as
an approximation to the eigenbasis X in the eigenpair (X;L). To do this we will need
some additional notation. Let (X X
?
) unitary. Then

X
H
X
H
?

A(X X
?
) =

L G
0 M

; (6.1)
where the (2;1)-element of the right hand side is zero because X
H
?
AX = X
H
?
XL = 0.
The following theorem of Ipsen [3] holds for any approximate eigenpair.
Theorem 6.1. Let (
~
L;
~
X) be an approximate eigenpair of A and let
 = kA
~
X BnZr
~
X
~
Lk:
Then
sin
6
(
~
X;X)

sep(
~
L;M)
:
Proof. From (6.1) we have X
H
?
A = MX
H
?
. Hence if R = A
~
X BnZr
~
X
~
L, we have
X
H
?
R = X
H
?
A
~
X BnZrX
H
?
~
X
~
L = MX
H
?
~
X BnZrX
H
?
~
X
~
L:
It follows that
sin
6
(X;
~
X) = kX
H
?
~
Xk 
kRk
sep(
~
L;M)
:
10 Analysis of the Rayleigh{Ritz Method
In our application
~
X = WZ and
~
L = N. Hence the accuracy of the space spanned
by WZ as an approximation to the spaceX is proportional tothe size ofthe residual and
inversely proportional to sep(N;M). If the uniform separation condition holds, then up
to unitary similarities N ! L, sothat by the continuity ofsep, the accuracy is eectively
inversely proportional to sep(L;M). Unlike the bounds in the previous sections, these
bounds cannot be computed, since M is unknown. Nonetheless they provide us some
insight into the attainable accuracy of Ritz approximations to an eigenspace.
7. Rened Ritz vectors
When ` = 1, so that our concern is with approximating an eigenvector x and its eigen-
value , Theorem 6.1 has a suggestive implication. From Theorem 2.1, we know that
there is a Ritz pair (;Wz) = (;~x) such that  converges to . Hence by Theorem 6.1,
if the residual A~xBnZr~x approaches zero, ~x approaches x, independently of whether the
uniform separation condition (5.2) holds.
Unfortunately, if the uniform separation condition fails to hold, we will generally be
faced with a cluster of Ritz values and their Ritz vectors, ofwhich atmostone (and more
likely none) is a reasonable approximation to x. Now Theorem 6.1 does not require that
(; ~x) be a Ritz pair|only that  be suciently near , and that ~x have a suciently
small residual. Since the Ritz value  is known to converge to , this suggests that we
can deal with the problem of nonconverging Ritz vectors by retaining the Ritz value
and replacing the Ritz vector with a vector ^x 2 W having a suitably small residual.
It is natural to choose the best such vector. Thus we take ^x to be the solution of the
problem
minimize k(ABnZrI)^xk
subject to ^x 2 W;k^xk = 1:
Alternatively, ^x = Wv, where v is the right singular vector of (ABnZrI)W corresponding
to its smallest singular value. We will call such a vector a rened Ritz vector.
The following theorem shows that the rened Ritz vectors converge as  ! 0.
Theorem 7.1. If
sep(;M)  sep(;M)BnZrj BnZrj > 0; (7.1)
then
sin
6
(x;^x) 
kABnZrIk+jBnZrj
p
1BnZr
2
(sep(;M)BnZrjBnZrj)
: (7.2)
Analysis of the Rayleigh{Ritz Method 11
Proof. Let
^y =
P
W
x
p
1BnZr
2
be the normalized projection of x onto W [cf. (3.3)] and let
e = (I BnZrP
W
)x:
Then
(ABnZrI)^y =
(ABnZrI)P
W
x
p
1BnZr
2
=
(ABnZrI)(xBnZre)
p
1BnZr
2
=
(BnZr)xBnZr(ABnZrI)e
p
1BnZr
2
:
Hence
k(ABnZrI)^yk
kABnZrIk+jBnZrj
p
1BnZr
2
:
By the minimality of ^x we have
k(ABnZrI)^xk
kABnZrIk+jBnZrj
p
1BnZr
2
: (7.3)
Since k(ABnZrI)^xk is a residual norm, (7.2) follows directly from Theorem 6.1 and (2.4).
It follows immediately from (7.2) that if  !  as  ! 0 then the rened Ritz vector
^x converges to the eigenvector x. In particular, by Corollary 4.2 this will happen if  is
chosen to be the Ritz value.
As they are dened, Ritz vectors are computationally expensive. If A is of order n
and the dimension of W is m, a Rayleigh{Ritz procedure requires O(nm
2
) operations
to produce a complete set of m Ritz vectors. On the other hand, to compute a rened
Ritz vector requires the computation of the singular value decomposition of (ABnZrI)W,
which also requires O(nm
2
) operations. Thus in general the computation of a single re-
ned Ritz vector requires the same order of work as an entire Rayleigh{Ritz procedure.
Fortunately, if W is determined by a Krylov sequence, as in the Arnoldi method, this
work can be reduced to O(m
3
) [6, 8]. Moreover, if we are willing to sacrice some ac-
curacy we can compute the rened vectors from the cross product matrices W
H
A
H
AW
12 Analysis of the Rayleigh{Ritz Method
and W
H
AW with O(m
3
) work [10].
2
Thus in some important cases, the computa-
tion of rened Ritz vectors is a viable alternative to the Rayleigh{Ritz procedure for
eigenvectors.
There is a natural generalization ofrened Ritz vectors tohigher dimensions. Specif-
ically, given an approximate eigenblock N we can solve the problem
minimize k(A
^
X BnZr
^
XN)k
subject to R(
^
X)  W;
^
X
H
^
X = I:
The resulting basis satises a theorem analogous to Theorem 7.1. There are, however,
two diculties with this approach. First, there seems to be no reasonably ecient
algorithm for computing rened Ritz bases. Second, for the bound on the rened Ritz
basis toconvergetozero,we musthaveN ! L. However,theonly reasonable hypothesis
under which N ! L is the uniform separation condition, and if that condition is satised
the ordinary Ritz bases also converge.
8. Discussion
We have considered the convergence of Ritz pairs generated from a subspace W to an
eigenpair (L;X) associated with an eigenspace X. The results are cast in terms of the
quantity  = sin
6
(X;W). An appealing aspect of the analysis is that we do not need
to assume that eigenvalues of A are distinct or that A is nondefective.
The rst result shows that as  ! 0, there are eigenvalues of the Rayleigh quotient
B that converge to the eigenvalues of L. Unfortunately, that is not sucient for the
convergence of the eigenpairs, which requires the uniform separation condition (5.2) to
separate the converging eigenvalues from the remaining eigenvalues of B. This con-
dition, which can be monitored during the Rayleigh{Ritz steps, insures that the Ritz
pairs (N;
~
X) converge (with unitary adjustment) to the pair (L;X). The asymptotic
convergence bounds (5.3) show that the convergence is linear in .
When X is one dimensional|that is when we are concerned with approximating an
eigenpair (;x)|theRitz blocks, which becomescalarRitz values, convergewithoutthe
uniform separation condition. However, this condition is required for the convergence of
the Ritz vectors. Alternatively, we can compute rened Ritz vectors, whose convergence
is guaranteed without the uniform separation condition.
2
If the singular values of (ABnZrI)W are 
1
   
m
and 
m
is small, then the loss of accuracy
comes from the fact that the accuracy of the computed singular vector is around (
1
=
mBnZr1
)
M
, where

M
is the rounding unit. If we pass to the cross-product matrix then the ratio 
1
=
mBnZr1
is replaced by
the larger value (
1
=
mBnZr1
)
2
. If the ratio is near one, then the squaring will have little eect. But if the
ratio is fairly large|as it will be when we are attempting to resolve poorly separated eigenvalues |then
considerable accuracy can be lost.
Analysis of the Rayleigh{Ritz Method 13
We have analyzed only the the simplest version of Rayleigh{Ritz procedure. In some
forms of the method, the Rayleigh quotient is dened by V
H
AW, where V
H
W = I. We
expect that the above results will generalize easily to this case provided the product
kVkkWk remains uniformly bounded.
References
[1] B. N. Datta. Numerical Linear Algebra and Applications. Brooks/Cole, Pacic
Grove, CA, 1995.
[2] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University
Press, Baltimore, MD, second edition, 1989.
[3] I. C. F. Ipsen. Absolute and relative perturbation bounds for invariant subspaces
of matrices. Technical Report TR97-35, Center for Research in Scientic Compu-
tation, Mathematics Department, North Carolina State Unversity, 1998.
[4] Z. Jia. Some Numerical Methods for Large Unsymmetric Eigenproblems. PhD
thesis, University of Bielefeld, 1994.
[5] Z. Jia. The convergence of generalized Lanczos methods for large unsymmetric
eigenproblems. SIAM Journal on Matrix Analysis and Applications, 16:843{862,
1995.
[6] Z. Jia. Rened iterative algorithms based on Arnoldi's process for large unsym-
metric eigenproblems. Linear Algebra and Its Applications, 259:1{23, 1997.
[7] Z. Jia. Generalized block Lanczos methods for large unsymmetric eigenproblems.
Numerische Mathematik, 80: 539{566, 1998.
[8] Z. Jia. A rened algorithm based on the block Arnoldi algorithm for large unsym-
metric eigenproblems. Linear Algebra and Its Applications, 270:171{189, 1998.
[9] Z. Jia. Polynomial characterizations of the approximate eigenvectors by the rened
Arnoldi method and an implicitly restarted rened Arnoldi algorithms. Linear
Algebra and Its Applications, 287: 191{214, 1999.
[10] Z. Jia. A rened subspace iteration algorithm for large sparse eigenproblems. Ap-
plied Numerical Mathematics, to appear.
[11] Y. Saad. Numerical Methods for Large Eigenvalue Problems: Theory and Algo-
rithms. John Wiley, New York, 1992. Cited in [1].
14 Analysis of the Rayleigh{Ritz Method
[12] G. L. G. Sleijpen and H. A. Van der Vorst. A Jacobi{Davidson iteration method for
linear eigenvalue problems. SIAM Journal on Matrix Analysis and Applications,
17:401{425, 1996.
[13] G. W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, New
York, 1990.