Approximation Algorithms for Connected Dominating Sets

Sudipto Guha
y
Dept. of Computer Science
University of Maryland,College Park,
MD 20742
Samir Khuller
z
Dept. of Computer Science and UMIACS
University of Maryland,College Park,
MD 20742
Abstract
The dominatingset problem in graphs asks for a minimumsize subset of vertices with the
following property: each vertex is required to either be in the dominating set, or adjacent
to some node in the dominating set. We focus on the question of nding a connected
dominating set of minimum size, where the graph induced by vertices in the dominating set
is required to be connected as well. This problem arises in network testing, as well as in
wireless communication.
Two polynomial time algorithms that achieve approximation factors of O(H()) are
presented, where  is the maximum degree, and H is the harmonic function. This ques-
tion also arises in relation to the traveling tourist problem, where one is looking for the
shortest tour such that each vertex is either visited, or has at least one of its neighbors
visited. We study a generalization of the problem when the vertices have weights, and
give an algorithm which achieves a performance ratio of 3lnn. We also consider the more
general problem of nding a connected dominating set of a specied subset of vertices and
provide an O(H()) approximation factor. To prove the bound we also develop an optimal
approximation algorithm for the unit node weighted Steiner tree problem.
1 Introduction
The connected dominating set problem is dened as follows. Find a minimum size subset S of
vertices, such that the subgraph induced by S is connected and S forms a dominating set. This
problem is known to be NP-hard [7]. Recall that a dominating set is one in which each vertex
is either in the dominating set, or adjacent to some vertex in the dominating set.

A preliminary version of this paper will appear in the Proceedings of the Fourth Annual European Symposium
on Algorithms (ESA 1996).
y
Research supported by NSF Research Initiation Award CCR-9307462. Email addr: sudipto@cs.umd.edu
z
Research supported by NSF Research Initiation Award CCR-9307462, and NSF CAREER Award CCR-
9501355. Email addr: samir@cs.umd.edu
1
A related problem is the traveling tourist problem. Given a graph G = (V;E) nd the
shortest walk visiting a subset of vertices, such that each vertex is either visited, or has at
least one of its neighbors visited. (The vertices of the graph correspond to monuments the
tourist would like to see, and an edge between two vertices denotes visibility of one monument
from another.) The shortest such walk would guarantee that the tourist sees all monuments of
interest.
We show that a  approximation for the connected dominating set problem yields a 2
approximation for the traveling tourist problem. Consider a spanning tree of the connected
dominating set S and perform a tree traversal. This yields a walk in which exactly 2(jSjBnZr 1)
edges are traversed. Any set of vertices visited by the tourist, form a connected dominating
set. Thus S  OPT  OPT
TT
, where OPT
TT
denotes an optimal traveling tourist tour,
and the result follows.
We also study the connected dominating set problem when the vertices have weights, and
we wish to minimize the total weighted sum of the vertices that form the connected dominating
set. This also yields an approximation algorithm for the weighted traveling tourist problem,
where the weights could potentially denote the tourist's cost of buying a ticket to visit the
monument.
We also consider Steiner generalizations, where only a specied subset of vertices have to
be dominated by a connected dominating set.
1.1 Our Results
We present two approximation algorithms for this problem. The rst algorithm develops a
greedy algorithm for solving the problem. A naive greedy algorithm is shown to do badly.
Surprisingly, with a simple modication we are able to show an approximation factor of 2(1+
H()) (in practice, this algorithm appears to do very well). We also provide a very ecient
implementation of this algorithm.
The second algorithm is an improvement of the rst algorithm. The algorithm nds a
dominating set in the rst phase, and in the second phase connects the dominating set. In an
earlier version of this paper [8] we established a bound of H() + H(H()). Using Slav

ik's
greedy set-cover bound [17], we were able to show that the approximation factor is lnn+O(1).
Recently, Berman suggested a modication to the algorithm, which improves the approximation
factor to H() + 2. We describe this algorithm and give a simple proof for a performance
guarantee of ln+3.
We also show an approximation preserving reduction from the set-cover problem to the con-
nected dominating set problem, showing that it is hard to improve the approximation guarantee
unless NP  DTIME[n
O(loglogn)
] [13, 6]. We give a 3lnn approximation for the version when
the vertices have weights. We also show that the upper bound of 2lnk for approximating node
weighted Steiner trees [10], can be improved to lnk, when all Steiner vertices have unit weight.
We then use this result to give a 3lnk approximation for nding a connected dominating set for
a specied subset of vertices. We also outline a second algorithm that gives an approximation
factor of (1 + c)H(min(;k))+ O(1), where c is the best approximation ratio for the Steiner
2
tree problem (currently c = 1:644 [12]). Even though this algorithm has a better approximation
guarantee, it is not practical due to the high running time, albeit polynomial.
1.2 Preliminaries
The Steiner tree problem is dened as follows: given a subset of required vertices in an edge
weighted graph, nd a minimum weight tree spanning the required subset of vertices. (Note
that the tree may include other vertices that are not required vertices.) The node weighted
Steiner tree problem is essentially the same problem, except that the vertices of the graph have
weights associated with them and the weight of the tree is the sum of the weights of its vertices.
The Unit Node Weighted Steiner tree is the special case when all vertices that are not required,
have the same weight. The required vertices all have zero weight.
The set cover problem is the following: given a set of elements U, and a set of subsets S, of
U, we wish to nd the smallest collection of sets S
0
 S such that [
S2S
0
S = U.
The set TSP problem is dened as follows: given an edge weighted graph G = (V;E) and
a partition of V = (V
1
[V
2
[:::[V
k
), nd the shortest tour that contains at least one vertex
from each V
i
.
Given a graph G = (V;E), we use  to denote the maximum degree of a vertex in the
graph. We use n and m to denote the number of vertices and edges in G. We use N(v) to
denote the set of neighbors of a vertex v.
1.3 Applications
The paper by Paul and Miller [15] discusses applications related to testing nodes in a computer
networkusing a short \traveling touristtour". They also consider the related question of nding
a tour that visits each edge of the graph (connected vertex cover). This is needed when one
requires testing the links as well as the nodes. Approximation algorithms for the latter problem
were given by Arkin, Halldorsson and Hassin [1]. We observe that there is a simple algorithm
for the unweighted connected vertex cover problem that gives a factor 2 approximation (the one
given in [1] is more complicated). Do a Depth First Search, and take all the non-leaf vertices as
the nodes in the vertex cover. This clearly induces a connected graph, and the approximation
ratio is 2, as shown by Savage [16]. In practice, however this method will probably give large
connected vertex covers.
Other applications for the connected dominating set problem are in doing broadcasts for
wireless computers in digital battleelds. The broadcast is done to the vertices in the con-
nected dominating set. The nodes in the connected dominating set are responsible for relaying
messages. Each node not in the dominating set, is not responsible for relaying any messages
[9]. Other relevant issues are regarding the maintenance of the connected dominating set as the
network topology changes.
2 Algorithm I
We introduce an algorithm that nds a connected dominating set, by \growing" a tree.
3
The idea behind the algorithm is the following: grow a tree T, starting from the vertex of
maximum degree. At each step we will pick a vertex v in T and \scan it". Scanning a vertex,
adds edges to T from v to all its neighbors not in T. In the end we will nd a spanning tree T,
and will pick the non-leaf nodes as the connected dominating set.
Initially all vertices are unmarked (white). When we scan a vertex (color it black), we mark
all its neighbors that are not in T and add them to T (color them gray). Thus marked nodes
that have not been scanned are leaves in T (gray nodes). The algorithm continues scanning
marked nodes, until all the vertices are marked (grayor black). The set of scanned nodes (black
nodes) will form the CDS in the end.
The main question is the following: what rule should we use for picking a vertex to be
scanned? A natural choice is to pick the vertex that has the maximum number of unmarked
(white) neighbors. We call this the \yield" of the scan step. Unfortunately, as the following
example shows this may not work well (see Fig. 1).
u
v
N(u)
N(v)
Figure 1: Example to show that the scanning rule fails.
Let u and v be vertices of degree d. There is a solution of size four, by picking a path from
u to v as the CDS. The algorithm begins by marking and scanning u. This adds all of u's
neighbors to T. We pick a vertex from N(u) and scan it, adding its only unmarked neighbor
(from N(v)) to T. At this point, each vertex has exactly one unmarked neighbor. We could
pick a vertex from N(u) again, and scan it, adding its only unmarked neighbor to T. This
continues until all the vertices from N(u) have been scanned. Finally we scan a vertex from
N(v) and mark v. At this point, the algorithm has picked d+ 2 vertices.
Implementation Issues: The above algorithm can be implemented in O(m) time (and was
implemented). To achieve this running time, we use a data structure DS that maintains all
gray vertices in T with a key value equal to the number of white neighbors that they have.
Rather than using a heap, we maintain an array of linked lists, where DS[i] is a pointer to a
(doubly linked) list containing all the gray vertices that have exactly i white neighbors. We also
maintain an integer maxd that records the maximum i, such that DS[i] 6= nil. This makes it
easy to locate a gray vertex with the highest \yield". The main work is in updating the value
of maxd when DS[maxd] becomes nil. The work that is done is at most O(maxd), and since
at this step, maxd vertices are colored gray, we can \charge" the work done to the vertices
that are colored gray at this step. (Equivalently, we could develop a potential function to prove
4
this.) The other operations are easy to perform (for example, when a vertex is colored gray,
we need to update the entries for its neighbors that are already in DS and create an entry in
DS for this vertex). The entire algorithm runs in O(m) steps. This implementation is useful
because it leads to a heuristic for the maximum leaf spanning tree problem as well [11].
Modied Greedy Algorithm: We now modify the scanning rule to prove a good approxi-
mation ratio for this class of algorithms (that grow a connected dominating set). We dene a
new operation of scanning a pair of adjacent vertices u and v. Let u be gray and v be white.
Scanning the pair means, rst making u black (this makes v along with some other nodes, gray)
and then coloring v black (makes more nodes gray). The total number of nodes that are colored
gray is called the \yield" of the scan step. At each step, we will either scan a single vertex, or
a pair of vertices, whichever gives the higher yield. (In some sense we are doing a \look-ahead"
by one extra vertex, and are willing to scan a pair, if this has a higher yield.)
It is clear that this algorithm nds the optimal solution in the example shown in Fig. 1.
What is perhaps a little surprising, is that this simple modication lets us prove the following
theorem.
Theorem 2.1 Using the scanning rule described above yields a connected dominating set of
size at most 2(1+ H())jOPT
DS
j.
Proof: Let OPT
DS
be the set of vertices in an optimal dominating set. The sets of vertices
of G dominated by vertex i 2 OPT
DS
is called S
i
(we assume that i also belongs to S
i
. If a
vertex is dominated by more than one vertex, we arbitrarily put it in one of the sets). The
proof will be based on a charging scheme. Each time we scan a vertex, we add a new vertex to
our connected dominating set. We will \charge" each new vertex marked (colored gray) in this
step. Since each vertex in the graph gets marked exactly once, it is charged exactly once (the
rst time it is marked). We will then prove that the total charge on the vertices belonging to a
set S
i
(for any i) is at most 2(1+H()). Since there are jOPT
DS
j sets in the optimal solution,
the theorem follows.
Assume that when we pick a vertex to scan, we mark x new vertices. We will charge each
such newly marked vertex
1
x
. In some steps we scan two vertices, and charge each newly marked
vertex
2
x
. The main advantage of the \look-ahead" is the following. The instant we mark some
nodes in set S
i
, even if vertex i has not been marked, since it is adjacent to a marked vertex, it
becomes eligible to be scanned as part of a pair. Without the look-ahead, only marked vertices
were candidates to be scanned .
We now prove the upper bound on the total charges to vertices belonging to a single set
S
i
. At each step, some vertices may get marked. The number of unmarked vertices is initially
u
0
, and nally drops to 0. Let u
j
denote the number of unmarked vertices after step j. For
simplicity, let us assume that at each step some vertices of S
i
are marked, so the number of
unmarked vertices decreases at each step.
The number of marked vertices after the rst step is u
0
BnZru
1
. Each vertex gets a charge of
at most
2
u
0
BnZru
1
(the actual charge may be a lot smaller, if only one vertex was scanned at this
step, or if we marked many other vertices as well). Once some vertex in S
i
is marked, vertex i
becomes an \eligible" vertex to be scanned as a part of a pair, since it is adjacent to a marked
5
vertex. In the j
th
step, the number of vertices of set S
i
that get marked is u
j
BnZru
j+1
, and the
charge to each vertex is at most
2
u
j
as vertex i was an eligible vertex to be scanned. Let u
k
= 0.
Adding up all the charges we get
2
u
0
BnZru
1
(u
0
BnZru
1
)+
kBnZr1
X
j=1
2
u
j
(u
j
BnZru
j+1
)
 2+2
kBnZr1
X
j=1
(u
j
BnZru
j+1
)
u
j
:
(With some algebraic manipulation (see [5, page 977]), one can show that this is at most
2(1+ H()). 2
Remark: We could modify the algorithm and at each step scan either one or two vertices,
whichever results in a smaller charge to each vertex. In practice, this should give better solu-
tions.
Implementation Issues: A naive implementation appears to give a worst case running time
of O(mn
2
). In each iteration we choose either one vertex, or a pair of vertices, and color them
black. It is clear that we may have (n) iterations, since the optimal solution may have (n)
vertices. In each iteration, we wish to identify a pair of nodes with the highest yield. For each
gray vertex u, we scan its adjacency list and consider all its white neighbors. For each white
neighbor v of u, we wish to determine the number of vertices that would get marked if we
scanned the pair (u;v). Since u and v have common white neighbors, we cannot simply add
up the number of white neighbors of each vertex to obtain the \yield" of this pair. We need to
identify the number of white neighbors of v that are not adjacent to u (since those will not be
colored gray by u). The number of steps in a single iteration can be computed as follows.
Let G be the gray nodes in T. Let W be the white vertices that are adjacent to gray vertices.
We can upper bound the total work done in a single iteration as follows:
S =
X
u2G
X
v2N(u)^v2W
d(v) :
In the double summation each vertex in W is counted as many times as the number of its gray
neighbors, we obtain the following.
S 
X
v2W
d(v)
2

X
v2W
nd(v)  O(mn) :
This yields a bound of O(mn
2
). We now show that the total number of steps over all iterations
is O(mn) by a more careful analysis.
For each vertex we can maintain two adjacency lists, one of its gray neighbors and one of
its white neighbors. We use d
W
(u) to denote the number of white neighbors of u and d
G
(u) to
denote the number of gray neighbors of u. The work done in a single iteration is as follows:
S =
X
u2G
X
v2W^v2N(u)
d
W
(v)
6
=
X
v2W
d
W
(v)d
G
(v) :
(In the double summation, each vertex v is counted as many times as the number of its gray
neighbors.) Observe that at this step, we will make a subset of white vertices gray.
Lemma 2.2 The number of white vertices that are made gray in this iteration is at least
1
2
max
v2W
d
W
(v) :
Proof: We pick the pair of vertices that give the highest \yield"; we certainly consider all such
vertices v, and color their white neighbors gray. We might pick a single vertex with a smaller
yield, but only if its yield is at least half the yield of a pair of vertices. 2
At this step, we can \charge" the vertices whose color changed from white to gray. The
charge to each such vertex is at most
P
v2W
d
W
(v)d
G
(v)
1
2
max
v2W
d
W
(v)
 2
X
v2W
d
G
(v) = 4m :
Since each vertex changes color from white to gray exactly once over the entire algorithm,
and there are n such vertices the total number of steps is O(mn).
The only remaining issue is maintaining the required adjacency lists. This can be done each
time we change the color of a vertex from white to gray by scanning its adjacency list, and
updating the structures for its neighbors.
3 Algorithm II
An alternate approach to growing one connected tree is to grow separate components that
form a dominating set and to then connect them together. One straightforward approach is
to nd a dominating set using a greedy heuristic, and to use a Steiner tree approximation to
connect it. Since members of the optimum connected dominating set along with the members
of the dominating set we found, form a spanning tree, we can prove a performance guarantee of
c(1+ H()), where c is the best approximation ratio for the unweighted Steiner tree problem
(currently c = 1:644 [12]).
For the special case when the required vertices form a dominating set in a graph and all
edges have unit weight, Berman and Furer [3] have announced a new algorithm with c =
4
3
.
Thus we can improve the performance ratio to
4
3
(1 + H()). By applying a simple greedy
strategy to connect the vertices in the dominating set, we proved a bound of H()+H(H())
[8]. Here we present a modication of the above algorithm, as suggested by Berman [2], and
are able to prove a performance guarantee of ln + 3. (Berman has an alternate proof for a
performance ratio of H()+2.)
7
The algorithm runs in two phases. At the start of the rst phase all nodes are colored
white. Each time we include a vertex in the dominating set, we color it black. Nodes that are
dominated are colored gray (once they are adjacent to a black node). In the rst phase the
algorithm picks a node at each step and colors it black, coloring all adjacent white nodes gray.
A piece is dened as a white node or a black connected component. At each step we pick a
node to color black that gives the maximum (non-zero) reduction in the number of pieces.
We show that at the end of this phase if no vertex gives a non-zero reduction to the number
of pieces, then there are no white nodes left.
In the second phase, we have a collection of black connected components that we need to
connect. Recursively connect pairs of black components by choosing a chain of two vertices,
until there is one black connected component. Our nal solution is the set of black vertices
that form the connected component.
Lemma 3.1 At the end of the rst phase there are no white vertices left.
Proof: Suppose there is a white node v at the end of the phase. We will show that there is
a vertex that strictly reduces the number of pieces. If v has a white neighbor then coloring v
black, reduces the number of white nodes by two, and increases the number of black components
by one, thus picking v would reduce the number of pieces. Otherwise, v has a gray neighbor
u. Coloring u black would reduce the number of white nodes, and not increase the number of
black components since u is adjacent to a black node. Thus picking u reduces the number of
pieces. 2
We show that at the end of the rst phase there is always a pair of black components that
can be connected by choosing a chain of two vertices. For each such component i, we consider
the shortest path to component j. The path goes through vertices u
1
;u
2
;u
3
;:::;u
k
not in
components i or j. u
1
is dominated by a vertex in component i. Observe that u
2
is gray,
otherwise picking u
1
would give a strict reduction in the number of pieces. Thus u
2
is adjacent
to a black component ` (` 6= i since we selected the shortest path from i to j). Components i
and ` can be connected by choosing a chain of two vertices.
Theorem 3.2 The connected dominating set found by the algorithm is of size at most (ln+
3)jOPTj.
Proof: Dene a
i
as the number of pieces left after i
th
iteration, and a
0
= n. Since a node can
connect up to  pieces, jOPTj 
a
0

. Consider i + 1
st
iteration. The optimal solution can
connect a
i
pieces. Hence the greedy procedure is guaranteed to pick a node which connects at
least
l
a
i
jOPTj
m
pieces. This gives us the recurrence relation,
a
i+1
 a
i
BnZr

a
i
jOPTj

+ 1  a
i
(1BnZr
1
jOPTj
)+ 1 :
Its solution is,
a
i+1
 a
0
(1BnZr
1
jOPTj
)
i
+
iBnZr1
X
j=i
(1BnZr
1
jOPTj
)
j
:
8
Notice after jOPTj ln
a
0
jOPTj
iterations, the number of pieces left is less than 2  jOPTj. For
each node we choose, we will decrease the number of pieces by at least one. This will continue
until the number of black components is at most jOPTj, thus at most jOPTj more vertices are
picked.
Assume from this point onwards, we stop after choosing a
f
more nodes. The number
of pieces left to connect is at most jOPTj BnZr a
f
. We connect the remaining pieces choosing
chains of two vertices in the second phase. The total number of nodes chosen is at most
jOPTjln
a
0
jOPTj
+jOPTj+a
f
+2(jOPTjBnZra
f
), and since jOPTj 
a
0

, the solution found has
at most jOPTj(ln+ 3) nodes. 2
Remark: Berman, [2], has an alternate proof of H()+2 of the same algorithm. However,
since ln  H()BnZr 0:7, the dierence is very small.
4 Generalizations
4.1 Vertex Weighted Graphs
An approximation factor of 3lnn is possible when the vertices have weights. The algorithm
rst nds a dominating set, and then connects the nodes in the dominating set.
Step 1. Use a weighted set cover approximation algorithm to nd a dominating set DS. (A set
cover instance is created by making each vertex an element, and each vertex corresponds to a
set that contains the vertex itself, together with its neighbors. The greedy algorithm picks sets
based on the ratio of their weight to the number of new elements they cover.)
Step 2. To connect the vertices in DS we use a node-weighted Steiner tree approximation
algorithm due to Klein and Ravi [10] to nd a Steiner tree that includes all the vertices in DS,
after making the weights of all vertices in DS equal to zero. This yields a connected dominating
set CDS.
Theorem 4.1 The weight of vertices in CDS is at most 3lnn  jOPTj where OPT is the
minimum weight connected dominating set in G.
Proof: The weight of the vertices in DS is at most lnjOPTj. We now run the algorithm by
Klein and Ravi [10] for the node-weighted Steiner tree case. The approximation factor of the
algorithm is 2lnk, where k is the number of Steiner vertices. Consider the vertices in OPT;
these together with the vertices in DS induce a connected subgraph. Hence there exists a
node weighted Steiner tree of weight OPT. The total weight of the vertices in the connected
dominating set is the weight of DS together with the weights of optional vertices chosen from
G in the Steiner instance. Adding the weight of the two sets gives the required bound. 2
Before looking at other generalizations, we rst consider a problem closely related to our
discussion.
9
4.2 Unit Node Weighted Steiner Trees
The best known algorithm for node weighted Steiner trees, has a performance ratio of 2lnk,
where k is the number of required vertices [10]. However, if the nodes have unit weight, there
is a simpler algorithm, which gives a better performance ratio.
We have k required vertices in a graph G = (V;E), which we want to connect using the
least number of non-required vertices. We assume that the non-required vertices have weight
1, and the required vertices have weight 0.
Our algorithm runs in two phases. In the rst phase, the algorithm greedily picks high
degree stars (a star is a vertex that has at least two required vertices belonging to distinct
components as neighbors) and merges them, until very few components are left. In the second
phase, the algorithm runs a Steiner tree approximation algorithm with each edge having unit
weight.
In a preprocessing phase we merge all adjacent required vertices into their connected com-
ponents. We pick  = 2c+1 where c is the best approximation ratio for the unweighted Steiner
tree problem.
Algorithm A
Step 1. In each iteration choose a vertex that merges the largest number of required vertices
until we reach a stage that the number of components left to merge is less than
iteration count
lnkBnZr
+e

or no merging is possible.
Step 2. Apply an (edge weighted) Steiner tree approximation algorithm, with each edge having
unit weight.
Theorem 4.2 The above algorithm nds a solution to the unit node weighted Steiner tree
(UNST) problem with an approximation factor of lnk (which is best possible), when the optimal
solution is greater than ce

.
Proof: Assume that the set of components remaining after the rst phase is A
0
. We claim that
there is a Steiner tree with jA
0
j+jOPTj edges. Thus when we apply an (edge weighted) Steiner
tree approximation, we get a tree with at most c(jA
0
j+ jOPTj) edges.
If the number of iterations in the rst phase is r, the nal solution has a cost r+c(jA
0
j+
jOPTj). We now proceed to give a bound on r.
Let a
i
components be left after i
th
iteration. Since jOPTj nodes are capable of merging
these components, for each i, in the i
th
iteration, there must be a node that merges
l
a
iBnZr1
jOPTj
m
components. This gives a bound on a
i
,
a
i
 a
iBnZr1
BnZr

a
iBnZr1
jOPTj

+ 1  a
iBnZr1
(1BnZr
1
jOPTj
)+ 1:
We can easily verify that a
i
 a
0
 (1 BnZr
1
jOPTj
)
i
+
P
iBnZr1
j=0
(1 BnZr
1
jOPTj
)
j
. The second term is a
geometric series that sums to at most jOPTj. Thus when i = (lnkBnZr)jOPTj the rst term
10
is at most e

, and the number of components a
i
 jOPTj+ e


i
lnkBnZr
+ e

. This guarantees
that the number of iterations, r  (lnkBnZr)jOPTj.
If we stop because no merging by stars is possible, then the components have disjoint
neighborhoods, and OPT has to pick at least one vertex from each neighborhood. Thus jA
0
j 
jOPTj. If we stop because the number of components is small, then jA
0
j  jOPTj + e

. In
any case, jA
0
j  jOPTj + e

and this yields a solution of cost at most lnk jOPTj + ce

+
(2cBnZr )jOPTj. Putting  = 2c + 1 gives at most lnk  jOPTj vertices in our solution (when
jOPTj  ce
2c+1
. 2
The optimality of this approximation ratio was established by Berman (see [10]).
We can modify the above algorithm, to run until no further merging is possible.
Algorithm B
Step 1. In each iteration choose a vertex that merges the largest number of required vertices
(at least two).
Step 2. Apply an (edge weighted) Steiner tree approximation algorithm, with each edge having
unit weight.
Theorem 4.3 The above algorithm nds a solution to the unit node weighted Steiner tree
(UNST) problem with an approximation factor of ln+ 2c+1 .
Proof: As before, let a
i
denote the number of vertices left after the i
th
iteration and a
0
= n.
Then after jOPTjln
a
0
jOPTj
; there are at most 2jOPTj components to connect. Hence we will
continue to merge by stars for jOPTj more iterations then the number of components will be
denitely less than jOPTj.
Since each Steiner vertex can be adjacent to at most  required vertices, jOPTj 
a
0

.
If at this stage we use a
f
more iterations before invoking the edge weighted Steiner tree
algorithm, there is a tree with jOPTj BnZr a
f
+ jOPTj edges. So we nd a solution of cost at
most c  (jOPTj BnZr a
f
+ jOPTj). The nal solution has at most jOPTj  ln
a
0
jOPTj
+ jOPTj +
a
f
+ c (jOPTjBnZra
f
+ jOPTj) nodes. Since jOPTj 
a
0

, we get a performance guarantee of
ln+ 2c+ 1 for the algorithm.
2
4.3 Dominating a Subset of Vertices
We now address the connected dominating set problem when we are required to dominate only
a specied subset S of the vertices. The cost of the solution is the size of the smallest connected
dominating set that dominates the vertices in S. (Notice that the objective function is slightly
dierent from the unit node weighted Steiner tree problem, where required vertices have zero
cost. In the Steiner CDS problem, we are charged for all vertices in the nal solution that are
not leaf nodes in the tree that connects S.)
11
Unweighted Graphs
Let jSj = k, and OPT denote the optimal solution. we present two algorithm that solve this
problem. A straightforward strategy is to rst nd a small dominating set A, of the vertices in
S, and to then connect these nodes.
Algorithm A
Step 1. Greedily choose a dominating set of the vertices in S. We can transform this to a
set cover problem in which corresponding to each vertex v we have a set that includes all its
neighbors and itself. The greedy algorithm for set cover yields a dominating set A.
Step 2. For each element in A choose a representative element in S that is adjacent to it. Call
this set R(A). Run the unit node weighted Steiner tree approximation algorithm to nd a
Steiner tree with required set R(A). The nal solution is the union of A, R(A), and the Steiner
tree vertices.
Theorem 4.4 The connected dominating set for the subset S of size k, is at most 3lnk times
the optimal.
Proof: Since we chose the cover greedily, we have that jAj  jOPTj lnk, since OPT forms a
dominating set for S.
Notice that jAj  k. We cannot claim that there is a Steiner tree of size jOPTj connecting
the set A. But there is a Steiner tree of size jOPTj connecting the elements of set R(A), since
the connected dominating set also forms a Steiner tree on the members of S, and R(A)  S.
Notice that jR(A)j jAj  k. Apply Theorem 4.2, and obtain a Steiner tree of R(A), of size at
most jOPTjlnjR(A)j. So the nal solution is of cost less than jAj+jR(A)j+jOPTjlnjR(A)j 
3lnkjOPTj. 2
Algorithm B
Step 1. We modify the greedy set cover algorithm on the set S, to run until no vertex covers
more than one uncovered vertex of S. We call the set of vertices chosen as B.
Step 2. We now choose the uncovered vertices of S , calling this set B
0
.
Step 3. For each member of B, choose a representative element of S that it dominates.
Let this set be R(B). We apply an (edge weighted) Steiner tree approximation, with the set of
required nodes as R(B)[B
0
. The nal solution is the nodes of this tree and the nodes of B.
Theorem 4.5 The connected dominating set for the subset S, is at most (c+ 1)H()+ cBnZr 1
times the optimal (where c is the Steiner ratio). We dene  as the size of the largest subset of
S, adjacent to a node in the graph (  min(;k)).
Proof: By a slight modication to the proof given in [5, page 977] we can prove, jBj (H()BnZr
1) jOPTj. (Since the rst step reduces to nding a set cover with the size of the largest set
being ). Since OPT cannot dominate any two vertices of B
0
by one vertex, jB
0
j  jOPTj.
Notice B [B
0
dominates the set S.
12
Consider the set R(B); there is a Steiner tree with jR(B)j+jB
0
j+jOPTj edges that connects
the nodes of R(B)[B
0
.
Apply an (edge weighted) Steiner tree approximation, with all edges having unit weight, and
nd a tree of size c(jR(B)j+jB
0
j+jOPTj), where c is the Steiner ratio [12]. Since this tree is
edge weighted, it has essentially the same number of nodes, including those of R(B)[B
0
. Since
we havetoadd the vertices ofB aswell, wegetan upper bound ofc(jR(B)j+jB
0
j+jOPTj)+jBj.
Notice that jR(B)j  jBj  (H()BnZr 1)jOPTj, and jB
0
j  jOPTj. This gives us a solution of
cost at most ((c+1)H()+ cBnZr 1)jOPTj. 2
This denitely is a better algorithm in terms of the worst case approximation guarantee.
However the rst algorithms is simpler and faster. Most of the approximation algorithms that
reduce the Steiner ratio below 2, have a high running time [4, 12].
5 Lower Bounds
5.1 Hardness result for Connected Dominating Set
We can prove that the set-cover problem can be reduced to the connected dominating set
problem by an approximation preserving reduction, thus showing that the approximation factor
H() will be hard to improve. This is based on the hardness results for set cover proven by
Lund and Yannakakis [13] and Feige [6].
Given a set cover instance we reduce it to a connected dominating set problem as follows:
Let the set cover instance be to cover the set U, with minimum number of sets from the
collection S = fS
1
;S
2
;:::;S
m
g.
Construct a graph G, that has vertex set U
S
fu;v;v
1
;v
2
;:::;v
m
g. An element e 2 U, and
v
i
has an edge joining them i e 2 S
i
. Each v
i
has an edge to v. u has an edge only to v. (see
Fig. 2)
u
v
v
m
v
1
e
1
e
3
e
2
e
n
Figure 2: Reduction of set cover to connected dominating sets.
Let us look at a minimum connected dominating set of G. Vertex v belongs to any con-
nected dominating set, and hence u does not belong to any minimal connected dominating set.
13
No vertex e
j
is chosen in a minimal connected dominating set, since any node that it might
potentially dominate, is already dominated by v, which also provides the connectivity. Hence
we will only have v and some v
i
's. These v
i
's will correspond to the minimum cover for the
given instance of set cover.
The size of the connected dominating set is one more than the minimum set cover. Thus ap-
proximating the connected dominating set with a factorof (1BnZr)H() would mean approximat-
ing minimum setcoverwithin thesamefactor. This wouldimply thatNP  DTIME[n
O(loglogn)
]
[6].
5.2 Hardness results for Generalizations
We show two simple reductions, that demonstrate that other generalizations of the CDS prob-
lem may be as hard to approximate as the \set TSP" problem for which no approximation
algorithms are known. (For the Euclidean case, Mata and Mitchell [14] have given approxima-
tion algorithms for this problem.)
c
j
V
j
Figure 3: Reduction of set TSP problem to edge weighted CDS
Theorem 5.1 A polynomial approximation algorithm for the edge weighted connected domi-
nating set problem with factor f(n) would imply a polynomial approximation algorithm for the
set TSP problem with factor 2f(n).
Proof: We show how to reduce the set TSP problem to the edge weighted connected dominating
set problem. Consider a set TSP instance G = (V;E) where V = (V
1
[V
2
[:::[V
k
). For each
subset V
j
, introduce a special vertex c
j
, and add edges from c
j
to all v 2 V
j
, with very high
cost edges. For u;v 2 V
j
, if (u;v) 62 E, add the edge (u;v) with very high cost. Call this new
graph G
0
.
14
Any set TSP tour in G chooses at least one vertex of V
j
to visit. Thus all nodes of V
j
[fc
j
g
will be dominated by the corresponding node in the tour. Since every node of G occurs in some
V
j
, this yields a dominating set. Since these are nodes on a tour, they also form a connected
set. Hence OPT
CDS
 OPT
TOUR
.
If we have a connected dominating set of G
0
, then it must have a vertex of V
j
to dominate
c
j
. Hence the dominating set must have at least one vertex from each set V
j
. If the cost of this
connected dominating set is small ( f(n)OPT
CDS
), since we are not using the high cost edges
in G
0
, we are using only the edges of the graph G. By traversing this tree twice, we can produce
a tour in G, with cost at most 2f(n)OPT
CDS
 2f(n)OPT
TOUR
. Thus, if we can approximate
the connected dominating set with edge weights to a factor f(n), we can approximate set TSP
within a factor 2f(n). 2
Theorem 5.2 A polynomial approximation algorithm for the node weighted Steiner connected
dominating set problem with factor f(n) would imply a polynomial approximation algorithm for
the set TSP problem with factor 2f(n).
Proof: The proof is similar to the proof of the previous theorem. Given a set TSP instance
G = (V;E) where V = (V
1
[ V
2
[ :::[ V
k
) we construct a graph G
0
. First convert the edge
weights of the set TSP problem into node weights. For every edge e = (v
p
;v
q
) 2 E, create an
extra node v
pq
of the same cost, connected to v
p
and v
q
. All other nodes are given 0 cost. For
every subset V
j
, introduce a special vertex c
j
(of very high cost), and connect it to all v 2 V
j
.
We show that the problem reduces to nding a node weighted connected dominating set of the
subset U = fc
j
jj = 1:::kg of nodes of G
0
.
Any set TSP tour in G chooses at least one vertex of V
j
to visit. Thus each c
j
will be
dominated. The weight of the edges e = (v
p
;v
q
) translates to the weight of the corresponding
vertices v
pq
. Since the nodes form a tour, they also form a connected set in G
0
, together with
the new nodes that subdivide edges. Thus OPT
CDS
 OPT
TOUR
.
Consider a connected dominating set that dominates U. To dominate c
j
, it must pick a
vertex from V
j
. (W.l.o.g, the connected dominating set does not contain c
j
.) If the cost of
this connected dominating set is small ( f(n)OPT
CDS
), (since we are not using the high cost
nodes in G
0
), we are using only the nodes of the graph G along with nodes that correspond
to the subdivided edges. Thus the dominating set chooses vertices that are also in G, and the
corresponding vertices for each edge of G that it includes. This yields a tree that connects
at least one element from each V
j
using edges of G. By traversing this tree twice, we can
produce a tour in G, of cost at most 2f(n)OPT
CDS
 2f(n)OPT
TOUR
. Thus, if we able to
approximate the connected dominating set of a subset with node weights to a factor f(n), we
can approximate set TSP within a factor 2f(n). 2
Acknowledgements: We thank Ray Miller and Azriel Rosenfeld for rst mentioning the
traveling tourists problem. We thank Vaduvur Bharghavan for re-igniting our interest in the
connected dominating sets problem. We would like to thank Estie Arkin, and Randeep Bhatia
for useful discussions. We thank Balaji Raghavachari and Serge Plotkin for raising the questions
about the vertex weighted case, and the Steiner case respectively. We thank Piotr Berman for
allowing us to include his improvement to Algorithm II.
15
References
[1] E. Arkin, M. Halldorsson and R. Hassin, \Approximating the tree and tour covers of a
graph", Information Processing Letters, 47: 275{282, (1993).
[2] P. Berman, personal communication, May (1996).
[3] P. Berman and M. Furer, personal communication, May (1996).
[4] P. Berman and V. Ramaiyer, \Improved approximation algorithms for the Steiner tree
problem", J. Algorithms, 17:381{408, (1994).
[5] T.Cormen, C. Leiserson, and R. Rivest, Introduction to Algorithms, The MIT Press, 1989.
[6] U. Feige, \A threshold of lnn for approximating set-cover", 28th ACM Symposium on
Theory of Computing, pages 314{318, (1996).
[7] M. R. Garey and D. S. Johnson, \Computers and Intractability: A guide to the theory of
NP-completeness", Freeman, San Francisco (1978).
[8] S. Guha and S. Khuller, \Approximation algorithms for connected dominating sets", Proc.
of 4th Annual European Symposium on Algorithms, (1996).
[9] A. Kothari and V. Bharghavan, \Algorithms for unicast and multicast routing in ad-hoc
networks", manuscript.
[10] P. N. Klein and R. Ravi, \A nearly best-possible approximation algorithm for node-
weighted Steiner trees", J. Algorithms, 19(1):104{114, (1995).
[11] D. Kleitman and D. West, \Spanning trees with many leaves", SIAM Journal on Discrete
Mathematics, 4(1):99{106, (1991).
[12] M.Karpinsky and A.Zelikovsky, \New approximationalgorithmsfortheSteiner treeprob-
lem", Technical Report, Electronic Colloquium on Computational Complexity (ECCC):
TR95-030, (1995).
[13] C. Lund and M. Yannakakis, \On the hardness of approximating minimization problems",
Journal of the ACM, 41(5): 960{981, (1994).
[14] C. S. Mata and J. S. B. Mitchell \Approximation algorithms for geometric tour and
network design problems", Proc. of the 11th Annual Symp. on Computational Geometry,
pages 360{369, (1995).
[15] S. Paul and R. Miller, \Locating faults in a systematic manner in a large heterogeneous
network", IEEE INFOCOM, pages 522{529, (1995).
[16] C. Savage, \Depth-First search and the vertex cover problem", Information Processing
Letters, 14(5): 233{235, (1982).
[17] P. Slav

ik \A tight analysis of the greedy algorithm for set cover" 28th ACM Symposium
on Theory of Computing, pages 435{441, (1996).
16