Minimum Spanning Trees

A minimum spanning tree is the same as any other fully connected graph, except that each of the edges (not pathes) is minimal. Note that this does not mean all the paths are minimal. The minumum spaning tree is the (minimal) set of edges required to join the graph. A further constraint is that the minimum spanning tree contains no cycles. This is because a cycle means there's an extra edge somewhere that isn't required.

Take for example this graph. Its got 7 nodes, most of which with 3 or 4 edges connecting them to other nodes. Clearly, this is not a minimum spanning tree. However, if we remove all the edges labeld with red, we will have a min spanning tree.

Take, for example, this simple graph. First, note that the shortest path from any one node to any other node is 1. For a minimum spanning tree, we aren't interrested in shortest pathes. Our concern is having each node connected to the minumum spanning tree with the cheapest edge possible.
This is one possiblity of a minimum spanning tree.
There are two other correct solutions.
One with edges { (A,B), (A,C) } and { (A,C), (B,C) }
The solution set is dependent on how a minimum path is chosen.

Algorithms

We'll cover two different algorithms. Kruskal, and Prim's. Kruskal's algorithm is easiest to hand trace, but more difficult to code. Prim's is the opposite. Pseudo-code is in your text book. See pages 357-361 (or there abouts, I don't have a book with me right now).

Kruskal's Algorithm

Kruskal's algorithm searches throught the edges, picks the smallest one, and adds it to the tree, provided that edge doesn't create a cycle. Recall that a cycle is the condition where you can follow a series of edges and arrive back where you started without backtracking. Pseudo-code for Kruskal's Algorithm follows:
T = {}	// T is the set of pairs (edges) that make up the min span tree

while ( T has < n-1 edges and E not empty ) {

  pick (u,v) from E such that its weight is minimal

  delete (u,v) from E

  if adding (u,v) doesn't create a cycle, add it to T

}

Remarks

On completion, T will contain a list of pairs. E is a list of pairs, sorted in ascending order according to weights. For the graph above, we have the following list for E:
Weight (edge)
1  ( 1, 7 )
1  ( 2, 4 )
2  ( 0, 2 )
2  ( 4, 5 )
3  ( 0, 3 )
3  ( 2, 5 )
4  ( 1, 2 )
4  ( 4, 6 )
5  ( 0, 1 )
5  ( 5, 7 )
7  ( 0, 4 )
10 ( 3, 4 )
21 ( 3, 6 )
Trace through Kruskal's algorithm:
  • Initially, T is empty, so the first edge (1,7) is added.
  • The next edge, (2,4) is also added, since its addition to T does not create a cycle. At this point the spannning tree contains two, unconnected edges.
  • The process continues, adding the edges (0,2), (4,5), (0,3)
  • The addition of next edge, (2,5) will create a cycle (T already contains (2,4) and (4,5), so adding (2,5) is unnecessary).
  • The algorithm terminates when we've emptied out the list E. The list T will contain { (1,7), (2,4), (0,2), (4,5), (0,3), (1,2), (4,6) }, which are exactly the edges that make up the minimum spanning tree. Each edge is the cheapest one that connects that node to the rest of the tree.

  • Prim's Algorithm

    Prim's algorithm takes a different approach. It maintains two lists, TV which is the nodes already in the tree, and T, the list of edges that makes up the spanning tree. In determining cantidate edges for the tree, we look for a node that's in TV, and on that isn't, such that its path is minimum.
    TV = { 0 }
    
    T = { }
    
    while( T has < n-1 edges ) {
    
      find (u,v) with least cost, such that
    	u is in TV and v isn't in TV
    
      if no such edge exists, break
    
      add v to TV
    
      add (u,v) to T
    }
    
    On termiantion, if the graph is connected, TV will contain all the nodes in the graph, and T will contain the set edges comprising the minimum spanning tree.
    Tracing through the algorithm, we get the following:
    TVT
    { 0 } { }
    { 0, 2 } { (0,2) }
    { 0, 2, 4 } { (0,2), (2,4) }
    { 0, 2, 4, 5 } { (0,2), (2,4), (4,5) }
    { 0, 2, 4, 5, 3 } { (0,2), (2,4), (4,5), (0,3) }
    { 0, 2, 4, 5, 3, 1 } { (0,2), (2,4), (4,5), (0,3), (2,1) }
    { 0, 2, 4, 5, 3, 1, 7 } { (0,2), (2,4), (4,5), (0,3), (2,1), (1,7) }
    { 0, 2, 4, 5, 3, 1, 7, 6 } { (0,2), (2,4), (4,5), (0,3), (2,1), (1,7), (4,6) }
    On the last step, it is impossible to find a pair (u,v) where u is in TV, and v isn't, because we have all the nodes already in TV.