
Saint Louis University

Computer Science 314
Algorithms

Michael Goldwasser

Spring 2008

Dept. of Math & Computer Science

Homework Assignment 04

Strongly Connected Components

Overview

Topic: Strongly Connected Components
Related Reading: class notes
Due: Monday, 10 March 2008

Please make sure you adhere to the policies on academic integrity.


Problem to be Submitted (40 points)

You are to implement the algorithm for computing all strongly-connected components of a given directed graph. This algorithm was discussed in the excerpt from Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein. In short, it was described as follows:

  1. Call DFS(G) to compute finishing times f[u] for each vertex u
  2. Compute G^T, that is, the "transpose" graph (a copy of G with the orientation of every edge reversed)
  3. Call DFS(G^T), but start and restart the search by considering vertices in order of decreasing f[u] (as computed in step 1)
Each tree in the depth-first forest formed by step 3 is a strongly-connected component of G.
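
As a rough illustration of the three steps above, here is a minimal Python sketch that only counts the components. The function name count_scc and the edge-list argument are assumptions made for this example and are not prescribed by the assignment. Both adjacency structures (G and its transpose) are built in one pass, and the searches use explicit stacks so that deep graphs do not run into Python's recursion limit.

```python
def count_scc(n, edges):
    """Count strongly-connected components of a digraph on vertices 0..n-1.

    `edges` is a list of (origin, destination) pairs (a hypothetical
    representation chosen for this sketch).
    """
    adj = [[] for _ in range(n)]   # adjacency lists of G
    radj = [[] for _ in range(n)]  # adjacency lists of the transpose of G
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Step 1: DFS on G, recording vertices in order of increasing finish time.
    visited = [False] * n
    finish_order = []
    for s in range(n):
        if visited[s]:
            continue
        visited[s] = True
        stack = [(s, 0)]  # (vertex, index of its next neighbor to examine)
        while stack:
            u, i = stack.pop()
            if i < len(adj[u]):
                stack.append((u, i + 1))
                v = adj[u][i]
                if not visited[v]:
                    visited[v] = True
                    stack.append((v, 0))
            else:
                finish_order.append(u)  # all neighbors handled: u is finished

    # Steps 2 and 3: search the transpose, restarting in order of decreasing
    # finish time. Each restart discovers exactly one component, so any
    # traversal that collects the reachable unvisited vertices suffices here.
    visited = [False] * n
    count = 0
    for s in reversed(finish_order):
        if visited[s]:
            continue
        count += 1
        visited[s] = True
        stack = [s]
        while stack:
            u = stack.pop()
            for v in radj[u]:
                if not visited[v]:
                    visited[v] = True
                    stack.append(v)
    return count
```

Every vertex and every edge is handled a constant number of times in each pass, which is what keeps the sketch within the linear time bound.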

Output: Your program should report the number of strongly-connected components for the given graph (see extra credit for a discussion of more verbose output).

Running Time: If done properly, your entire program should run in time O(n+m), where n is the number of vertices and m is the number of edges.


Input Format

The first line will be an integer designating the number of vertices of the graph. The second line will be an integer designating the number of edges in the graph. Each subsequent line describes a single edge of the graph as two integers: the first is the ID of the origin of the edge, and the second is the ID of the destination of the edge. Note: vertices are presumed to be identified from 0 to n-1 in this format.

As an example, the graph portrayed on page 553 of the notes could be described as

8
13
0 1
1 2
2 3
3 2
2 6
3 7
1 5
4 0
4 5
5 6
6 5
6 7
1 4
Please note that edges may be given in arbitrary order.
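
One possible way to read this format is sketched below in Python. The function name parse_graph and the (n, edges) return shape are assumptions for illustration only. Because every token in the file is an integer, the sketch splits on whitespace rather than relying on line boundaries.

```python
def parse_graph(text):
    """Parse the input format into (n, edges).

    `text` is the full file contents; `edges` is a list of
    (origin, destination) pairs in the order given.
    """
    tokens = text.split()
    n = int(tokens[0])  # number of vertices
    m = int(tokens[1])  # number of edges
    if len(tokens) < 2 + 2 * m:
        raise ValueError("fewer edges than the header promises")
    edges = []
    for i in range(m):
        u = int(tokens[2 + 2 * i])
        v = int(tokens[3 + 2 * i])
        if not (0 <= u < n and 0 <= v < n):
            raise ValueError("vertex ID out of range")
        edges.append((u, v))
    return n, edges
```

A caller might use it as `n, edges = parse_graph(open(path).read())`, where `path` names one of the input files.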


Data Sets

For convenience, we provide a set of sample files of varying sizes (although we reserve the right to test your program on other scenarios as well). The above example is in a file named graph_clrs553.txt. The other graphs are provided in files named according to the respective number of vertices, edges, and components. For example, the input file graph_100_800_26.txt is a graph with 100 vertices, 800 edges, and 26 strongly-connected components.
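
Since the file names encode the expected answer, they can double as a self-check while testing. The following Python sketch assumes the naming pattern shown above; the helper name expected_from_filename is hypothetical, and names that do not match the pattern (such as graph_clrs553.txt) simply yield None.

```python
import re

def expected_from_filename(name):
    """Return (vertices, edges, components) encoded in a sample file name,
    or None if the name does not follow the graph_V_E_C.txt pattern."""
    match = re.fullmatch(r"graph_(\d+)_(\d+)_(\d+)\.txt", name)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())
```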

If you wish to examine the actual makeup of the strongly-connected components, we provide corresponding answer keys (e.g., answer_100_800_26.txt). The precise format of those files is described in the extra credit challenge.


Extra Credit

For the required part of the assignment, your program must simply report the number of components. For extra credit, we want you to produce output detailing the contents of the various components. But in order to more easily compare the results, it is convenient to list them in a canonical form.

We have chosen the following form. Each component should be reported on a single line, with the vertex IDs from that component given in increasing order, separated by spaces. The various components should themselves be listed in increasing order of their first element. As an example, here is the output from the file answer_100_800_26.txt in our data set.

0 14 16 20 35 42 45 48 49 54 60 74 75 86 93 99
1 3 4 5 9 10 15 19 21 25 29 37 51 58 63 65 72 84 95 98
2 12 13 18 24 31 39 44 47 57 64 68 78 97
6 11 17 26 28 36 40 41 43 46 59 61 70 76 77 79 83 88 89 91
7
8 90
22
23 30 52 55 62 71
27
32
33
34 67
38
50
53
56
66
69
73
80
81
82
85
87 96
92
94

The extra credit is to produce this output while still maintaining the O(n + m) running time. This means that you cannot use the standard comparison-based sorting algorithms for sorting any of your data. Fortunately, there are linear-time ways to sort elements whose keys are integers in a known range. Come talk to me about this if you are interested.
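
One linear-time idea, offered only as a sketch and not necessarily the approach intended here, is a single bucketing pass over the vertex IDs in increasing order. It assumes you already have a list comp in which comp[v] is an arbitrary integer label in the range 0 to n-1 for the component containing v; that list and the function name canonical_components are assumptions of this example.

```python
def canonical_components(comp):
    """Group vertices by component label in canonical order.

    Vertices are scanned in increasing ID order, so each group comes out
    sorted, and groups are created in order of their smallest member.
    """
    n = len(comp)
    bucket_of = [-1] * n  # label -> position of its group in `result`
    result = []
    for v in range(n):
        label = comp[v]
        if bucket_of[label] == -1:       # first (smallest) vertex with this label
            bucket_of[label] = len(result)
            result.append([])
        result[bucket_of[label]].append(v)
    return result

def format_components(groups):
    """Render one component per line, IDs separated by single spaces."""
    return "\n".join(" ".join(str(v) for v in group) for group in groups)
```

The pass touches each vertex once and performs no comparisons between keys, so it adds only O(n) work on top of the component computation.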

If you attempt this challenge, please make sure that there is still a way to run the program from the command line so that it produces only the summary number of components (when testing your software on large data sets, this verbose output becomes too cumbersome and adversely affects the clock time).
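
For instance, the two modes could be separated by an optional command-line flag. The Python sketch below uses the standard argparse module; the flag name --verbose and the positional argument graph_file are hypothetical choices for this example, not requirements of the assignment.

```python
import argparse

def build_parser():
    """Build a parser whose default mode prints only the component count."""
    parser = argparse.ArgumentParser(
        description="Count strongly-connected components of a directed graph.")
    parser.add_argument("graph_file",
                        help="path to a graph in the input format")
    parser.add_argument("--verbose", action="store_true",
                        help="also list the vertices of each component")
    return parser
```

A main routine could then call `args = build_parser().parse_args()` and print the detailed listing only when `args.verbose` is true.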


Michael Goldwasser © 2008
CSCI 314, Spring 2008
Last modified: Wednesday, 27 February 2008