Saint Louis University | Computer Science 314 | Dept. of Math & Computer Science
Topic: Divide and Conquer
Related Reading: Ch. 5
Due: Monday, 9 April 2007, 12:00 noon
Please make sure you adhere to the policies on academic integrity.
See "Solved Exercises" in Chapter 5 of text
Exercise 1 of Chapter 5 (p. 246)
Exercise 3 of Chapter 5 (p. 247)
You are to implement both the O(n^2) brute-force and the O(n log n) divide-and-conquer algorithms for finding the closest pair of points in two dimensions (you may use the programming language of your choice).
We have created several random data sets that you can use for testing. Please report the running times that your program achieves on each of the data sets, for both the efficient and brute-force solutions (of course, once the brute-force solution becomes unreasonably slow, you do not need to wait for it to complete).
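One possible way to collect the running times is a small timing wrapper. The following is only a sketch; the helper name time_call is hypothetical and not part of the assignment, and it measures wall-clock time with Python's standard time.perf_counter.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) once and return (result, elapsed_seconds).

    time_call is a hypothetical helper name used only for illustration.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Example: time a built-in call; substitute your own closest-pair function.
    data = list(range(100000, 0, -1))
    result, seconds = time_call(sorted, data)
    print("sorted %d items in %.4f seconds" % (len(result), seconds))
```

Timing only the algorithm (after the input file has been read) keeps file I/O out of the reported numbers.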
Input File Format: The first line is an integer that specifies the number of points in the set. Each additional line specifies a single point, given as three fields of the form:
xval yval stringTag
where xval and yval are integers and the remainder of the line (if any) serves as the stringTag. Here is a simple sample file:
5
8763 4188 Chicago
9038 3875 St. Louis
9458 3912 Kansas City
8628 3973 Indianapolis
9365 4153 Des Moines
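As an illustration of the format described above, here is a minimal parsing sketch in Python. The function name parse_points and the choice to represent each point as an (x, y, tag) tuple are assumptions made for this example, not requirements of the assignment.

```python
from typing import List, Tuple

Point = Tuple[int, int, str]  # (xval, yval, stringTag); representation is an assumption


def parse_points(text: str) -> List[Point]:
    """Parse the contents of an input file into a list of (x, y, tag) tuples."""
    # Ignore blank lines so a trailing newline does not cause trouble.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0])  # first line: number of points
    points = []
    for line in lines[1:1 + n]:
        # Split off the two integers; whatever remains (possibly with spaces)
        # is the tag, e.g. "Kansas City".
        fields = line.split(None, 2)
        x, y = int(fields[0]), int(fields[1])
        tag = fields[2].strip() if len(fields) > 2 else ""
        points.append((x, y, tag))
    return points


SAMPLE = """5
8763 4188 Chicago
9038 3875 St. Louis
9458 3912 Kansas City
8628 3973 Indianapolis
9365 4153 Des Moines
"""

if __name__ == "__main__":
    for point in parse_points(SAMPLE):
        print(point)
```

Limiting the split to two cuts keeps multi-word tags such as "St. Louis" intact.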
Advice: The brute force approach should be easy to implement and can serve as a sanity check. Of course it will not be able to handle the larger data sets we provide.
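A brute-force check along these lines might look like the following sketch. The function name and the decision to return the squared distance (which stays an exact integer for integer coordinates) are choices made for this example only.

```python
from itertools import combinations


def closest_pair_brute_force(points):
    """Return (squared_distance, p, q) for the closest pair, checking all pairs.

    Points are sequences whose first two entries are the x and y coordinates.
    Runs in O(n^2) time, so it is only practical for small inputs.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    best = None
    for p, q in combinations(points, 2):
        d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
        if best is None or d2 < best[0]:
            best = (d2, p, q)
    return best


if __name__ == "__main__":
    sample = [
        (8763, 4188, "Chicago"),
        (9038, 3875, "St. Louis"),
        (9458, 3912, "Kansas City"),
        (8628, 3973, "Indianapolis"),
        (9365, 4153, "Des Moines"),
    ]
    print(closest_pair_brute_force(sample))
```

Comparing squared distances avoids floating-point square roots, which makes it easier to compare the result exactly against the efficient implementation.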
Getting the divide-and-conquer approach to work correctly and efficiently is a bit of an engineering challenge. I will say that my implementation (in Python on turing) solves the 100,000 point data set in under 12 seconds and the 1 million point data set in approximately 3 minutes (just imagine how quickly it might run if implemented in C). Besides getting the high-level algorithm correct, a major impact on the efficiency is the underlying memory management. In fact, in practice this will almost surely be the bottleneck. So think carefully about how much memory you use and how it is managed.
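To make the memory advice concrete, here is one possible sketch of the divide-and-conquer approach in Python. It is not the instructor's implementation; it simply illustrates one way to limit allocation: sort by x once, recurse on index ranges of a single shared list rather than on copied sublists, and reuse one scratch buffer to merge each range into y order. All names are hypothetical, and distances are kept squared so the arithmetic stays in integers.

```python
def _d2(p, q):
    """Squared Euclidean distance between points p and q."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def closest_pair(points):
    """Return (squared_distance, p, q) for the closest pair in O(n log n) time."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    pts = sorted(points, key=lambda p: (p[0], p[1]))  # sorted by x exactly once
    buf = [None] * len(pts)                           # one shared merge buffer

    def solve(lo, hi):
        # Handles pts[lo:hi], which arrives sorted by x.
        # On return, pts[lo:hi] is sorted by y (as in merge sort).
        if hi - lo <= 3:
            best = (float("inf"), None, None)
            for i in range(lo, hi):
                for j in range(i + 1, hi):
                    d = _d2(pts[i], pts[j])
                    if d < best[0]:
                        best = (d, pts[i], pts[j])
            pts[lo:hi] = sorted(pts[lo:hi], key=lambda p: p[1])
            return best

        mid = (lo + hi) // 2
        mid_x = pts[mid][0]  # read before recursion reorders the range by y
        left = solve(lo, mid)
        right = solve(mid, hi)
        best = left if left[0] <= right[0] else right

        # Merge the two y-sorted halves through the shared buffer.
        i, j = lo, mid
        for k in range(lo, hi):
            if j >= hi or (i < mid and pts[i][1] <= pts[j][1]):
                buf[k] = pts[i]
                i += 1
            else:
                buf[k] = pts[j]
                j += 1
        pts[lo:hi] = buf[lo:hi]

        # Points within the current best distance of the dividing line, in y order.
        strip = [pts[k] for k in range(lo, hi)
                 if (pts[k][0] - mid_x) ** 2 < best[0]]
        for a in range(len(strip)):
            p = strip[a]
            for b in range(a + 1, len(strip)):
                q = strip[b]
                if (q[1] - p[1]) ** 2 >= best[0]:
                    break  # later points are even farther away in y
                d = _d2(p, q)
                if d < best[0]:
                    best = (d, p, q)
        return best

    return solve(0, len(pts))
```

The strip list is still allocated at each level; whether that matters, and whether further savings are worthwhile, is something to measure on the larger data sets rather than assume.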
In Exercise 3 of the text, you developed an O(n log n) time algorithm for finding an equivalent cluster of cards.
It is actually possible to solve this problem in O(n) worst-case time. Develop and analyze such an algorithm.