Course Home | Homework | Lab Open Hours | Programming | Schedule & Lecture Notes | Submit

Saint Louis University

Computer Science 180
Data Structures

Michael Goldwasser

Spring 2007

Dept. of Math & Computer Science

Programming Assignment 07

Birthdays

Due: Thursday, 26 April 2007, 8pm

Please see the general programming webpage for details about the programming environment for this course, guidelines for programming style, and details on electronic submission of assignments.

Collaboration Policy

For this assignment, you must work individually in regard to the design and implementation of your project.

Please make sure you adhere to the policies on academic integrity in this regard.


The files you may need for this assignment can be downloaded here.



Overview

Our textbook provides a simple version of a binary search tree with a find function that looks for an exact match for a target item. More generally, search trees are a very powerful concept which can be used to provide a great deal more functionality. They are used in databases for finding exact matches and partial matches.

For this assignment, we will develop a program which keeps track of people's birthdays, allowing us to perform searches based on name or birthdate, including partial matches. From a technical standpoint, we will rely upon two separate search trees which use strings as the data type. You will be responsible for rewriting the find method so that it identifies all entries of a tree whose string begins with a given query string.

We will leave it to you to develop the precise algorithmic intuition for this goal. As an example, consider the tree pictured in Figure 8.14 of our text on page 468. How might you search for all entries which begin with the substring "m"? All entries which begin with the substring "th"?


Formal Requirements

In the original version of find, there could be at most one answer (as duplicates are not allowed in the tree). For this reason, the signature of the function was designed to return a pointer to an item (with NULL returned in the case no match was found).

For this assignment, there may be many possible matches to a query. Rather than return those directly, we have chosen the following signature and semantics.

  /** Find all entries which contain target as a prefix.
      @param target The item sought
      @param results An initially empty vector in which results should be pushed
  */
  void find(const Item_Type& target, std::vector<Item_Type>& results) const;

The caller provides a reference to an initially empty vector. The responsibility of the function is to push all matching results (if any) onto the vector.
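
To make the calling convention concrete, here is a minimal sketch of how a caller might use a find with this shape. Toy_Collection and query are hypothetical names invented for this illustration; they are not part of the provided files, and the linear scan inside find only demonstrates the semantics of pushing results, not the efficient tree search the assignment asks for.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the course's Binary_Search_Tree. Only the
// calling convention of find is the point of this sketch.
class Toy_Collection {
public:
    void insert(const std::string& item) { items.insert(item); }

    // Pushes every stored string that begins with target onto results.
    // This visits every entry, so it shows the semantics only.
    void find(const std::string& target,
              std::vector<std::string>& results) const {
        for (const std::string& item : items) {
            if (item.compare(0, target.size(), target) == 0)
                results.push_back(item);
        }
    }

private:
    std::set<std::string> items;
};

// Example caller: it creates the empty vector, passes it by reference,
// and inspects it afterwards. An empty vector means nothing matched.
std::vector<std::string> query(const Toy_Collection& c,
                               const std::string& prefix) {
    std::vector<std::string> results;  // initially empty, as required
    c.find(prefix, results);
    return results;
}
```

Because the results arrive through the vector, "no match" needs no special return value: the caller simply checks whether the vector is still empty.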


Driver

As usual, we are providing a driver and a makefile to help with some mundane details. Our driver is called bday and can be used to insert new people into the database as well as to query the database (for simplicity, we are not worrying about erasing entries).

Most significantly, our driver actually makes use of two different instances of the Binary_Search_Tree class: one which is used for querying by name and another when querying by birthdate. It does this by making clever use of data strings. As an example, consider Jack Nicholson who was born on April 22, 1937. When dealing with the bday-based tree, we insert the string "1937-04-22 Nicholson, Jack". By this convention, we can find all people born on that day in history by searching for all entries which begin with the substring "1937-04-22". In fact, we can even find all entries born in 1937 by searching for the initial substring "1937".

However, this technique is useless when trying to (efficiently) locate Jack Nicholson's birthday. For this reason, a second tree is created which is based upon names, with strings of the form "Nicholson, Jack 1937-04-22 ". A prefix-search in this tree would allow us to find this person (or even all names starting with "Nicholson, J" if we wished).
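
The two string conventions can be sketched as a pair of small helper functions. The names format_date, make_bday_key, and make_name_key are hypothetical and chosen for this illustration; the provided driver may build its strings differently, including in its exact spacing. The point of the zero padding is that alphabetical order of the date strings then agrees with chronological order.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a date as YYYY-MM-DD with zero padding, so that comparing the
// strings alphabetically gives the same order as comparing the dates.
std::string format_date(int year, int month, int day) {
    std::ostringstream out;
    out << std::setfill('0')
        << std::setw(4) << year << '-'
        << std::setw(2) << month << '-'
        << std::setw(2) << day;
    return out.str();
}

// Key for the birthdate-based tree, e.g. "1937-04-22 Nicholson, Jack".
std::string make_bday_key(int year, int month, int day,
                          const std::string& last,
                          const std::string& first) {
    return format_date(year, month, day) + " " + last + ", " + first;
}

// Key for the name-based tree, e.g. "Nicholson, Jack 1937-04-22".
std::string make_name_key(int year, int month, int day,
                          const std::string& last,
                          const std::string& first) {
    return last + ", " + first + " " + format_date(year, month, day);
}
```

With keys built this way, a query for "1937" matches only entries in the birthdate-based tree, while a query for "Nicholson, J" matches only entries in the name-based tree.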

Batched input

Our driver will allow you to enter individual entries by hand for small-scale testing. However, for large-scale testing of efficiency, we have created a text file with nearly 200,000 birthdays (mostly celebrities). We are distributing a link to this fulllist file with the project, and the driver allows it to be loaded.


Help with C++ string class

Though we have not explored this class very seriously, documentation on it is provided in Chapter P.8 of our text. You may check for partial matches either character by character with your own logic or by taking advantage of the existing methods afforded by the class.
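
Both approaches to the prefix test can be sketched briefly. The function names starts_with_compare and starts_with_loop are hypothetical and used only for this illustration. The first relies on the three-argument form of std::string::compare, which compares a substring of the string (starting position and length) against another string; the second performs the same check one character at a time.

```cpp
#include <string>

// Uses std::string::compare: compares the first prefix.size() characters
// of s against prefix. If s is shorter than prefix, the substring is cut
// off at the end of s and the comparison reports a mismatch.
bool starts_with_compare(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// The same test written out by hand, one character at a time.
bool starts_with_loop(const std::string& s, const std::string& prefix) {
    if (prefix.size() > s.size())
        return false;  // s is too short to contain the prefix
    for (std::string::size_type i = 0; i < prefix.size(); ++i) {
        if (s[i] != prefix[i])
            return false;
    }
    return true;
}
```

Note that under either version an empty query string is a prefix of every entry.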


Files We Are Providing

All such files can be downloaded here.


Files to Submit


Grading Standards

The assignment is worth 10 points.


Michael Goldwasser
CSCI 180, Spring 2007
Last modified: Monday, 16 April 2007