Saint Louis University | Computer Science 145 | Dept. of Math & Computer Science
For this assignment, you are allowed to work with one other student if you wish (in fact, we suggest that you do so). If any student wishes to have a partner but has not been able to locate one, please let the instructor know so that we can match up partners. One member of the partnership should submit the final script, making sure that all members' names are included as part of comments at the beginning of the file.
Please make sure you adhere to the policies on academic integrity in this regard.
Topic: Stock Market
Related Reading: Ch. 4 for control structures and Ch. 6.3-6.4 for discussion of formatting output via sprintf or fprintf.
Due: Friday, 13 February 2009, 11:59pm
For this assignment, we will be revisiting the stock market data
introduced in the previous assignment. This time, rather than
plotting the data graphically, we will be performing various numerical
analyses.
As a reminder, the data (download here) has one row for each day
of market activity. Its columns are, in order:
Year | Month | Date | Daily Opening Price | Daily High Price | Daily Low Price | Daily Closing Price | Daily Volume |
---|---|---|---|---|---|---|---|
1928 | 10 | 01 | 239.43 | 242.46 | 238.24 | 240.01 | 3500000 |
1928 | 10 | 02 | 240.01 | 241.54 | 235.42 | 238.14 | 3850000 |
... | ... | ... | ... | ... | ... | ... | ... |
2009 | 01 | 29 | 8373.06 | 8373.06 | 8092.14 | 8149.01 | 5067060000 |
2009 | 01 | 30 | 8149.01 | 8243.95 | 7924.88 | 8000.86 | 5350580000 |
Please be aware of the following issues:
For this assignment, you should produce a single script named dow.m that behaves as follows. It should load the raw data and then repeatedly offer the user a menu of possible analyses, until the user chooses to quit. The choices should be
In the remainder of this assignment description, we describe the requirements for the four types of analyses.
Consider each calendar month in the data set (e.g., September 2007). For a particular month, take the ratio between the closing price on the last market day of the month and the opening price on the first market day of the month. The gain is computed as that ratio, minus one, then multiplied by 100. You are to determine the calendar month with the highest percentage gain on record. Output your results using a format such as
    Opening price on 2007/09/04 was 13358.39
    Closing price on 2007/09/28 was 13895.63
    That is a gain of 4.02%.

(but this is not actually the best month)
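The gain computation can be illustrated on a tiny sample. This is only a minimal sketch, assuming the data sits in a matrix named `data` (a hypothetical name) with the columns in the order listed earlier. Only the two September 2007 prices come from the example output; every other number is made up.

```matlab
% Hypothetical sample: columns are year, month, date, open, high, low, close, volume.
% Only the 2007/09 opening and closing prices come from the example above;
% every other number is made up for illustration.
data = [2007  9  4 13358.39 13400.00 13300.00 13380.00 1;
        2007  9 28 13900.00 13950.00 13850.00 13895.63 1;
        2007 10  1 13895.63 14100.00 13880.00 14087.55 1;
        2007 10 31 13790.00 13950.00 13700.00 13930.01 1];

bestGain = -Inf;
first = 1;                         % row of the first market day of the current month
for k = 1:size(data, 1)
    % A month ends at the last row, or when the next row has a different year or month.
    lastOfMonth = (k == size(data, 1)) || any(data(k+1, 1:2) ~= data(k, 1:2));
    if lastOfMonth
        gain = (data(k, 7) / data(first, 4) - 1) * 100;
        if gain > bestGain
            bestGain = gain;
            bestFirst = first;
            bestLast = k;
        end
        first = k + 1;             % the next row starts a new month
    end
end

fprintf('Opening price on %04d/%02d/%02d was %.2f\n', data(bestFirst, 1:4));
fprintf('Closing price on %04d/%02d/%02d was %.2f\n', data(bestLast, [1:3 7]));
fprintf('That is a gain of %.2f%%.\n', bestGain);
```

Tracking the row of the first market day while scanning avoids searching for month boundaries separately. Note the doubled `%%` needed to print a literal percent sign with fprintf.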
Consider a day as being one with a gain if the closing price for the day is strictly greater than the opening price for that same day. You are to determine the longest streak of daily gains for the history of the market. Output your results using a format such as
    There were 5 consecutive days of gain starting on 2008/11/21 and continuing through 2008/11/28.

(but this is not actually the longest streak)
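One possible way to find a longest streak is a single pass that counts the length of the current run of gains. The sketch below uses made-up price vectors `openP` and `closeP` (hypothetical names) and reports positions rather than dates; it is not the only valid approach.

```matlab
% Hypothetical sample: opening and closing prices for eight consecutive market days.
% All values are made up for illustration.
openP  = [10 11 12 12 13 14 15 15];
closeP = [11 12 11 13 14 15 14 16];

bestLen = 0;      % length of the longest streak seen so far
bestStart = 0;    % index of the first day of that streak
curLen = 0;       % length of the streak ending at the current day
for k = 1:numel(openP)
    if closeP(k) > openP(k)        % strict inequality, as the definition requires
        curLen = curLen + 1;
        if curLen > bestLen
            bestLen = curLen;
            bestStart = k - curLen + 1;
        end
    else
        curLen = 0;                % a flat or losing day ends the streak
    end
end
bestEnd = bestStart + bestLen - 1;
fprintf('Longest streak: %d days, from day %d through day %d.\n', ...
        bestLen, bestStart, bestEnd);
```

With the real data set, the indices `bestStart` and `bestEnd` would be used to look up the year, month, and date columns for the output line.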
Today's stock market value is only about 50% of what it once was.
But this is not the worst such percentage drop in history. Determine
the largest such drop, namely a low point that is the smallest
percentage when compared to a preceding high point.
More formally, you are looking for the days a
and b with a < b chronologically so as to minimize
the ratio low(b)/high(a), that is, the low price on day b divided by the high price on day a.
    High price on 2007/10/11 was 14279.96.
    Low price on 2008/11/21 was 7392.27.
    That is 51.77% of the previous high.

(but this is not actually the largest percentage drop)
Make sure to consider the efficiency of your approach. Given the size of the data set, there are over 200 million pairs of the form a < b. You do not want to do the computation in a brute force manner over all such pairs. Think about how to streamline the search for the solution.
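One way to avoid the brute-force search rests on a simple observation: for a fixed day b, the ratio is smallest when the preceding high is as large as possible. So it is enough to remember the highest high seen so far while scanning the days once. The sketch below assumes made-up vectors `highP` and `lowP` (hypothetical names) and is one possible approach, not necessarily the intended one.

```matlab
% Hypothetical sample: daily high and low prices (made-up values).
highP = [100 120 110  90 130 125];
lowP  = [ 95 110  60  80 120  70];

bestRatio = Inf;
peakIdx = 1;                       % day with the highest high among days 1..b-1
for b = 2:numel(highP)
    ratio = lowP(b) / highP(peakIdx);
    if ratio < bestRatio
        bestRatio = ratio;
        bestA = peakIdx;
        bestB = b;
    end
    if highP(b) > highP(peakIdx)   % update only after day b has been used as a low
        peakIdx = b;
    end
end
fprintf('Day %d high %.2f, day %d low %.2f: %.2f%% of the previous high.\n', ...
        bestA, highP(bestA), bestB, lowP(bestB), 100 * bestRatio);
```

Updating the peak after the comparison keeps a strictly before b, and the whole search takes one pass over the data instead of examining every pair.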
In this problem, we examine some historical trends in the market. In particular, there is reason to believe that market activity is different on the very last market day of a year, or the very first market day of a year, because some people may time their trades due to tax considerations.
To test the hypothesis, we want to compute the geometric mean of the daily percentage change taken over three sample sets
The geometric mean is computed as follows. For each day in the sample, compute the daily percentage change, expressed as the ratio close(day)/open(day). Then compute the product of those ratios over all days in the sample and take the nth root of the product, where n is the number of days in the sample. Note that the nth root can be computed by raising the product to the (1/n) power.
Output your results using the format
    Day type        Geometric Mean
    -------------   --------------
    first of year   ?.??????
    last of year    ?.??????
    typical         ?.??????

Of course, you should replace the question marks with the actual answers. Show six significant digits in the result.
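The geometric mean and the table layout can be sketched on a three-day sample. The vectors `openP` and `closeP` and the row label `sample` are hypothetical, and the prices are made up. The `%.6f` specifier is one assumption about how to match the `?.??????` pattern.

```matlab
% Hypothetical sample: opening and closing prices for three days (made-up values).
openP  = [100 100 100];
closeP = [110 100  90];

ratios = closeP ./ openP;          % daily ratio close(day)/open(day)
n = numel(ratios);                 % number of days in the sample
gm = prod(ratios) ^ (1 / n);       % nth root of the product

fprintf('Day type        Geometric Mean\n');
fprintf('-------------   --------------\n');
fprintf('%-13s   %.6f\n', 'sample', gm);
```

The `%-13s` specifier left-justifies the label in a 13-character field so that the numbers line up under the second column heading.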
The initial analysis of market trends shows that the first and last day of the year are slightly better than average. So the next interesting question is: what is the absolute best calendar date according to the historical data?
Add an extra option to the menu that does a daily analysis. This should internally compute the geometric mean for every possible calendar day (e.g., geometric mean for the set of all Jan. 1, geometric mean for the set of all Jan. 2). Then report which day produces the highest such mean using the form
Best historic date is MM/DD with geometric mean ?.??????.
Advice: One way to approach this problem is to compile statistics using 12-by-31 arrays that can be indexed by (month,date) pairs. Make a pass through the entire data set compiling the statistics for building the geometric means. The final challenge is to learn how max works with arrays, so that you can find out what the maximum geometric mean is and also what month and date it corresponds to (see help max for more information, or feel free to come ask me).
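The array-based approach might look like the following sketch. It assumes a small made-up matrix `sample` (a hypothetical name) holding only month, date, open, and close columns. The two outputs of max combined with ind2sub are one way to recover the month and date; calling max twice, once per dimension, would work as well.

```matlab
% Hypothetical sample: columns are month, date, open, close (made-up values).
sample = [ 1  2 100 101;
           1  2 100 103;
           7  4 100 102;
           7  4 100 104;
          12 31 100 100];

prodR = ones(12, 31);      % running product of close/open for each (month, date)
count = zeros(12, 31);     % number of market days seen for each (month, date)
for k = 1:size(sample, 1)
    m = sample(k, 1);
    d = sample(k, 2);
    prodR(m, d) = prodR(m, d) * sample(k, 4) / sample(k, 3);
    count(m, d) = count(m, d) + 1;
end

gm = -Inf(12, 31);                          % dates with no data can never win
seen = count > 0;
gm(seen) = prodR(seen) .^ (1 ./ count(seen));

[bestGm, idx] = max(gm(:));                 % maximum over all entries, with its linear index
[bestMonth, bestDate] = ind2sub(size(gm), idx);
fprintf('Best historic date is %02d/%02d with geometric mean %.6f.\n', ...
        bestMonth, bestDate, bestGm);
```

Initializing `gm` to -Inf matters because entries such as (2,30) never occur in the data; leaving them at 1 could let a nonexistent date win when every real mean is below 1.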
A second approach would be to use loops to iterate over all possible calendar days, and then for a specific one to pull out all entries of the data set that match that MM/DD, computing the geometric mean for those entries, and then tracking the highest such mean as you go. For this approach, the find function can be useful.
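The loop-based alternative can be sketched in the same spirit, again on a made-up matrix `sample` (a hypothetical name) with month, date, open, and close columns.

```matlab
% Hypothetical sample: columns are month, date, open, close (made-up values).
sample = [1 2 100 101;
          7 4 100 102;
          1 2 100 103;
          7 4 100 104];

bestGm = -Inf;
for m = 1:12
    for d = 1:31
        % Rows of the sample that fall on calendar day m/d.
        rows = find(sample(:, 1) == m & sample(:, 2) == d);
        if isempty(rows)
            continue;              % no market day in the sample falls on this date
        end
        ratios = sample(rows, 4) ./ sample(rows, 3);
        gm = prod(ratios) ^ (1 / numel(rows));
        if gm > bestGm
            bestGm = gm;
            bestMonth = m;
            bestDate = d;
        end
    end
end
fprintf('Best historic date is %02d/%02d with geometric mean %.6f.\n', ...
        bestMonth, bestDate, bestGm);
```

This version rescans the data once per calendar day, so it does more work than the single-pass array approach, but it needs no extra bookkeeping arrays.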
Good luck!