Assignment 04 - Stock Market


Collaboration Policy

For this assignment, you are allowed to work with one other student if you wish (in fact, we suggest that you do so). If any student wishes to have a partner but has not been able to locate one, please let the instructor know so that we can match up partners. One member of the partnership should submit the final script, making sure that all members' names are included as part of comments at the beginning of the file. As usual, please submit your solution to dferry_submit@slu.edu.

Please make sure you adhere to the policies on academic integrity in this regard.


Overview

Topic: Stock Market
Related Reading: Ch. 6 for control structures and Ch. 4 for discussion of formatting output via sprintf or fprintf.

For this assignment, we will be revisiting the stock market data introduced in the previous assignment. This time, rather than plotting the data graphically, we will be performing various numerical analyses. As a reminder, the data (download here) has one row for each day of market activity. It is arranged in columns that designate

Year  Month  Date  Daily Opening Price  Daily High Price  Daily Low Price  Daily Closing Price  Daily Volume
----  -----  ----  -------------------  ----------------  ---------------  -------------------  ------------
1928  10     01    239.43               242.46            238.24           240.01               3500000
1928  10     02    240.01               241.54            235.42           238.14               3850000
 ...
2009  01     29    8373.06              8373.06           8092.14          8149.01              5067060000
2009  01     30    8149.01              8243.95           7924.88          8000.86              5350580000
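
As a rough illustration of working with this layout, the sketch below builds a small matrix from the four sample rows listed above and pulls out individual columns by position. The column-index names (YEAR, OPEN, and so on) are our own labels for illustration, and the real script would obtain the full matrix from the downloaded file rather than typing rows in.

```matlab
% Four sample rows copied from the listing above; the real script would
% load the full matrix from the downloaded data file instead.
data = [1928 10  1  239.43  242.46  238.24  240.01    3500000;
        1928 10  2  240.01  241.54  235.42  238.14    3850000;
        2009  1 29 8373.06 8373.06 8092.14 8149.01 5067060000;
        2009  1 30 8149.01 8243.95 7924.88 8000.86 5350580000];

% Illustrative names for the column positions (not part of the data file).
YEAR = 1; MONTH = 2; DATE = 3; OPEN = 4; HIGH = 5; LOW = 6; CLOSE = 7; VOLUME = 8;

opens   = data(:, OPEN);     % one opening price per market day
closes  = data(:, CLOSE);    % one closing price per market day
numDays = size(data, 1);     % number of rows, i.e. market days

fprintf('%d days loaded; first open %.2f, last close %.2f\n', ...
        numDays, opens(1), closes(end));
```

Naming the column positions once keeps later expressions such as data(:, CLOSE) readable.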

Please be aware of the following issues:


Problems to be Submitted (20 points)

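For this assignment, you should produce a single script named dow.m that behaves as follows. It should load the raw data and then repeatedly offer the user a menu of possible analyses, until the user chooses to quit. The choices should be the four analyses described below (Best Single Month, Most Consecutive Days with Gain, Biggest Historical Downturn, and Market Trends), together with an option to quit.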

You should use the built-in function menu to handle the user I/O for selecting from this menu. It is called using the syntax
   userSelection = menu(title, choice1, choice2, ...);
where the parameters are strings to be displayed and the returned value is the index of the selected choice. Type help menu for documentation.
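
One possible shape for the menu loop is sketched below. The button labels are assumptions taken from the section headings of this handout, and the title string shown in the comment is made up. So that the sketch runs unattended, a fixed sequence of selections (the hypothetical variable simulatedClicks) stands in for the call to menu; in dow.m that line would be replaced by the real call shown in the comment.

```matlab
% Assumed labels, taken from the section headings; the last one ends the loop.
labels = {'Best Single Month', 'Most Consecutive Days with Gain', ...
          'Biggest Historical Downturn', 'Market Trends', 'Quit'};
quitChoice = numel(labels);

% In dow.m each pass through the loop would ask the user:
%     selection = menu('Dow analysis', labels{:});
% Here a fixed click sequence stands in for the user (hypothetical).
simulatedClicks = [2 4 quitChoice];

handled   = {};     % record of which analyses were requested
selection = 0;
k = 0;
while selection ~= quitChoice
    k = k + 1;
    selection = simulatedClicks(k);
    if selection >= 1 && selection < quitChoice
        % The real script would run the chosen analysis at this point.
        fprintf('Running: %s\n', labels{selection});
        handled{end+1} = labels{selection};
    end
end
```

Passing labels{:} expands the cell array into separate string arguments, which matches the menu(title, choice1, choice2, ...) syntax above.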

In the remainder of this assignment description, we describe the requirements for the four types of analyses.

Best Single Month (% gain)

Consider each calendar month in the data set (e.g., September 2007). For a particular month, take the ratio between the closing price on the last market day of the month and the opening price on the first market day of the month. The gain is computed as that ratio, minus one, then multiplied by 100. You are to determine the calendar month with the highest percentage gain on record. Output your results using a format such as

Opening price on 2007/09/04 was 13358.39
Closing price on 2007/09/28 was 13895.63
That is a gain of 4.02%.
(but this is not actually the best month)
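
The month-by-month computation can be sketched as follows, on a tiny made-up data set. Only the two September 2007 prices quoted in the sample output above are real; every other number is invented for illustration, and the five-column layout is a simplification of the real eight-column data. The sketch assumes the rows are in chronological order, so that a month ends wherever the month column changes.

```matlab
% Hypothetical rows: [year month date open close]. Only the September 2007
% opening price (first day) and closing price (last day) come from the handout.
days = [2007 8 30 13200.00 13250.00;
        2007 8 31 13250.00 13300.00;
        2007 9  4 13358.39 13400.00;
        2007 9  5 13400.00 13350.00;
        2007 9 28 13900.00 13895.63];

% A month ends where the next row has a different month, and at the final row.
monthCol = days(:, 2);
isLast   = [monthCol(1:end-1) ~= monthCol(2:end); true];
isFirst  = [true; isLast(1:end-1)];   % the row after a month's last day starts a month

firstRows = find(isFirst);
lastRows  = find(isLast);

% Percentage gain per month: (last close / first open - 1) * 100.
gains = (days(lastRows, 5) ./ days(firstRows, 4) - 1) * 100;
[bestGain, which] = max(gains);
f = firstRows(which);
l = lastRows(which);

fprintf('Opening price on %04d/%02d/%02d was %.2f\n', days(f,1), days(f,2), days(f,3), days(f,4));
fprintf('Closing price on %04d/%02d/%02d was %.2f\n', days(l,1), days(l,2), days(l,3), days(l,5));
fprintf('That is a gain of %.2f%%.\n', bestGain);
```

Note the doubled %% in the last format string, which is how fprintf prints a literal percent sign.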

Most Consecutive Days with Gain

Consider a day as having gain if the closing price for the day is strictly greater than the opening price for that same day. You are to determine the longest streak of daily gains for the history of the market. Output your results using a format such as

There were 5 consecutive days of gain
starting on 2008/11/21 and continuing through 2008/11/28.
(but this is not actually the longest streak)
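
One way to track the longest streak in a single pass is sketched below, using invented prices for eight days. The variable names are our own, and the sketch reports row numbers rather than dates to stay short.

```matlab
% Hypothetical opening and closing prices for eight consecutive market days.
opens  = [10 11 12 12 13 14 15 15]';
closes = [11 12 11 13 14 15 16 14]';

gainDay = closes > opens;    % strictly greater, as the definition requires

bestLen   = 0;   % length of the longest streak found so far
bestStart = 0;   % row on which that streak begins
streak    = 0;   % length of the streak ending at the current row
for k = 1:numel(gainDay)
    if gainDay(k)
        streak = streak + 1;
        if streak > bestLen
            bestLen   = streak;
            bestStart = k - streak + 1;
        end
    else
        streak = 0;          % a day without gain breaks the streak
    end
end
bestEnd = bestStart + bestLen - 1;

fprintf('There were %d consecutive days of gain\n', bestLen);
fprintf('starting on row %d and continuing through row %d.\n', bestStart, bestEnd);
```

Because the comparison is streak > bestLen rather than >=, the earliest streak wins when two streaks tie for longest; the handout does not say which should be reported in that case.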

Biggest Historical Downturn (% drop)

After the housing crisis and market crash, the value of the stock market dropped to about 50% of what it previously was. But this is not the worst such percentage drop in history. Determine the largest such drop, namely a low point that is the smallest percentage when compared to a preceding high point. More formally, you are looking for the days a and b with a < b chronologically so as to minimize the ratio low(b) / high(a). Output your results using a format such as

High price on 2007/10/11 was 14279.96.
Low price on 2008/11/21 was 7392.27.
That is 51.77% of the previous high.
(but this is not actually the largest percentage drop)

Make sure to consider the efficiency of your approach. Given the size of the data set, there are over 200 million pairs of the form a < b. You do not want to do the computation in a brute force manner over all such pairs. Think about how to streamline the search for the solution.
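
As one possible way to avoid the brute-force search (a sketch, not necessarily the intended solution): for any fixed day b, the best partner a is simply the day with the highest high among all earlier days, so it suffices to remember that peak while sweeping through the data once. The prices below are invented for illustration.

```matlab
% Hypothetical daily high and low prices for seven consecutive market days.
highs = [100 120 110 130 125 90 95]';
lows  = [ 95 110 100 120  60 80 85]';

bestRatio = Inf;   % smallest low(b)/high(a) found so far
bestA = 0;
bestB = 0;
peakDay = 1;       % day with the highest high among days 1..b-1
for b = 2:numel(lows)
    ratio = lows(b) / highs(peakDay);
    if ratio < bestRatio
        bestRatio = ratio;
        bestA = peakDay;
        bestB = b;
    end
    if highs(b) > highs(peakDay)
        peakDay = b;   % day b may serve as the peak for later days
    end
end

fprintf('High on day %d was %.2f.\n', bestA, highs(bestA));
fprintf('Low on day %d was %.2f.\n', bestB, lows(bestB));
fprintf('That is %.2f%% of the previous high.\n', 100 * bestRatio);
```

The peak is updated only after day b has been evaluated, which keeps a strictly before b. The loop does a constant amount of work per day instead of examining every pair.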

Market Trends

In this problem, we examine some historical trends in the market. In particular, there is reason to believe that market activity is different on the very last market day of a year, or the very first market day of a year, because some people may time their trades due to tax considerations.

To test the hypothesis, we want to compute the geometric mean of the daily percentage change taken over three sample sets

The geometric mean is computed as follows. For each day in the sample, compute the daily change as the ratio close(day)/open(day). For all days in a given sample, compute the product of those ratios and then take the nth root of the product, where n is the number of days in the sample. Note that the nth root can be computed by raising the product to the (1/n) power.
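
A small worked example of this computation, on four invented days, might look like the following. The second line of the calculation shows an equivalent form using logarithms; that alternative is our own addition rather than part of the assignment, and it gives the same value without forming one long product.

```matlab
% Hypothetical opening and closing prices for a sample of four days.
opens  = [100 102 101 105]';
closes = [102 101 105 104]';

ratios = closes ./ opens;    % one close/open ratio per day
n = numel(ratios);           % number of days in the sample

gmProduct = prod(ratios) ^ (1/n);      % nth root of the product, as described
gmLog     = exp(mean(log(ratios)));    % same value computed via logarithms

fprintf('Geometric mean: %.6f\n', gmProduct);
```

In this sample the ratios telescope to a product of 104/100, so the geometric mean is the fourth root of 1.04.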

Output your results using the format

Day type          Geometric Mean
-------------     --------------
first of year         ?.??????
last of year          ?.??????
typical               ?.??????
Of course, you should replace the question marks with the actual answers. Show six significant digits in the result.
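
The table layout can be produced with fprintf along these lines. The three values are placeholders and not actual results, and the format assumes that the six digits after the decimal point shown in the template are what is wanted.

```matlab
names = {'first of year', 'last of year', 'typical'};
means = [1.001234 1.002345 1.000123];    % placeholder values, not actual results

fprintf('Day type          Geometric Mean\n');
fprintf('-------------     --------------\n');
for k = 1:numel(names)
    % %-22s left-justifies the name in 22 columns; %.6f prints six decimals.
    fprintf('%-22s%.6f\n', names{k}, means(k));
end
```

The width of 22 was chosen by counting characters in the template row, so that each value starts in the same column as the question marks.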

Extra Credit (2 points)

The initial analysis of market trends shows that the first and last day of the year are slightly better than average. So the next interesting question is: what is the absolute best calendar date according to the historical data?

Add an extra option to the menu that does a daily analysis. This should internally compute the geometric mean for every possible calendar day (e.g., geometric mean for the set of all Jan. 1, geometric mean for the set of all Jan. 2). Then report which day produces the highest such mean using the form

Best historic date is MM/DD with geometric mean ?.??????.

Advice: One way to approach this problem is to compile statistics using 12-by-31 arrays that can be indexed by (month,date) pairs. Make a pass through the entire data set compiling the statistics for building the geometric means. The final challenge is to learn how max works with arrays, so that you can find out what the maximum geometric mean is and also what month and date it corresponds to (see help max for more information, or feel free to come ask me).
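
A minimal sketch of the array-based approach follows, on four invented rows. Two choices here are our own assumptions: the statistics are kept as sums of logarithms (equivalent to a running product, but swapped in for convenience), and calendar days with no data are marked NaN so that max skips them.

```matlab
% Hypothetical rows: [month date open close].
days = [1 2 100 101;
        1 2 100 103;
        7 4 100 110;
        7 4 100  90];

logSum = zeros(12, 31);   % sum of log(close/open) per calendar day
counts = zeros(12, 31);   % number of market days seen per calendar day
for k = 1:size(days, 1)
    m = days(k, 1);
    d = days(k, 2);
    logSum(m, d) = logSum(m, d) + log(days(k, 4) / days(k, 3));
    counts(m, d) = counts(m, d) + 1;
end

gm = NaN(12, 31);         % NaN marks calendar days with no data
hasData = counts > 0;
gm(hasData) = exp(logSum(hasData) ./ counts(hasData));

% max over the flattened array returns a linear index; ind2sub converts it
% back to a (month, date) pair. max ignores NaN entries.
[bestMean, linearIdx] = max(gm(:));
[bestMonth, bestDate] = ind2sub(size(gm), linearIdx);

fprintf('Best historic date is %02d/%02d with geometric mean %.6f.\n', ...
        bestMonth, bestDate, bestMean);
```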

A second approach would be to use loops to iterate over all possible calendar days, and then for a specific one to pull out all entries of the data set that match that MM/DD, computing the geometric mean for those entries, and then tracking the highest such mean as you go. For this approach, the find function can be useful.
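
The loop-based approach could be sketched as below, again on invented rows with a simplified four-column layout. Calendar days with no matching entries are skipped.

```matlab
% Hypothetical rows: [month date open close].
days = [ 3 15 100 102;
         3 15 100 104;
        11  9 100 101;
        11  9 100  99];

bestMean  = -Inf;
bestMonth = 0;
bestDate  = 0;
for m = 1:12
    for d = 1:31
        % Row numbers of all entries that fall on calendar day m/d.
        rows = find(days(:, 1) == m & days(:, 2) == d);
        if isempty(rows)
            continue;     % no market activity recorded for this calendar day
        end
        ratios = days(rows, 4) ./ days(rows, 3);
        gm = prod(ratios) ^ (1 / numel(rows));
        if gm > bestMean
            bestMean  = gm;
            bestMonth = m;
            bestDate  = d;
        end
    end
end

fprintf('Best historic date is %02d/%02d with geometric mean %.6f.\n', ...
        bestMonth, bestDate, bestMean);
```

This version rescans the whole data set for each of the 372 (month, date) combinations, which is simple to write but does more work than a single pass.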

Good luck!


Originally by
Michael Goldwasser