Saint Louis University
Computer Science 150
Dept. of Math & Computer Science
In class, we gave out fully functional code which computed a frequency count for each letter of the alphabet given an original document as the source. The output of that program gave a table with two columns, the first being the letter of the alphabet and the second being the frequency count. That table was sorted based upon the letter of the alphabet (i.e., from 'a' to 'z').
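The in-class program is not reproduced here. As a rough sketch of the kind of tally and table it describes, assuming a list named `tally` with one count per letter (the function name `count_letters` and the sample text are hypothetical, not taken from the handout):

```python
import string

def count_letters(text):
    """Return a list of 26 counts: index 0 for 'a' through index 25 for 'z'."""
    tally = [0] * 26
    for ch in text.lower():
        if ch in string.ascii_lowercase:      # skip spaces, digits, punctuation
            tally[ord(ch) - ord('a')] += 1
    return tally

# Hypothetical sample document, used only for illustration.
tally = count_letters("Hello, World!")

# Two-column table in alphabetical order, as in the original output.
for i in range(26):
    letter = chr(ord('a') + i)
    print(letter, tally[i])
```

Here the letter is never stored; it is implied by the position of each count in the list.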
For this assignment, you are to alter that program so that the output lists the results from the most frequent to the least frequent character (i.e., sorted according to the second column).
No real new techniques here, just an exercise of existing techniques.
For this assignment you must work individually in regard to the design and implementation of your project.
Please make sure you adhere to the policies on academic integrity in this regard.
Though such a cosmetic change in the output might seem like a trivial task, the reason this is an interesting assignment is that no perfect tool exists in our ready-to-use toolbox. Therefore, you will either need to find a way to adapt some other tool to get this to work, or else you could simply scrap the toolbox and write the low-level flow of control yourself.
Looking at the original version, the output order was based on performing a for loop over the list of tallies. Those individual tallies were ordered based upon the implicit ordering of the alphabet characters. You could use tally.sort() and tally.reverse() to get the frequencies ordered, but now you would face the problem that you no longer know which alphabet symbol is associated with each count.
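To see the difficulty concretely, here is a small sketch using a hypothetical three-letter tally (the values are made up for illustration). After sorting and reversing, the counts are in the desired order, but their positions no longer correspond to letters:

```python
# Hypothetical counts for 'a', 'b', 'c' (index 0 is 'a', index 1 is 'b', ...).
tally = [5, 9, 2]

ordered = list(tally)   # work on a copy so the original pairing stays visible
ordered.sort()          # ascending:  [2, 5, 9]
ordered.reverse()       # descending: [9, 5, 2]

print(ordered)          # [9, 5, 2]

# Index 0 of 'ordered' now holds 9, which was the count for 'b', not 'a'.
# Nothing in the sorted list records which letter each count came from.
print(ordered[0] == tally[1])   # True
```

Calling `tally.sort()` directly would have the same effect on the original list, with the added cost that the alphabetical ordering is gone as well.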
You should create a new file, appropriately named, which contains all of your own code. This file must be submitted electronically.
You should also submit a separate 'readme' text file, as outlined in the general webpage on programming assignments.
Please see details regarding the submission process from the general programming web page, as well as a discussion of the late policy.
Add a third column to the output displaying the frequency of the letter usage as a percentage. Please make sure that it is reported as a percentage of the alphabetic characters only (as opposed to an overall percentage in the original document, which would include spacing and punctuation).
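As a sketch of the arithmetic only, assuming a list of letter counts like the `tally` discussed earlier (the helper name `letter_percentages` and the sample counts are hypothetical): each percentage is a letter's count divided by the sum of all letter counts, so spaces and punctuation never enter the denominator.

```python
def letter_percentages(tally):
    """Convert letter counts to percentages of all alphabetic characters."""
    total = sum(tally)              # letters only, since only letters were tallied
    if total == 0:
        return [0.0] * len(tally)   # avoid dividing by zero on letter-free input
    return [100.0 * count / total for count in tally]

# Hypothetical counts for three letters: 5 + 10 + 5 = 20 letters in all.
percents = letter_percentages([5, 10, 5])
print(['%.1f%%' % p for p in percents])   # ['25.0%', '50.0%', '25.0%']
```

The guard for an empty tally is a design choice for this sketch; a document with no letters would otherwise raise a `ZeroDivisionError`.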