Hands-on Day


All of these problems are for practice. You are encouraged to work together, and you do not need to submit any of this work for a grade.


  1. Assume that price is a dictionary mapping string descriptors to integer values (e.g., price['apple'] = 75), and assume that groceries is a list of strings representing products (e.g., ['apple', 'apple', 'milk']).

    Give Python code that computes a variable total that represents the total cost of the groceries, according to their prices.

    Data set: sample

    Spoiler: my code

  2. Given a collection of values, the statistical mode of the collection is a value that occurs at least as often as any other value. (In case of a tie, we will consider any such value to be a mode.)

    Write a function, mode(collection), which returns the mode of a given collection. (You may assume the elements of the collection belong to an immutable type.)

    Spoiler: my code

  3. Assume that data is a list of integers. Write a function pairsum(data, n) that returns a pair of values from the given data that sum precisely to n, if such a pair can be found, and otherwise returns None.

    As a test, see if your program can find the following Pythagorean triples (well, the squares of the triples) by calling

          pairsum([k*k for k in range(1,100)], 9409)            # should return (4225, 5184)
          pairsum([k*k for k in range(1,1000)], 998001)         # should return (104976, 893025)
          pairsum([k*k for k in range(1,10000)], 99980001)      # should return (3920400, 96059601)
          pairsum([k*k for k in range(1,100000)], 9999800001)   # should return (481846401, 9517953600)

    Note: The key to being efficient enough for the larger sets is that you can search for such a satisfying pair in time that is linear in the size of the data, rather than quadratic. In particular, rather than testing every possible pair of numbers to build the sum, you should put all of the original values into a dictionary (actually, you can simply use a set if you prefer), and then for each val in the data, efficiently test whether n - val is also in the data.

    Spoiler: my code
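
The linear-time idea from the note in problem 3 can be sketched as follows. This is only one possible approach, not necessarily the spoiler solution: instead of loading every value into a set up front, it fills the set while scanning, so a value is never paired with itself unless it really occurs twice in the data. The name seen is chosen here for illustration.

```python
def pairsum(data, n):
    """Return a pair (a, b) of values from data with a + b == n, or None."""
    seen = set()                      # values encountered so far
    for val in data:
        complement = n - val
        if complement in seen:        # set membership takes expected constant time
            return (complement, val)
        seen.add(val)
    return None                       # no pair sums to n
```

With this sketch, pairsum([k*k for k in range(1,100)], 9409) returns (4225, 5184). In general the order of the two values in the returned pair depends on the order of the data.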

File Processing

For the rest of the problems, we work with the following lyrics that can be read from the file money.txt. Recall that you can get a list of lines of this file using the syntax open('money.txt').readlines().
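
As a minimal sketch of the file-reading step, the helper below wraps the readlines() call in a with statement so that the file is closed once it has been read. The function name load_lines is hypothetical, and the sketch assumes the file sits in the current working directory.

```python
def load_lines(filename):
    """Return the lines of the named file as a list of strings."""
    with open(filename) as source:    # the file is closed when the block ends
        return source.readlines()     # each line keeps its trailing newline
```

A call such as load_lines('money.txt') then plays the same role as open('money.txt').readlines(). Note that each returned line still ends with its newline character, which you may want to strip before splitting a line into words.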

  1. Compute the character frequency of the (lowercased) text, and output a list sorted from most frequent to least frequent.

    Assume you have computed a dictionary that maps each letter to its frequency. If you build a list of tuples of the form (freq,letter), you can sort that list and then reverse it to get the results sorted from most frequent to least frequent.

    Here is the beginning of the results I get:

    ' ' 121
    'e' 83
    'o' 59
    'n' 49
    't' 45

    Spoiler: my code

  2. Output the lyrics, but replacing each of the numeric words (e.g. "one") with the corresponding numeral (e.g. "1"). You may use the following dictionary for translations:

    numeral = {'one' : 1, 'two' : 2, 'three' : 3, 'four' : 4, 'five' : 5,
               'six' : 6, 'seven' : 7, 'eight' : 8, 'nine' : 9, 'ten' : 10}

    Spoiler: my code

  3. Create a dictionary, lines, such that lines[word] is a list of all distinct line numbers on which that word appears (in either upper or lower case).

    For example, you should find that
    lines['nowhere'] is [14, 15, 17] and
    lines['the'] is [3, 7, 11, 12, 19, 21].

    Spoiler: my code
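
For the character-frequency problem at the start of this section, the (freq,letter) technique can be sketched as follows. This is a hedged illustration rather than the spoiler solution; the function name char_frequencies is hypothetical, and the sketch counts every character it sees, including spaces and newlines.

```python
def char_frequencies(text):
    """Return (freq, character) tuples, most frequent first."""
    counts = {}
    for ch in text.lower():                   # fold upper and lower case together
        counts[ch] = counts.get(ch, 0) + 1
    pairs = [(freq, ch) for ch, freq in counts.items()]
    pairs.sort()                              # ascending by frequency, then character
    pairs.reverse()                           # most frequent first
    return pairs
```

Because tuples compare element by element, sorting orders the list by frequency first and uses the character only to break ties. Each tuple can then be printed in the style shown for that problem with print(repr(ch), freq).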

