Hands-on Day

Quiz: Files and String Formatting

Overview

Today, we will do a 50-minute challenge quiz.

We examine a data set which shows populations for each state, broken down by reported race as of the 2010 census. The original source data can be found at census.gov; we have saved the data for use in this lab as a comma-separated values file:

raceByState.csv

This file uses the same format as the recent homework question, except that an additional line has been added at the beginning of the file that provides text labels for the columns.
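As a rough illustration of handling the extra label line, here is a minimal sketch. It assumes (this is not confirmed by the handout) that the labels match the report columns and that each later row holds a state name followed by integer counts. The SAMPLE text is a hypothetical stand-in for raceByState.csv, and the function name read_rows is purely illustrative.

```python
import csv
import io

# Hypothetical stand-in for raceByState.csv; the real column layout may differ.
SAMPLE = """State,Total,White,Black,Native,Asian,Pacific,Multirace
Alabama,4779736,3362877,1259224,32903,55240,5208,64284
Alaska,710231,483873,24441,106268,38882,7662,49105
"""

def read_rows(source):
    """Return (labels, rows); each row is [state_name, int, int, ...]."""
    reader = csv.reader(source)
    labels = next(reader)          # the first line holds the column labels
    rows = []
    for fields in reader:
        if not fields:             # skip blank lines, if any
            continue
        rows.append([fields[0]] + [int(x) for x in fields[1:]])
    return labels, rows

labels, rows = read_rows(io.StringIO(SAMPLE))
print(labels[0], rows[0][0], rows[0][1])
```

With the real file, the same function could be called as read_rows(f) inside a with open('raceByState.csv', newline='') as f: block.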


Your Goal

You are to write a script that reads the raw data from raceByState.csv, analyzes it, and then produces a file report.txt that summarizes the data in the following form:

State                     Total      White              Black             Native              Asian            Pacific          Multirace
Alabama                 4779736    3362877 (70.4%)    1259224 (26.3%)      32903 ( 0.7%)      55240 ( 1.2%)       5208 ( 0.1%)      64284 ( 1.3%)
Alaska                   710231     483873 (68.1%)      24441 ( 3.4%)     106268 (15.0%)      38882 ( 5.5%)       7662 ( 1.1%)      49105 ( 6.9%)
Arizona                 6392017    5418483 (84.8%)     280905 ( 4.4%)     335278 ( 5.2%)     188456 ( 2.9%)      16112 ( 0.3%)     152783 ( 2.4%)
Arkansas                2915918    2342403 (80.3%)     454021 (15.6%)      26134 ( 0.9%)      37537 ( 1.3%)       6685 ( 0.2%)      49138 ( 1.7%)
...

The percentages shown are calculated relative to the total state population. Please note the following aspects about the report:


Advice

  1. Take advantage of loops! Not just for the obvious processing of 50 states, but also because there are six racial categories and you want to do the same things for each.

  2. Remember that when writing to a file, you do not need to compose and write an entire line at once! You can call the write() method many times (for example, once for each individual field of a line).

  3. A summary of styles for f-strings:

    left justified in fixed width {value:<9}
    right justified in fixed width {value:>9}
    control of floating-point digits {value:5.2f}

    (Here value stands for whatever expression you are formatting.)
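
The last two pieces of advice can be combined as in the following sketch, which builds one line field by field. It is only an illustration: the io.StringIO object stands in for a file opened for writing, the widths 20 and 9 are the ones mentioned under Additional Challenges, and the spacing between columns is not tuned to match the sample report exactly.

```python
import io

name = "Alaska"
total = 710231
count = 106268                    # one category's count for that state
percent = 100 * count / total     # share of the state's total population

out = io.StringIO()               # stands in for a file opened for writing
out.write(f"{name:<20}")          # left justified in 20 columns
out.write(f"{total:>9}")          # right justified in 9 columns
out.write(f"{count:>9} ({percent:4.1f}%)")   # count, then percentage to 1 decimal
out.write("\n")                   # end the line only after all fields are written

line = out.getvalue()
print(line, end="")

pi_text = f"{3.14159:5.2f}"       # 5 columns wide, 2 digits after the point
print(f"[{pi_text}]")
```

Because each field is written separately, the same write() call can sit inside a loop over the six categories rather than being repeated by hand.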

Submission


Additional Challenges

Still have time? We dictated the column widths of 20 and 9 by examining the actual data set to see what the longest state name was and the maximum width of a number (or column label). A better approach is to do a first pass through the data and determine the minimum width requirement for each individual column, and then do a second pass to generate the output using those widths.
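
One possible shape for the two-pass idea is sketched below. The labels and rows are hypothetical in-memory data (three columns only, with no percentages), and the helper name format_row and the two-space gap between columns are arbitrary choices made for the illustration. The key trick is that a width can itself be supplied by a variable, as in {value:>{width}}.

```python
# Hypothetical rows: a state name followed by integer counts.
labels = ["State", "Total", "White"]
rows = [
    ["Alabama", 4779736, 3362877],
    ["Alaska", 710231, 483873],
]

# First pass: the widest entry in each column, counting the label itself.
widths = [len(label) for label in labels]
for row in rows:
    for i, value in enumerate(row):
        widths[i] = max(widths[i], len(str(value)))

# Second pass: format every field using the width found for its column.
def format_row(values):
    parts = [f"{values[0]:<{widths[0]}}"]            # names left justified
    for value, width in zip(values[1:], widths[1:]):
        parts.append(f"{value:>{width}}")            # numbers right justified
    return "  ".join(parts)                          # two-space gap (arbitrary)

lines = [format_row(labels)] + [format_row(row) for row in rows]
print("\n".join(lines))
```

For the full report, the width of each category column would also need to account for the percentage text that follows each count.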


Michael Goldwasser
Last modified: Sunday, 22 December 2019