Hands-on Day

Quiz: Files and String Formatting

Overview

Today, we will do a 50-minute challenge quiz.

We examine a data set which shows populations for each state, broken down by reported race as of the 2010 census. The original source data can be found at census.gov; we have saved the data for use in this lab as a comma-separated values file:

raceByState.csv

This file uses the same format as the recent homework question, except that an additional line has been added at the beginning of the file that provides text labels for the columns.
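A minimal sketch of reading such a file with Python's csv module. The column layout is an assumption that the handout does not confirm: state name first, then the total, then the six category counts. The function name `read_rows` and the header labels in the sample are hypothetical; only the Alabama numbers come from the report excerpt on this page.

```python
import csv
import io

def read_rows(f):
    """Return (labels, rows) from an open CSV file whose first line is a header.

    Assumed layout (hypothetical): state, total, then one count per category.
    """
    reader = csv.reader(f)
    labels = next(reader)              # the added first line of column labels
    rows = []
    for record in reader:
        if not record:                 # skip blank lines
            continue
        state = record[0]
        counts = [int(x) for x in record[1:]]
        rows.append((state, counts))
    return labels, rows

# Stand-in for open("raceByState.csv"); the header labels here are illustrative.
sample = io.StringIO(
    "State,Total,White,Black,Native,Asian,Pacific,Multirace\n"
    "Alabama,4779736,3362877,1259224,32903,55240,5208,64284\n"
)
labels, rows = read_rows(sample)
```

With the real file, the same function could be called inside `with open("raceByState.csv", newline="") as f:`.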

Your Goal

You are to write a script that reads the raw data from raceByState.csv, analyzes it, and then produces a file report.txt that summarizes the data in the following form:

State                     Total      White              Black             Native              Asian            Pacific          Multirace
Alabama                 4779736    3362877 (70.4%)    1259224 (26.3%)      32903 ( 0.7%)      55240 ( 1.2%)       5208 ( 0.1%)      64284 ( 1.3%)
Alaska                   710231     483873 (68.1%)      24441 ( 3.4%)     106268 (15.0%)      38882 ( 5.5%)       7662 ( 1.1%)      49105 ( 6.9%)
Arizona                 6392017    5418483 (84.8%)     280905 ( 4.4%)     335278 ( 5.2%)     188456 ( 2.9%)      16112 ( 0.3%)     152783 ( 2.4%)
Arkansas                2915918    2342403 (80.3%)     454021 (15.6%)      26134 ( 0.9%)      37537 ( 1.3%)       6685 ( 0.2%)      49138 ( 1.7%)
...

The percentages shown are calculated relative to the total state population. Please note the following aspects of the report:

State names are left-justified using width 20 (because "District of Columbia" uses 20 characters).
The population counts are each right-justified using width 9.
The labels on the first line are appropriately justified to match the associated columns.
We have used two spaces to horizontally separate each category, and one space to separate the population for a category from its percentage.
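The layout rules above can be sketched as a single helper that builds one line of the report. This is only one possible approach; the name `format_row` and its parameters are hypothetical, and the percentage width of 4 with one decimal place is inferred from the sample output rather than stated in the rules.

```python
def format_row(state, total, counts):
    """Build one report line: name in width 20, each count in width 9."""
    line = f"{state:<20}  {total:>9}"
    for count in counts:               # same treatment for every category
        percent = 100 * count / total
        # two spaces before the category, one space before its percentage
        line += f"  {count:>9} ({percent:4.1f}%)"
    return line

row = format_row("Alabama", 4779736,
                 [3362877, 1259224, 32903, 55240, 5208, 64284])
```

For the Alabama figures shown in the sample report, this produces a line beginning with the left-justified name and ending with `64284 ( 1.3%)`.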

Advice

Take advantage of loops! Not just for the obvious processing of 50 states, but also because there are six racial categories and you want to do the same things for each.
Remember that when writing to a file, you do not need to compose and write an entire line at once! You can call the file's write() method many times (for example, once for each individual field of a line).
A summary of format styles for f-strings (the value to format goes before the colon):

left justified in fixed width { :<9}

right justified in fixed width { :>9}

control of floating-point digits { :5.2f}
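As a quick illustration of those three styles, together with the advice about calling write() several times per line, here is a small sketch. It writes to an in-memory io.StringIO buffer instead of a real file so that it runs anywhere; the variable names are illustrative, and the numbers are Alaska's total and one of its category counts from the sample report.

```python
import io

name = "Alaska"
count = 24441
percent = 100 * count / 710231

left = f"{name:<9}"          # left justified in width 9
right = f"{count:>9}"        # right justified in width 9
fixed = f"{percent:5.2f}"    # width 5, two digits after the decimal point

out = io.StringIO()          # stands in for open("report.txt", "w")
out.write(left)              # one field at a time...
out.write(right)
out.write(f" ({fixed}%)")
out.write("\n")              # ...and the newline only at the end of the line
```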

Submission

One member of the pair must submit your project electronically, placing it in the quiz15 folder of their git repository.
Submit source code titled makeReport.py.
Make sure that BOTH students' names are in comments at the top of the source code.
Submit a copy of the file report.txt that your script generates.

Additional Challenges

Still have time? We dictated the column widths of 20 and 9 by examining the actual data set to see what the longest state name was and the maximum width of a number (or column label). A better approach is to do a first pass through the data and determine the minimum width requirement for each individual column, and then do a second pass to generate the output using those widths.
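One way to sketch the two-pass idea, using a cut-down table with three columns. The helper name `column_widths` and the list-of-lists data layout are assumptions made for illustration, and percentages are left out to keep the example short.

```python
def column_widths(labels, rows):
    """First pass: each column needs the width of its longest label or value."""
    widths = [len(label) for label in labels]
    for row in rows:
        for i, value in enumerate(row):
            widths[i] = max(widths[i], len(str(value)))
    return widths

labels = ["State", "Total", "White"]
rows = [
    ["Alabama", 4779736, 3362877],
    ["Alaska", 710231, 483873],
]
widths = column_widths(labels, rows)

# Second pass: plug the computed widths into nested format fields.
lines = []
for row in rows:
    name = f"{row[0]:<{widths[0]}}"
    numbers = "".join(f"  {value:>{w}}" for value, w in zip(row[1:], widths[1:]))
    lines.append(name + numbers)
```

The nested braces, as in `{value:>{w}}`, let the width come from a variable instead of being hard-coded in the f-string.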

Michael Goldwasser

Last modified: Sunday, 22 December 2019
