Lists and the range function are introduced in sections 0.11 and 0.12 of Chapter 0.
Chapter 2 of the text introduces use of the range function to perform index-based for loops.
While strings are convenient for representing a sequence of characters, Python allows for representation of sequences of arbitrary types of data as well. The primary structure for such a sequence is a Python list.
Later, we will see that there are many ways to construct and manipulate lists. For today, we'll focus specifically on Python's range function, which can be used to create regular sequences of integers.
There are three forms of range.
The first version uses a single parameter. The syntax range(k) produces the sequence of integers

    0, 1, 2, ..., k-1
We will use range(k) often, because the integers from 0 to k-1 are precisely the indices of a list of k items.
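As a quick check of that claim, the following sketch builds a small list (the name bases and its contents are made up for illustration) and confirms that range(len(bases)) yields exactly its valid indices. In Python 3, range returns a lazy range object rather than a list, so list(...) is used here only to display the values.

```python
# Hypothetical example data; any list of k items would do.
bases = ['A', 'C', 'G', 'T']
k = len(bases)

# list(...) materializes the values so they can be inspected.
indices = list(range(k))
print(indices)          # [0, 1, 2, 3]

# Every value in range(k) is a valid index into the list.
for i in range(k):
    print(i, bases[i])
```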
The second version uses two parameters, which give a starting value and a stopping value for the range. Specifically, a syntax such as range(j, k) produces the sequence of integers

    j, j+1, ..., k-1
The third version uses three parameters, with the third being the step size for the sequence. For example, we could get some even numbers with range(0, 10, 2), which produces

    0, 2, 4, 6, 8
A negative step size can be used to get a decreasing sequence, such as range(10, 5, -1), which produces the sequence

    10, 9, 8, 7, 6
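The three forms can be compared side by side in a short sketch. The variable names are chosen only for this illustration, and each call is wrapped in list(...) so that the generated values are visible when printed.

```python
# Wrapping each call in list(...) shows the values it generates.
one_arg = list(range(5))            # start defaults to 0
two_args = list(range(3, 8))        # from 3 up to, but not including, 8
evens = list(range(0, 10, 2))       # step of 2
countdown = list(range(10, 5, -1))  # negative step counts down

print(one_arg)    # [0, 1, 2, 3, 4]
print(two_args)   # [3, 4, 5, 6, 7]
print(evens)      # [0, 2, 4, 6, 8]
print(countdown)  # [10, 9, 8, 7, 6]
```

In every form the stopping value itself is excluded, which is why range(10, 5, -1) ends at 6 rather than 5.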
You should notice a great similarity between the parameters of a range and the parameters used to describe a slice of a string, although the syntax is different (commas separate the parameters of a range, while colons separate the arguments of a slice).
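One way to see the parallel is to apply the same start, stop, and step to both a slice and a range. The sample string and the variable names below are assumptions made for this sketch.

```python
# Hypothetical sample string, chosen only for illustration.
dna = 'ACGTACGTAC'

start, stop, step = 1, 8, 2

# A slice selects the characters directly...
sliced = dna[start:stop:step]

# ...while range yields the indices of those same characters.
picked = ''.join(dna[i] for i in range(start, stop, step))

print(sliced)   # CTCT
print(picked)   # CTCT
```

The two agree here because all three values are non-negative and within the bounds of the string. Slices also accept omitted or negative positions, which range does not interpret in the same way.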
The reason the range function is so important in Python is that it allows for a technique known as an index-based loop.
We have already seen a for loop to iterate through the characters of a string. While this is quite intuitive, a problem is that when you are in one pass of such a loop, you have a name for the current element but you do not have any context for where that element is relative to others.
Just as a loop can iterate over the characters of a string, it can iterate over the elements of any list. When we use range to generate a sequence of integers, we can intentionally define that range to match the indices of some other sequence. As a simple example, a direct for loop on a string might appear as

    for base in dna:
        print(base)

A corresponding index-based loop instead defines a loop variable that is an integer index, which can then be used to index into the original sequence. Behavior equivalent to the previous loop can be expressed as

    for k in range(len(dna)):
        print(dna[k])

For example, if the original dna string had length 5, then len(dna) is 5 and thus range(len(dna)) produces the sequence

    0, 1, 2, 3, 4
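To make the equivalence concrete, this sketch runs both styles of loop on a made-up five-character string and records what each one sees. The sample value and the list names direct and indexed are assumptions for illustration.

```python
dna = 'GATTA'   # hypothetical sample of length 5

# Direct loop: each pass sees only the element itself.
direct = []
for base in dna:
    direct.append(base)

# Index-based loop: each pass sees the index and can look up the element.
indexed = []
for k in range(len(dna)):
    indexed.append((k, dna[k]))

print(direct)   # ['G', 'A', 'T', 'T', 'A']
print(indexed)  # [(0, 'G'), (1, 'A'), (2, 'T'), (3, 'T'), (4, 'A')]
```

Both loops visit the same characters in the same order. The index-based version additionally knows the position of each one.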
Clearly, for the above example, the more direct for loop is cleaner and more intuitive. We prefer that direct loop when you simply want to process each element once, without any need for context. But as we saw in lab02, we may sometimes want to better understand the neighborhood around an element, and knowledge of an element's index helps. For example, we had a warmup question on lab02 that asked to count how many times a dna base is immediately followed by the same base. We can implement that count as follows:

    count = 0
    for k in range(len(dna)-1):
        if dna[k] == dna[k+1]:
            count += 1

Note well that in this example, we chose to loop over range(len(dna)-1) rather than range(len(dna)). With range(len(dna)), the final pass would evaluate dna[k+1] at an index past the end of the string. As a sanity check, assume that dna has length 5. Then we need a loop that executes only 4 times to compare the four pairs of neighbors, so our loop runs over the sequence

    0, 1, 2, 3
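The same count can be wrapped in a function so that it is easy to try on several inputs. The function name count_adjacent_repeats and the sample strings are hypothetical and serve only as a sketch of the loop described here.

```python
def count_adjacent_repeats(dna):
    """Count positions k where dna[k] equals dna[k+1]."""
    count = 0
    # Stop one short of the end so that dna[k+1] is always a valid index.
    for k in range(len(dna) - 1):
        if dna[k] == dna[k + 1]:
            count += 1
    return count

print(count_adjacent_repeats('GATTA'))   # 1
print(count_adjacent_repeats('AAAA'))    # 3
print(count_adjacent_repeats(''))        # 0
```

For the empty string, range(-1) generates no values, so the loop body never runs and the count stays at 0.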