Course Home | Assignments | Data Sets/Tools | Python | Schedule | Git Submission | Tutoring

Overview

As we begin our exploration of the Python programming language, our first goal is to gain a high-level overview of programming concepts and to become comfortable reading Python code and understanding its instructions (as opposed to writing new Python code). This is similar to learning a natural language, where it is easier to learn to read or hear existing samples of the language before learning to express yourself by writing or speaking it. This document therefore provides an initial overview of the language and of programming concepts. We will add links to more specific documentation on aspects of the language at a later point.

Data Types

It is important to understand that there are various types of data that programmers will want to store and manipulate, and that each of these has a different internal representation. In Python, a concept known as a class is used to define a type of data. The primary classes that we will use include:

Identifiers (a.k.a. Names, Variables)

When programming in Python, we can give names to any values that we compute, as a way to store such a value and refer to it later. For example, we might use the command

  samples = 357
to assign the name samples to the associated value 357, or the command
  dna = 'ACCTAAGA'
to assign the name dna to the associated value 'ACCTAAGA'. The = symbol is used to designate such an assignment of a value to a name.
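
As a small sketch of how such names behave (the printed expressions are our own additions for illustration), a name can be used anywhere its value could be used, and it can later be assigned a new value:

```python
samples = 357
dna = 'ACCTAAGA'

# a name can be used wherever its value could be used
print(samples + 1)     # 358
print(len(dna))        # 8

# each value belongs to a class (its type)
print(type(samples))   # <class 'int'>
print(type(dna))       # <class 'str'>

# assigning again associates the name with a new value
samples = samples + 1
print(samples)         # 358
```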

Control Structures

The order in which commands are executed by a computer is called the "flow of control." By default, the commands in a Python script are executed in the order in which they are written.

firstCommand
secondCommand
thirdCommand
...
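
To make the default flow concrete, here is a minimal sketch using three assignments of our own choosing. Because each command runs only after the previous one has finished, the final value depends on the order in which they are written:

```python
samples = 357            # first command
samples = samples + 3    # second command: samples is now 360
samples = samples * 2    # third command: samples is now 720
print(samples)           # 720
```

If the second and third commands were swapped, the result would instead be 357 * 2 + 3 = 717.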
However, there are many control structures that allow you to vary the flow of control. Most notably:


Examples

Cleaning Input

The primary task of the SMS Filter tool is to produce a result that is equivalent to the original piece of text, except that all characters that are not alphabetic are omitted. We offer our own function, named clean, which accomplishes this task. Our implementation appears as follows.

def clean(original):
    result = ''
    for c in original:
        if c.isalpha():
            result = result + c
    return result
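
One way to follow the function step by step is to print the intermediate state. The sketch below uses the same logic as clean; the name clean_traced, the added print call, and the sample input are our own, for illustration only.

```python
def clean_traced(original):
    """Same logic as clean, plus a print after each character (illustration only)."""
    result = ''
    for c in original:
        if c.isalpha():
            result = result + c
        # show the character just examined and the result built so far
        print(repr(c), '->', repr(result))
    return result

cleaned = clean_traced('AC-G 7t')
print(cleaned)   # ACGt
```

Each printed line shows that result grows only when the character examined is a letter; the hyphen, the space, and the digit leave it unchanged.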


Reverse Complement

A primary task of the SMS Reverse Complement tool is to compute the reverse complement of an original strand of DNA. We offer our own function, named reverse_complement, which accomplishes this task. Our implementation appears as follows.

# convert a given DNA sequence to its reverse complementary strand
dna2dna = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
def reverse_complement(dna):
    other = ''
    for base in dna:
        other = other + dna2dna[base]
    return other[::-1]   # python trick to reverse a string
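
As a hedged sketch of what happens on each pass through the loop, the variant below prints the complement built so far. The name reverse_complement_traced and the print call are our own additions; the logic is otherwise unchanged.

```python
dna2dna = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}

def reverse_complement_traced(dna):
    """Same logic as reverse_complement, plus a print per base (illustration only)."""
    other = ''
    for base in dna:
        other = other + dna2dna[base]
        # complement built so far, in the original (not yet reversed) order
        print(base, '->', other)
    return other[::-1]

rc = reverse_complement_traced('ACCTAAGA')
print(rc)   # TCTTAGGT
```

Note that the dictionary only has entries for the uppercase letters A, C, G, and T, so any other character in the input (a lowercase letter, for instance) raises a KeyError.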

Finding Pairwise Differences


The SMS Color Align Conservation tool, though more advanced than the following, can be used to graphically highlight locations within two equal-length sequences that differ from each other. We compute a (non-graphical) summary of the pairwise differences of two sequences as follows.

# compute pairwise differences between two equal-length strings
# the list of differences is returned using notation such as 'T35G'
# to reflect that at location 35 a T in the first sequence is a G in the second
def difference(first, second):
    diff = []
    for k in range(len(first)):
        if first[k] != second[k]:
            diff.append(first[k] + str(1 + k) + second[k])
    return diff
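
The sketch below follows the same logic while printing each mismatch as it is found. The name difference_traced, the print call, and the two sample sequences are our own, chosen for illustration.

```python
def difference_traced(first, second):
    """Same logic as difference, plus a print per mismatch (illustration only)."""
    diff = []
    for k in range(len(first)):
        if first[k] != second[k]:
            diff.append(first[k] + str(1 + k) + second[k])
            # Python indices start at 0; the reported location starts at 1
            print('index', k, 'is reported as location', 1 + k)
    return diff

result = difference_traced('ACCTAAGA', 'ACGTAATA')
print(result)   # ['C3G', 'G7T']
```

The loop runs over the indices of first, so the function relies on the caller to pass sequences of equal length, as the opening comment states.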
Michael Goldwasser
Last modified: Friday, 01 February 2019