Course Home | Assignments | Computing Resources | Data Sets | Lab Hours/Tutoring | Python | Schedule | Submit

Introduction to Trees

A "tree" is a discrete structure that serves as an important model for a variety of hierarchical data sets that need to be represented and processed in computer science (and in bioinformatics). Here are just a handful of common uses of trees in modeling data:

File System

Organizational Chart

Parse Tree

Family Tree

Phylogeny


Terminology

CS term               Description                                                        Biologist's term
node                  A representation of a single entity within the tree
edge                  A connection representing a relationship between two nodes
root                  The (topmost) node from which an entire tree emanates
parent                The immediate ancestor of a node in a tree
external node (leaf)  A node without any children                                        tip
internal node         A node with one or more children                                   node
child                 An immediate descendant of a node in a tree
ancestor              Any of the nodes "above" a given node (i.e., toward the root)
descendant            Any of the nodes "below" a given node (i.e., away from the root)
branch                A path between an ancestor and one of its descendants
subtree               The portion of a tree including a node and all of its descendants  clade


Representation and Computation

Trees are inherently recursive: every subtree is itself a tree. Our representation of trees, and our functions for processing trees, will therefore also be recursive.

To represent trees, we will begin by considering a special class of trees known as binary trees, in which each internal node has precisely two children. The techniques we use can typically be extended to more general trees with arbitrary branching factors.

Our textbook recommends a relatively simple representation using Python's tuples (this is a built-in structure that is similar to a list, but immutable). The basic format used is a triple,

(label, leftsubtree, rightsubtree)
By convention, a node without any children uses an empty tuple for each of its two subtrees, such as
('C', (), ())
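
The convention above can be sketched in a few lines of Python. The helper names `make_leaf` and `is_leaf` are hypothetical, chosen here for illustration rather than taken from the textbook, and `is_leaf` assumes it is given a nonempty tree.

```python
# Hypothetical helpers illustrating the (label, leftsubtree, rightsubtree) convention.

def make_leaf(label):
    """Build a node with no children: both subtrees are empty tuples."""
    return (label, (), ())

def is_leaf(tree):
    """Return True if a nonempty tree has no children."""
    label, left, right = tree
    return left == () and right == ()

leaf = make_leaf('C')
print(leaf)           # ('C', (), ())
print(is_leaf(leaf))  # True
```

Because tuples are immutable, a node built this way cannot be changed in place; a modified tree has to be built as a new tuple.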

By this convention, a tree that might be represented graphically as

  B
 /
A
 \
  C
would be represented by the recursive structure
('A', ('B', (), ()), ('C', (), ()))
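
Since the structure is recursive, functions that process it can follow the same shape: handle the empty tuple as the base case, then recurse on the left and right subtrees. The following is a minimal sketch under that assumption. The function names `node_count` and `leaf_labels` are illustrative, not part of the textbook's code.

```python
# A sketch of recursive processing on the tuple representation.
# Function names are hypothetical, chosen for illustration.

def node_count(tree):
    """Count the nodes in a tree; the empty tuple represents an empty tree."""
    if tree == ():
        return 0
    label, left, right = tree
    return 1 + node_count(left) + node_count(right)

def leaf_labels(tree):
    """Return the labels of the external nodes (leaves), left to right."""
    if tree == ():
        return []
    label, left, right = tree
    if left == () and right == ():
        return [label]  # a node with no children is a leaf
    return leaf_labels(left) + leaf_labels(right)

example = ('A', ('B', (), ()), ('C', (), ()))
print(node_count(example))   # 3
print(leaf_labels(example))  # ['B', 'C']
```

Each function mirrors the definition of the data: one case for the empty tree and one case that combines the results from the two subtrees.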

Michael Goldwasser
Last modified: Monday, 19 March 2018