In class, we will have a more conversational discussion of the principles of genomics, DNA, RNA, genes, and so forth. The following resources will help guide that discussion.
Dr. Ben Langmead, of Johns Hopkins University, provides a wonderful set of slides with an introduction to genomics.
The Genome News Network (yes, there is such a thing) provides an outstanding set of resources, such as What's a (Genome|Chromosome|Gene|DNA|Genome Sequencing|Genome Map)?
Like videos? Khan Academy has a bunch.
Coding regions of DNA are transcribed to RNA and then translated to proteins, with each triple of nucleotides (a codon) specifying one amino acid in the protein sequence. In some cases, several distinct codons produce the same amino acid. One particular codon, known as the "start codon", produces the amino acid methionine and is also the signal, at the molecular level, that gets the process rolling.
There are also three specific codons that serve as stop codons, ending the process (though not producing an amino acid).
The codons are always read from the 5' end of the messenger RNA (which is itself aligned with the 3' end of the DNA strand that was transcribed). The translation from codons to amino acids can therefore be described either directly, as the triple of RNA nucleotides, or traced back to the original triple of DNA nucleotides that would have resulted in the transcribed RNA.
Codon type | DNA | mRNA |
---|---|---|
Start codon | ATG | AUG |
Stop codons | TAA TAG TGA | UAA UAG UGA |
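The correspondence between the DNA and mRNA columns can be sketched in a few lines of Python. This is only an illustration: the function name `dna_to_mrna` is made up for this example, and it assumes the DNA codons are written as in the table, so that each mRNA codon is the same triple with T replaced by U.

```python
def dna_to_mrna(dna: str) -> str:
    """Rewrite a DNA sequence in the mRNA alphabet (T becomes U).

    Assumption: the DNA is written in the same sense as the table above,
    where ATG corresponds to AUG.
    """
    return dna.upper().replace("T", "U")


DNA_START = "ATG"
DNA_STOPS = ["TAA", "TAG", "TGA"]

print(dna_to_mrna(DNA_START))                # AUG
print([dna_to_mrna(c) for c in DNA_STOPS])   # ['UAA', 'UAG', 'UGA']
```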
The complete mapping from codons to amino acids can also be described either in the context of the original DNA or the transcribed mRNA, and can be given as a complete table, or often using a wheel-like structure that is more convenient for tracing codons as a three-letter sequence.
[Figures: codon-to-amino-acid mapping, from DNA | from RNA]
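As a rough sketch of how such a mapping can be used in code, the following builds the standard genetic code as a Python dictionary keyed by DNA codons and translates a sequence codon by codon. The names `CODON_TABLE` and `translate` are chosen for this example only, amino acids are given as one-letter codes with `*` marking a stop codon, and the sketch assumes the input is already aligned to the correct reading frame.

```python
from itertools import product

# Standard genetic code, listed with the bases in the order T, C, A, G
# (first codon position varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA (or mRNA) sequence, starting at its first base.

    Translation stops at the first stop codon; trailing bases that do not
    fill a complete codon are ignored.
    """
    dna = seq.upper().replace("U", "T")  # accept the mRNA alphabet as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGCATGCATAA"))  # MHA
print(translate("AUGCAUGCAUAA"))  # MHA
```

Because the table is keyed by DNA codons and the function normalizes U to T, the same lookup serves both the "from DNA" and "from RNA" views of the mapping.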
Because it matters where you start grouping nucleotides into codons, there are actually six different reading frames: three in the forward direction, and three more because there could be coding regions on the reverse-complementary strand.
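The six reading frames can be enumerated with a short sketch like the one below. The helper names `reverse_complement` and `reading_frames` are illustrative, and the example sequence is made up; the input is assumed to contain only the letters A, C, G, and T.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read in its own 5'-to-3' direction."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def reading_frames(dna: str) -> list[list[str]]:
    """Return six lists of codons: three forward frames, then three reverse."""
    frames = []
    for strand in (dna.upper(), reverse_complement(dna)):
        for offset in range(3):
            # Group into complete triples only; leftover bases are dropped.
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames.append(codons)
    return frames


for frame in reading_frames("ATGAAATGA"):
    print(" ".join(frame))
# ATG AAA TGA
# TGA AAT
# GAA ATG
# TCA TTT CAT
# CAT TTC
# ATT TCA
```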
How many ORFs (open reading frames) are you able to find in the following strand, including possible ORFs in the implicit complementary strand?
TTACCTATGCATGCATAACTGA
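One way to check an answer by computer is sketched below. It rests on an assumption about what counts as an ORF, since definitions vary: here, an ORF is any ATG followed, in the same frame, by a stop codon, and an ATG nested inside another ORF is reported separately. The function names and the example sequence are made up for this illustration.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read in its own 5'-to-3' direction."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_orfs(dna: str) -> list[tuple[str, int, str]]:
    """Return (strand, start index, ORF sequence) for each ATG with an in-frame stop.

    '+' is the given strand and '-' is its reverse complement; indices are
    0-based positions within the strand being scanned.
    """
    orfs = []
    for label, strand in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for start in range(len(strand) - 2):
            if strand[start:start + 3] != START:
                continue
            # Walk codon by codon from the start until the first stop codon.
            for i in range(start + 3, len(strand) - 2, 3):
                if strand[i:i + 3] in STOPS:
                    orfs.append((label, start, strand[start:i + 3]))
                    break
    return orfs


print(find_orfs("CCATGAAATGACC"))  # [('+', 2, 'ATGAAATGA')]
```

Calling `find_orfs` on the strand from the exercise is a reasonable way to compare against a hand count, keeping in mind that a different ORF definition could give a different total.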