
Saint Louis University

Computer Science 1020
Introduction to Computer Science: Bioinformatics

Michael Goldwasser

Spring 2018

Computer Science Department

Homework Assignment 06

Sequence Assembly Algorithms


Collaboration Policy

For this assignment, you must work alone. Please make sure you adhere to the policies on academic integrity in this regard.


Overview

Topic: Sequence Assembly Algorithms
Related Reading: notes from JHU (Overlap graphs, De Bruijn graphs)
Due: 2:10pm, Friday, 4 May 2018


Problems

  1. (11 points)
    The JHU notes on the OLC algorithm and this Langmead video introduce the concept of an overlap graph for a set of reads (such as the example on page 20 of those notes, although the pages are not explicitly numbered).

    Illustrate the overlap graph that results from the following reads, including all edges that represent an overlap of 4 or greater.
    { AGCAGG, AGGCAG, CAGGCA, GAGCAG, GCAGCA, GCAGGC, GGCAGC }
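As a reminder of how overlap edges are determined, here is a minimal Python sketch. It uses a small made-up set of reads (not the reads of this problem) and a minimum overlap of 3. The function names `overlap` and `overlap_graph` are illustrative choices, not taken from the JHU notes.

```python
from itertools import permutations

def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no such overlap has length at least min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def overlap_graph(reads, min_len):
    """Map each directed edge (a, b) to the length of its overlap."""
    edges = {}
    for a, b in permutations(reads, 2):  # ordered pairs of distinct reads
        k = overlap(a, b, min_len)
        if k:
            edges[(a, b)] = k
    return edges

# Made-up example reads, chosen only for illustration.
reads = ["ACGGA", "CGGAT", "GGATT"]
edges = overlap_graph(reads, 3)
for (a, b), k in sorted(edges.items()):
    print(a, "->", b, "overlap", k)
```

Each edge is directed from the read whose suffix matches to the read whose prefix matches, and is labeled with the overlap length.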

  2. (9 points)
    Starting on page 21, the JHU OLC slides discuss how some edges are redundant because they can be transitively inferred from other edges. Although the previous graph has many edges that represent an overlap of 4, many of them are redundant by this definition.

    Identify the three edges representing overlaps of 4 that are not redundant.
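The idea of a transitively inferred edge can be sketched in Python as follows. This is a simplified check under one assumption: an edge from a to c is treated as redundant whenever some intermediate read b has both an edge from a to b and an edge from b to c. The slides may apply further conditions, so treat this only as an illustration. The edge set is a made-up example, and `non_redundant` is a hypothetical name.

```python
def non_redundant(edges):
    """Keep the edges a->c that cannot be inferred from some a->b and b->c."""
    successors = {}
    for a, b in edges:
        successors.setdefault(a, set()).add(b)
    kept = set()
    for a, c in edges:
        # Look for an intermediate node b with a->b and b->c.
        skipped = any(c in successors.get(b, ())
                      for b in successors[a] if b != c)
        if not skipped:
            kept.add((a, c))
    return kept

# Made-up example: the edge ACGGA->GGATT skips over CGGAT.
edges = {("ACGGA", "CGGAT"), ("ACGGA", "GGATT"), ("CGGAT", "GGATT")}
kept = non_redundant(edges)
print(sorted(kept))
```

In this example the edge from ACGGA to GGATT is dropped, because it can be inferred from the two edges that pass through CGGAT.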

  3. (10 points)
    This Langmead video, and what is implicitly page 5 of the JHU notes on De Bruijn graphs, show an example of a De Bruijn graph that is built from a set of 3-mers. Illustrate the De Bruijn graph that results from the following 4-mers: {AAAC, AACG, AACT, ACTA, CTAA, GTAA, TAAA, TAAC}. (Thus each node of the graph will have an associated 3-mer, and each directed edge represents a 4-mer that contains overlapping 3-mers.)
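The construction can be sketched in Python for a made-up set of 3-mers (so the nodes are 2-mers), rather than the 4-mers of this problem. The function name `de_bruijn` and the dictionary-of-lists representation are assumptions made for illustration.

```python
from collections import defaultdict

def de_bruijn(kmers):
    """Map each (k-1)-mer node to the list of nodes its outgoing edges reach.

    Each k-mer contributes one edge, from its first k-1 characters
    to its last k-1 characters.
    """
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)

# Made-up example 3-mers, chosen only for illustration.
kmers = ["ACG", "CGT", "GTA", "CGA"]
graph = de_bruijn(kmers)
for node, targets in graph.items():
    print(node, "->", targets)
```

Nodes that have no outgoing edges (here TA and GA) appear only as targets in this representation, not as keys.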

  4. (10 points)
    Although the pages are not visually numbered, what is implicitly page 10 of the JHU notes on De Bruijn graphs defines an Eulerian walk of a directed graph, and page 21 demonstrates how an Eulerian walk of a De Bruijn graph implies a potential assembly of the original k-mers by overlapping the pieces represented by the nodes in order.

    Your De Bruijn graph from the previous question should have 8 edges and two possible Eulerian walks. Give the assemblies that would be implied by each of those two Eulerian walks.
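To make the link between a walk and an assembly concrete, here is a minimal Python sketch that finds one Eulerian walk using Hierholzer's algorithm (a standard technique, not necessarily the one presented in the notes) and spells out the implied string. The 3-mers are a made-up example, and the names `eulerian_walk` and `spell` are illustrative. The sketch assumes that an Eulerian walk exists, and when several exist it returns only one of them.

```python
from collections import defaultdict

def eulerian_walk(edges):
    """Return a list of nodes that traverses every edge exactly once."""
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree
    for src, dst in edges:
        graph[src].append(dst)
        balance[src] += 1
        balance[dst] -= 1
    # Start at the node with one extra outgoing edge, if there is one.
    start = next((n for n, b in balance.items() if b == 1), edges[0][0])
    stack, walk = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            walk.append(stack.pop())         # dead end: record the node
    return walk[::-1]

def spell(walk):
    """Overlap consecutive nodes: the first node, then one new character each."""
    return walk[0] + "".join(node[-1] for node in walk[1:])

# Made-up example: the 3-mers of AAACCCCA (CCC occurs twice).
kmers = ["AAA", "AAC", "ACC", "CCC", "CCC", "CCA"]
edges = [(k[:-1], k[1:]) for k in kmers]
walk = eulerian_walk(edges)
assembly = spell(walk)
print(walk)
print(assembly)
```

A walk over n edges visits n + 1 nodes, so the assembly has one character more than the number of k-mers plus the k - 2 extra characters of the starting node.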


Michael Goldwasser
CSCI 1020, Spring 2018
Last modified: Tuesday, 01 May 2018