{
 "metadata": {
  "name": "",
  "signature": "sha256:41cfc86b9e6591b6c5c0923993a6474486bd348aee8984369117217bfef4980a"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Problem Set 1 - DNA"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**Total points:** 30\n",
      "\n",
      "**Due:** Wednesday 25th March 19:00\n",
      "\n",
      "**Format:** IPython Notebook\n",
      "\n",
      "*The number of points in this problem sheet is *not* directly proportional to the difficulty. In fact, Part 3 is more difficult than Part 1 and 2, but is worth fewer points in terms of the amount of work/code, and are there for people who enjoy challenges :) So you can choose to not complete Part 3, and you should still be able to get more than 60% of points from the previous questions if you answer them correctly.*"
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Background"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This problem set is about DNA data and methods typically applied to this kind of data. As a reminder, [DNA](http://en.wikipedia.org/wiki/DNA) is a molecule that encodes genetic instructions for living organisms. DNA is typically found as a double-stranded helix, in which each strand corresponds to a sequence of *nucleotides*. Each nucleotide consists of a nucleobase (guanine, adenine, thymine, or cytosine) attached to sugars, which are in turn separated from each other by phosphate groups:\n",
      "\n",
      "
\n",
      "