{
"metadata": {
"name": "",
"signature": "sha256:41cfc86b9e6591b6c5c0923993a6474486bd348aee8984369117217bfef4980a"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Problem Set 1 - DNA"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Total points:** 30\n",
"\n",
"**Due:** Wednesday 25th March 19:00\n",
"\n",
"**Format:** IPython Notebook\n",
"\n",
"*The number of points in this problem sheet is *not* directly proportional to the difficulty. In fact, Part 3 is more difficult than Part 1 and 2, but is worth fewer points in terms of the amount of work/code, and are there for people who enjoy challenges :) So you can choose to not complete Part 3, and you should still be able to get more than 60% of points from the previous questions if you answer them correctly.*"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Background"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This problem set is about DNA data and methods typically applied to this kind of data. As a reminder, [DNA](http://en.wikipedia.org/wiki/DNA) is a molecule that encodes genetic instructions for living organisms. DNA is typically found as a double-stranded helix, in which each strand corresponds to a sequence of *nucleotides*. Each nucleotide consists of a nucleobase (guanine, adenine, thymine, or cytosine) attached to sugars, which are in turn separated from each other by phosphate groups:\n",
"\n",
"
\n",
"