Assignment 5: Writing your own mass spec software and
an adventure in scientific collaboration
You recently received an email from the genomics guru Craig Venter
asking (no, begging) you to collaborate with him on some mass
spectrometry experiments his institute is undertaking (apparently the
word is out that you are taking the revolutionary Introduction to
Genomics course at WashU). His group has engineered an organism that
is apparently reliant on 10 synthetic proteins that they have inserted
into the organism's genome. (They have named the organism Ego
maximus.) However, it's unclear whether these proteins are
expressed, because they can't be visualized by PAGE. They are going to
try to detect these 10 proteins in E. maximus by the standard
tryptic digest/LC/ES ionization/MS/MS experiments that you learned
about this week.
Craig has sent you a file that
contains the 10 E. maximus protein sequences. He wants you to write a
Perl script that takes in a file of FASTA-formatted protein sequences
and outputs:
- All of the tryptic fragments for each of the proteins. (Trypsin cuts after arginine and lysine.)
- The predicted molecular weights of the tryptic
fragments. (The molecular weights for the amino acids are
available here. The weights include the water molecule.)
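The two outputs above can be sketched as follows. This is a minimal Python illustration of the logic, not the Perl script the assignment asks for. It assumes the simple cleavage rule stated above (cut after every K or R) and assumes the weight table lists free amino acids, so one water is subtracted for each peptide bond. The values in AA_WEIGHT are commonly quoted average weights and may differ slightly from the course table; the function names are made up for this example.

```python
import re

# Average molecular weights (Da) of the FREE amino acids, so each value
# includes one water. Assumed values; substitute the course table.
AA_WEIGHT = {
    "G": 75.07, "A": 89.09, "S": 105.09, "P": 115.13, "V": 117.15,
    "T": 119.12, "C": 121.16, "L": 131.17, "I": 131.17, "N": 132.12,
    "D": 133.10, "Q": 146.15, "K": 146.19, "E": 147.13, "M": 149.21,
    "H": 155.16, "F": 165.19, "R": 174.20, "Y": 181.19, "W": 204.23,
}
WATER = 18.015  # average mass of H2O (Da)


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def tryptic_fragments(sequence):
    """Cut after every K or R; keep any C-terminal remainder."""
    return re.findall(r"[^KR]*[KR]|[^KR]+$", sequence)


def peptide_mass(peptide):
    """Sum free amino acid weights, minus one water per peptide bond.

    Raises KeyError for letters that are not in AA_WEIGHT (e.g. X).
    """
    total = sum(AA_WEIGHT[aa] for aa in peptide)
    return total - (len(peptide) - 1) * WATER


def digest_report(fasta_text):
    """Yield (header, fragment, mass) for every tryptic fragment."""
    for header, sequence in read_fasta(fasta_text):
        for fragment in tryptic_fragments(sequence):
            yield header, fragment, round(peptide_mass(fragment), 3)
```

Calling digest_report on the contents of the FASTA file gives one row per fragment, which can then be printed in whatever layout is convenient.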
... OK, now a few weeks have passed, and you are feeling pretty good
about yourself. But then you get a frantic email from your buddy
Craig saying that you are a poor bioinformatician because the list of
predicted trypsin fragments you sent him doesn't quite match the data
they see in their experiments. Worried, you ask him to send you the
data. He sends you a list of fragment masses (i.e., not the raw m/z
values) that they got for one of the ten proteins (i.e., from an
LC/MS experiment, not an LC/MS/MS experiment). Click here for the file of the observed masses.
Question 1: Which protein is it? Can you suggest an alternate
explanation (other than you being a crappy coder)? Like something
rooted in protein biochemistry?
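One way to approach Question 1 is to score each protein by how many observed masses fall close to one of its predicted fragment masses. The Python sketch below assumes you already have the predicted masses for each protein; the function names and the 0.5 Da tolerance are arbitrary choices for illustration, not values given in the assignment.

```python
def is_match(observed_mass, predicted_masses, tol=0.5):
    """True if observed_mass is within tol Da of any predicted mass."""
    return any(abs(observed_mass - p) <= tol for p in predicted_masses)


def count_matches(observed, predicted, tol=0.5):
    """Number of observed masses explained by the predicted list."""
    return sum(is_match(obs, predicted, tol) for obs in observed)


def rank_proteins(observed, predicted_by_protein, tol=0.5):
    """Return (protein, hits) pairs, best-matching protein first."""
    scores = [
        (name, count_matches(observed, masses, tol))
        for name, masses in predicted_by_protein.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def unmatched(observed, predicted, tol=0.5):
    """Observed masses that no predicted fragment accounts for."""
    return [obs for obs in observed if not is_match(obs, predicted, tol)]
```

The masses returned by unmatched for the top-ranked protein are the ones worth examining when you look for a biochemical explanation.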
Question 2: The second tryptic peptide in protein 1 is SCHLLR. Predict what you
would actually see in the real MS experiment. That is, write down the
two most prevalent m/z values you would expect to see
for the whole peptide and for the entire daughter ion spectrum.
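The arithmetic behind Question 2 can be sketched in Python as below. The sketch assumes positive-mode electrospray, where a peptide of neutral mass M carrying z protons appears at m/z = (M + z * 1.00728) / z, and it assumes the daughter ions of interest are the singly charged b and y series. It uses average weights of the free amino acids (assumed values, listed only for the residues in SCHLLR); an instrument that resolves isotopes would report monoisotopic masses instead. Deciding which charge states are most prevalent is left to you.

```python
WATER = 18.015    # average mass of H2O (Da)
PROTON = 1.00728  # mass of a proton (Da)

# Average weights (Da) of the FREE amino acids; assumed values, and only
# the residues that occur in SCHLLR are listed.
AA_WEIGHT = {"S": 105.09, "C": 121.16, "H": 155.16, "L": 131.17, "R": 174.20}


def residue_mass(aa):
    """Mass of an amino acid inside a chain (free weight minus water)."""
    return AA_WEIGHT[aa] - WATER


def peptide_mass(peptide):
    """Neutral peptide mass: all residues plus one terminal water."""
    return sum(residue_mass(aa) for aa in peptide) + WATER


def mz(mass, charge):
    """m/z of a species of neutral mass `mass` carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


def b_ions(peptide):
    """Singly charged b ions b1..b(n-1), built from the N-terminus."""
    total, ions = 0.0, []
    for aa in peptide[:-1]:
        total += residue_mass(aa)
        ions.append(total + PROTON)
    return ions


def y_ions(peptide):
    """Singly charged y ions y1..y(n-1), built from the C-terminus."""
    total, ions = 0.0, []
    for aa in reversed(peptide[1:]):
        total += residue_mass(aa)
        ions.append(total + WATER + PROTON)
    return ions
```

For example, mz(peptide_mass("SCHLLR"), 2) gives the m/z of the doubly protonated peptide, and b_ions("SCHLLR") and y_ions("SCHLLR") each return five values. A quick sanity check is that each b ion and its complementary y ion sum to the peptide mass plus two protons.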
Your scripts and answers are due Friday, Feb 20th, 2009 at midnight. Enjoy.