MS Plotting utilities
=====================

These scripts provide some simple utilities for mining and plotting MS data.
The data is expected to be available in mzML or the openMS featureXML format.

Script files can be run as standalone apps or as a library.


MSPlot.msplot3d
---------------

Command Line Usage: msplot3d.py [options]

Options:
  -h, --help            show this help message and exit
  -m MZML, --mzml=MZML  Input mzML file
  -f FEATURE, --feature=FEATURE
                        comma separated list of featureXML files
  -o OUTFILE, --output=OUTFILE
                        output filename (default='plot.pdf')
  -s, --showX11         Show X11 interactive plot.
  -x XROT, --xrot=XROT  Xaxis rotation for plot (default (30)
  -y YROT, --yrot=YROT  Yaxis rotation for plot (default: -45)
  -l CLIP, --limit=CLIP
                        Clip threshold for plot
  -c COLOURS, --colours=COLOURS
                        list of colours with which to plot features. Colours
                        will be recycled when needed (default='r,g,m,c,y,k')
  -r RTMIN, --rtmin=RTMIN
                        minimum retention time
  -R RTMAX, --rtmax=RTMAX
                        maximum retention time
  -t MZMIN, --mzmin=MZMIN
                        minimum m/z value
  -T MZMAX, --mzmax=MZMAX
                        maximum m/z value
  -w RTWINDOW, --rtwindow=RTWINDOW
                        Window around plot area in which to identify feature
                        peaks
  -p, --plotms2         Plot MS2 events
  -d MS2WIN, --ms2win=MS2WIN
                        MS2 ion capture window size

Module usage:
mzml is the only non-named argument. All arguments as per the command line version except that lists should be provided as arrays rather than comma-delimited text.

>>> import MSPlot.msplot3d
>>> MSPlot.plot3d(mzml, featfiles=[], outfile='plot.pdf', show=False, xrot=30, yrot=-45, featcols=['r','g','b','m','c','k'], thresh=100, minrt=None,maxrt=None,minmz=None, maxmz=None, ms2win=2.0, rtwindow=20.0, plotms2=True)


MSPlot.ms1pep
-------------
Module Usage:
-------------
>>> import Unimod.unimod
>>> import MSPlot.ms1pep
>>> peptide='ACDEFGHIKLMNPQRSTVWYKKACDPRFGHI'
# protein amino acid sequence

>>> frags=MSPlot.ms1pep.digestprotein(peptide, enzyme=1, overlap=True, unfavoured=False)
    # overlap - include 1 missed cleavage
    # unfavoured - include unfavourable sites (see the EMBOSS documentation for more details)
    # enzyme - numerical enzyme selection according to the EMBOSS documentation
    Enzymes and Reagents
         1 : Trypsin
         2 : Lys-C
         3 : Arg-C
         4 : Asp-N
         5 : V8-bicarb
         6 : V8-phosph
         7 : Chymotrypsin
         8 : CNBr
>>> frags
[{'start': '1', 'end': '15', 'sequence': 'ACDEFGHIKLMNPQR'}, {'start': '16', 'end': '23', 'sequence': 'STVWYKK'}, {'start': '24', 'end': '32', 'sequence': 'ACDPRFGHI'}, {'start': '1', 'end': '23', 'sequence': 'ACDEFGHIKLMNPQRSTVWYKK'}]
# list of digested fragment peptides. 

>>> camc=Unimod.unimod.database.get_label('Carbamidomethyl')['delta_mono_mass']
# get delta mass for fixed cysteine modification.

>>> MSPlot.ms1pep.listmz(frags[0]['sequence'], charges=[2,3,4], modifications=[], fixedmods={'C': camc })
[908.43562931579208, 605.95969455552802, 454.72172717539604]
# returns a list of mz values

>>> mzlist=MSPlot.ms1pep.listmz(frags[1]['sequence'], charges=[2,3,4], modifications=['2 Phospho (T)',], fixedmods={'C': camc })
>>> mzlist
[536.7538243454826, 358.17182457532175, 268.8808246902413]

# listmz does not permute, so for variable modifications, call it with each permutation.

>>> xics=MSPlot.ms1pep.getXIC("LCMSrun.mzML", mzlist, tolerance=0.01, minrt=0, maxrt=None)
>>> xics.keys()
['rt', '358.171824575', '536.753824345', '268.88082469']
# extracts the XIC for each mz ratio. The keys are a truncated string version of the full precision calculated mass. 

MSPlot.ms1pep.plotXIC(xics, colours=['r','g','b','m','c','k'], outfile="plot.pdf", title="XIC plot"):
# plot the XICs. Any number can be plotted, but they are separated by 7% max(XIC)


Dependencies
============

pyMS - OpenMS data parsing and analysis by Niek de Klein
pymzml - mzML parsing library
pyteomics - package for calculating mz values and more
numpy -  large scale data handling
matplotlib - plotting library
EMBOSS (for the digest application)
Unimod - module for handling a Unimod database.



Author
======

Dr David Martin, University of Dundee d.m.a.martin@dundee.ac.uk