=========================
Paired Sequence Utilities 
=========================
Utilities for working with paired sequence files.

Copyright 2012 Lance Parsons <lparsons@princeton.edu>

BSD 2-Clause License http://www.opensource.org/licenses/BSD-2-Clause - See LICENSE.txt

Installation
============
1. Install Biopython version 1.57 or above (required for paired_sequence_match.py)::

	pip install BioPython

2. Install paired_sequence_utils::

	pip install paired_sequence_utils


paired_sequence_match.py
=========================

Takes two sequence files as input and matches up paired sequences, outputting them separately from orphan sequences.
Useful when paired reads are in two separate files and were filtered separately. By default, paired reads are output
interleaved with one another (read 1 and read 2 of a pair, then read 1 and read 2 of a second pair, etc.). If the
paired output file (-p) is specified twice, the first read of each pair is written to the first file and the second
read to the second file.
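
The matching step described above can be sketched in plain Python. This is a minimal illustration, not the script's actual implementation: it assumes that both mates share a read ID apart from a trailing /1 or /2, and the names base_id and match_pairs are hypothetical.

```python
def base_id(read_id):
    """Strip a trailing /1 or /2 so both mates share one key (assumed naming)."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


def match_pairs(reads1, reads2):
    """Return (interleaved_pairs, singles) from two lists of (id, sequence)."""
    # Index the second file by base ID for constant-time mate lookup.
    index2 = {base_id(rid): (rid, seq) for rid, seq in reads2}
    paired, singles = [], []
    for rid, seq in reads1:
        mate = index2.pop(base_id(rid), None)
        if mate is None:
            singles.append((rid, seq))         # no mate in the second file
        else:
            paired.extend([(rid, seq), mate])  # read 1, then read 2
    # Whatever remains in the index never found a mate in the first file.
    singles.extend(index2.values())
    return paired, singles


reads1 = [("a/1", "ACGT"), ("b/1", "GGCC"), ("c/1", "TTAA")]
reads2 = [("a/2", "TGCA"), ("c/2", "AATT"), ("d/2", "CCGG")]
paired, singles = match_pairs(reads1, reads2)
```

Here reads a and c end up interleaved in paired, while b and d are reported as singles. Holding one file in a dictionary keeps the sketch short; very large files may call for an on-disk index instead.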

Examples
--------

Output paired reads interleaved to STDOUT and the single reads to STDERR::
	
	paired_sequence_match.py read1.fastq read2.fastq > paired_reads.fastq 2>single_reads.fastq
	
Output paired reads to separate files::
	
	paired_sequence_match.py read1.fastq read2.fastq -p read1_paired.fastq -p read2_paired.fastq -s single_reads.fastq

NOTE: This script requires Biopython (http://biopython.org) version 1.57 or above.
	

barcode_splitter.py
===================

Split multiple fastq files by matching barcodes in one of the sequence files.
Barcodes in the tab-delimited barcodes.txt file are matched against the beginning of the specified index read.
By default, barcodes must match exactly, but --mismatches can be set higher if desired.
If input files are gzipped, the output is as well. Compression can be forced with the --gzip option.
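
The barcode matching rule can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the script's own code: the function names and the sample-to-barcode dictionary are hypothetical, and treating a read that matches more than one barcode as unassigned is a choice made for this sketch.

```python
def count_mismatches(barcode, read):
    """Count positions where the barcode differs from the start of the read."""
    prefix = read[:len(barcode)]
    differing = sum(a != b for a, b in zip(barcode, prefix))
    # Bases missing from a too-short read count as mismatches.
    return differing + (len(barcode) - len(prefix))


def assign_sample(index_read, barcodes, max_mismatches=0):
    """Return the sample whose barcode matches the read start, or None.

    barcodes maps sample name -> barcode sequence (hypothetical layout).
    """
    hits = [
        sample
        for sample, barcode in barcodes.items()
        if count_mismatches(barcode, index_read) <= max_mismatches
    ]
    # Assumption: ambiguous reads (several hits) are left unassigned.
    return hits[0] if len(hits) == 1 else None


barcodes = {"sample1": "ACGTAC", "sample2": "TTGGCC"}
exact = assign_sample("ACGTACA", barcodes)
strict = assign_sample("ACGTTCA", barcodes)
relaxed = assign_sample("ACGTTCA", barcodes, max_mismatches=1)
```

With exact matching, the read ACGTTCA is not assigned to any sample; allowing one mismatch assigns it to sample1.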

Examples
--------

Split an Illumina paired-end run where the index read is read 2, the forward read is read 1, and the reverse read is read 3::

	barcode_splitter.py --bcfile barcodes.txt read1.fastq read2_index.fastq read3.fastq --idxread 2 --suffix .fastq
	