Metadata-Version: 1.0
Name: pyfasta
Version: 0.2.5
Summary: pythonic access to fasta sequence files
Home-page: http://code.google.com/p/bpbio/
Author: brentp
Author-email: bpederse@gmail.com
License: MIT
Description: ==================================================
        pyfasta: pythonic access to fasta sequence files.
        ==================================================
        
        
        :Author: Brent Pedersen (brentp)
        :Email: bpederse@gmail.com
        :License: MIT
        
        
        Implementation
        ==============
        
        Requires Python >= 2.5. Stores a flattened version of the fasta file, without
        spaces or headers, together with a pickle of the start and stop offsets (used
        for fseek) of each header's sequence in that flat file. Both files are for
        internal use.
        FastaRecords also support the numpy array interface.
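        
        The flatten-and-index idea described above can be sketched roughly as follows.
        This is a minimal illustration written for Python 3, not pyfasta's actual code:
        the function name ``flatten_fasta`` and the in-memory example are assumptions
        made for the sketch, and the real library writes the flat sequence and the
        pickled index to disk instead of returning them.
        
        ```python
        import io
        
        
        def flatten_fasta(handle):
            """Hypothetical helper: return (flat, index) for a FASTA text stream.
        
            flat  -- every sequence character concatenated, headers and whitespace dropped
            index -- {header: (start, stop)} offsets into flat
            """
            chunks = []
            index = {}
            name = None
            start = 0
            pos = 0
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                if line.startswith(">"):
                    # close the previous record before opening a new one
                    if name is not None:
                        index[name] = (start, pos)
                    name = line[1:].strip()
                    start = pos
                elif name is not None:
                    # drop any internal whitespace so offsets count bases only
                    seq = "".join(line.split())
                    chunks.append(seq)
                    pos += len(seq)
            if name is not None:
                index[name] = (start, pos)
            return "".join(chunks), index
        
        
        example = io.StringIO(">chr1\nACTG\nAC TG\n>chr2\nAAAA\n")
        flat, index = flatten_fasta(example)
        chr2 = flat[slice(*index["chr2"])]
        ```
        
        With the offsets in hand, fetching a record is a single slice of the flat
        data, which is why no parsing is needed on later accesses.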
        
        
        Usage
        =====
        
        ::
        
        >>> from pyfasta import Fasta
        
        >>> f = Fasta('tests/data/three_chrs.fasta')
        >>> sorted(f.keys())
        ['chr1', 'chr2', 'chr3']
        
        >>> f['chr1']
        FastaRecord('tests/data/three_chrs.fasta.flat', 0..80)
        
        Slicing
        -------
        ::
        
        >>> f['chr1'][:10]
        'ACTGACTGAC'
        
        # get the 1st basepair in every codon (it's python yo)
        >>> f['chr1'][::3]
        'AGTCAGTCAGTCAGTCAGTCAGTCAGT'
        
        
        # the index maps each header to the start and stop of its sequence
        # in the flat file. (you should never need this)
        >>> f.index
        {'chr3': (160, 3760), 'chr2': (80, 160), 'chr1': (0, 80)}
        
        
        # can query by a 'feature' dictionary
        >>> f.sequence({'chr': 'chr1', 'start': 2, 'stop': 9})
        'CTGACTGA'
        
        # with reverse complement for - strand
        >>> f.sequence({'chr': 'chr1', 'start': 2, 'stop': 9, 'strand': '-'})
        'TCAGTCAG'
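        
        Judging from the output above, 'start' and 'stop' appear to be 1-based and
        inclusive, and a '-' strand returns the reverse complement. A minimal sketch
        of that convention on a plain string, written for Python 3, is shown here.
        The name ``feature_sequence`` and the stand-in ``chr1`` string are assumptions
        for illustration and are not part of pyfasta.
        
        ```python
        # complement table for the four DNA bases (upper and lower case)
        COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
        
        
        def feature_sequence(seq, feature):
            """Hypothetical helper: slice seq using 1-based, inclusive coordinates."""
            # 1-based inclusive [start, stop] becomes the 0-based slice [start-1:stop]
            sub = seq[feature["start"] - 1:feature["stop"]]
            if feature.get("strand", "+") == "-":
                # reverse complement for the minus strand
                sub = sub.translate(COMPLEMENT)[::-1]
            return sub
        
        
        # stand-in for the 80 bp chr1 record (assumed here to be ACTG repeated)
        chr1 = "ACTG" * 20
        plus = feature_sequence(chr1, {"chr": "chr1", "start": 2, "stop": 9})
        minus = feature_sequence(
            chr1, {"chr": "chr1", "start": 2, "stop": 9, "strand": "-"}
        )
        ```
        
        Here ``plus`` is 'CTGACTGA' and ``minus`` is 'TCAGTCAG', matching the two
        queries above.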
        
        
        Numpy Array Interface
        ---------------------
        ::
        
        # FastaRecords support the numpy array interface.
        >>> import numpy as np
        >>> a = np.array(f['chr2'])
        >>> a.shape[0] == len(f['chr2'])
        True
        
        >>> a[10:14]
        array(['A', 'A', 'A', 'A'],
              dtype='|S1')
        
        
        # cleanup (though for real use these will remain for faster access)
        >>> import os
        >>> os.unlink('tests/data/three_chrs.fasta.gdx')
        >>> os.unlink('tests/data/three_chrs.fasta.flat')
        
Keywords: bioinformatics blast fasta
Platform: UNKNOWN
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
