Package: VISET: Computer Vision Datasets  
Author: Jeffrey Byrne <jeff@visym.com>  
URL: https://github.com/visym/viset/

VISET is a Python package for creating streaming computer vision datasets.
A VISET is a redistributable HDF5 file used for sharing datasets.
The package provides a common Python programming interface for
downloading and caching datasets for typical evaluation tasks.

VISET supports iterating over large image datasets in a Pythonic way:

    import viset.caltech
    import viset.dataset

    # Create the dataset (once), download it (once), then iterate over the stream
    dbfile = viset.caltech.Caltech101().export()
    db = viset.dataset.Viset(dbfile)
    for (image, annotation) in db.annotation.categorization:
        ...
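
The streaming pattern used in this loop can be illustrated in plain Python. This is a minimal sketch, not VISET's implementation: `stream_pairs`, `load_image`, and the `(location, annotation)` record layout are hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, Tuple


def stream_pairs(
    records: Iterable[Tuple[str, str]],
    load_image: Callable[[str], bytes],
) -> Iterator[Tuple[bytes, str]]:
    """Lazily yield (image, annotation) pairs, loading one image at a time.

    `records` holds (image_location, annotation) tuples; `load_image` is a
    hypothetical loader supplied by the caller (e.g. a read from a local cache).
    """
    for location, annotation in records:
        # The image is loaded only when the consumer requests the next pair,
        # so at most one image is held by the generator at any moment.
        yield load_image(location), annotation
```

Because `stream_pairs` is a generator, nothing is loaded until the `for` loop asks for the first pair, which is what makes iterating over a very large dataset practical.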

VISET supports iterating over various dataset partitions:

    for fold in db.annotation.categorization(strategy='kfold', folds=3):
        for (im, annotation) in fold.train:
            ...
        for (im, annotation) in fold.test:
            ...
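
For readers unfamiliar with k-fold partitioning, the idea behind `strategy='kfold'` can be sketched as follows. This is an illustrative assumption about the general technique, not VISET's code: `kfold_splits` is a hypothetical helper, and it assigns items to folds round-robin without the shuffling or stratification a real implementation may apply.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def kfold_splits(items: Sequence[T], folds: int) -> Iterator[Tuple[List[T], List[T]]]:
    """Yield a (train, test) pair of lists for each of `folds` partitions.

    Item i is assigned to fold i % folds, so fold sizes differ by at most one.
    Each item appears in the test set of exactly one fold.
    """
    if not 2 <= folds <= len(items):
        raise ValueError("folds must be between 2 and len(items)")
    for k in range(folds):
        test = [item for i, item in enumerate(items) if i % folds == k]
        train = [item for i, item in enumerate(items) if i % folds != k]
        yield train, test
```

With `folds=3`, each item is used for testing once and for training twice across the three folds.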

VISET is useful for sharing large annotated datasets defined by URLs
and large image archive files.  Researchers can post their VISET dataset
file (e.g. caltech101.h5) online to share an annotated dataset for
easy reuse.
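
When a dataset is defined by URLs, each image only needs to be downloaded once and can then be served from a local cache. The sketch below shows one common way to do this using only the Python standard library. It is an assumption about the general approach, not VISET's implementation: `cached_download`, the `cache_dir` layout, and the `fetch` parameter are hypothetical.

```python
import hashlib
import os
import urllib.request
from typing import Callable


def cached_download(
    url: str,
    cache_dir: str,
    fetch: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> str:
    """Return a local path for `url`, downloading it only if not yet cached."""
    os.makedirs(cache_dir, exist_ok=True)
    # Name the cached file by a hash of the URL to avoid filename collisions.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, digest)
    if not os.path.exists(path):
        # Download to a temporary name first so an interrupted transfer
        # never leaves a partial file at the final cache path.
        tmp_path = path + ".part"
        fetch(url, tmp_path)
        os.replace(tmp_path, path)
    return path
```

The `fetch` argument defaults to `urllib.request.urlretrieve` and can be swapped for another downloader or a stub when testing offline.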

VISET currently exports:

1. ImageNet-Fall2011  
2. Caltech101  
3. Caltech256  
4. ETHZ shapes
5. ETHZ extended shapes
6. MNIST
7. LabelMe3 
8. Weizmann Horses (single scale)
9. Weizmann Horses (multiscale)
10. Pascal VOC 2012 
... more coming!

See demo_viset.py for an example of usage.

Try it:

sh> pip install viset

Inspired by the excellent work at:

1. http://scikit-learn.org/
2. http://jaberg.github.io/skdata/
3. http://mldata.org
