================================
 Japanese Tokenizers for Whoosh
================================

About
=====

Tokenizers for the Whoosh full-text search library, designed for the Japanese language.
This package contains two tokenizers:

 * IgoTokenizer

  + requires igo-python (http://pypi.python.org/pypi/igo-python/) and its dictionary.

 * TinySegmenterTokenizer

  + requires the Python version of TinySegmenter (https://code.google.com/p/mhagiwara/source/browse/trunk/nltk/jpbook/tinysegmenter.py).


How To Use
==========

IgoTokenizer::

 import igo.Tagger
 from whoosh.fields import Schema, TEXT, ID
 import WhooshJapaneseTokenizer

 # 'ipadic' is the path to the Igo dictionary directory
 tk = WhooshJapaneseTokenizer.IgoTokenizer(igo.Tagger.Tagger('ipadic'))
 scm = Schema(title=TEXT(stored=True, analyzer=tk),
              path=ID(unique=True, stored=True),
              content=TEXT(analyzer=tk))


TinySegmenterTokenizer::

 import tinysegmenter
 from whoosh.fields import Schema, TEXT, ID
 import WhooshJapaneseTokenizer

 # Wrap a TinySegmenter instance; no external dictionary is needed
 tk = WhooshJapaneseTokenizer.TinySegmenterTokenizer(tinysegmenter.TinySegmenter())
 scm = Schema(title=TEXT(stored=True, analyzer=tk),
              path=ID(unique=True, stored=True),
              content=TEXT(analyzer=tk))
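
Both tokenizers wrap an external segmenter and hand its output to Whoosh. The sketch below illustrates that adapter idea using only the standard library. It is not this package's implementation: ``SketchToken``, ``toy_segment`` and ``tokenize`` are hypothetical names, and ``toy_segment`` is a crude stand-in that splits on script changes rather than a real morphological analyzer. The fields mirror the text, position and character offsets that a Whoosh token can carry.

```python
import re
from typing import Callable, Iterator, List, NamedTuple


class SketchToken(NamedTuple):
    """Hypothetical token record (not Whoosh's own Token class)."""
    text: str       # surface form of the segment
    pos: int        # ordinal position among the emitted tokens
    startchar: int  # offset of the first character in the source text
    endchar: int    # offset one past the last character


def toy_segment(text: str) -> List[str]:
    """Stand-in segmenter (assumption): runs of hiragana, katakana,
    kanji or ASCII alphanumerics; any other non-space character alone."""
    pattern = (r"[\u3040-\u309f]+|[\u30a0-\u30ff]+|[\u4e00-\u9fff]+"
               r"|[A-Za-z0-9]+|\S")
    return re.findall(pattern, text)


def tokenize(text: str,
             segment: Callable[[str], List[str]] = toy_segment
             ) -> Iterator[SketchToken]:
    """Turn segmenter output into tokens with positions and offsets."""
    cursor = 0  # where to resume searching in the source text
    pos = 0     # position counter for emitted tokens
    for piece in segment(text):
        if not piece.strip():
            continue  # skip empty or whitespace-only segments
        start = text.find(piece, cursor)
        if start < 0:
            start = cursor  # segment not found verbatim; fall back
        end = start + len(piece)
        yield SketchToken(piece, pos, start, end)
        pos += 1
        cursor = end


for token in tokenize("私の名前は中野です"):
    print(token)
```

Any callable that maps a string to a list of segments can be passed as ``segment``, which is roughly the role that ``igo.Tagger.Tagger`` and ``tinysegmenter.TinySegmenter`` play for the two tokenizers in this package.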

