The NGram class extends the Python set class with the ability
to search for set members ranked by their N-gram string similarity
to the query. There are also methods for comparing a pair of strings.

There is [documentation][1] and a [tutorial][2].

How does it work?
=================

The set stores arbitrary items by using a specified "key" function
to produce a string representation of set members suitable for N-gram indexing.
By default it simply calls str() on the objects.

The N-grams are obtained by splitting strings into overlapping substrings
of N characters (usually N=3), and the set maintains an association from
each distinct N-gram to the items that contain it.
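
The splitting and indexing steps can be illustrated with a short standalone
sketch. It is not the library's implementation: `char_ngrams`, `build_index`
and the sample `records` are hypothetical names chosen for illustration, and
the sketch ignores details such as any string padding the library may apply.

```python
from collections import defaultdict

def char_ngrams(text, n=3):
    """Return the overlapping substrings of length n in text, left to right."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_index(items, key=str, n=3):
    """Map each distinct N-gram to the set of items whose key string contains it."""
    index = defaultdict(set)
    for item in items:
        # key() turns an arbitrary (hashable) item into the string to index.
        for gram in char_ngrams(key(item), n):
            index[gram].add(item)
    return index

# Hypothetical records: (id, name) tuples indexed by their lowercased name.
records = [(0, "SPAM"), (1, "Spare"), (2, "ham")]
index = build_index(records, key=lambda record: record[1].lower())
```

Here `index["spa"]` holds both the "SPAM" and "Spare" records, because their
key strings share that trigram, while `index["ham"]` holds only the last record.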

To find items similar to a query string, it splits the query into N-grams,
collects all items sharing at least one N-gram with the query,
and ranks the items by score based on the ratio of shared to unshared
N-grams between strings.
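
The search steps above can be sketched in the same spirit. This is a simplified
illustration under stated assumptions, not the library's code: the `search`
function here is hypothetical, it works on plain strings, and it scores each
candidate with the Jaccard index (shared N-grams divided by all distinct
N-grams of the two strings) as a stand-in for the library's actual score.

```python
from collections import defaultdict

def char_ngrams(text, n=3):
    """Return the set of overlapping substrings of length n in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def search(items, query, n=3):
    """Return (item, score) pairs for items sharing an N-gram with query, best first."""
    # Index: N-gram -> set of items containing it.
    index = defaultdict(set)
    for item in items:
        for gram in char_ngrams(item, n):
            index[gram].add(item)

    # Collect every item that shares at least one N-gram with the query.
    query_grams = char_ngrams(query, n)
    candidates = set()
    for gram in query_grams:
        candidates |= index.get(gram, set())

    # Score candidates (assumed Jaccard index) and rank by descending score.
    results = []
    for item in candidates:
        item_grams = char_ngrams(item, n)
        score = len(query_grams & item_grams) / len(query_grams | item_grams)
        results.append((item, score))
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results

matches = search(["spam", "spare", "ham"], "spam")
```

In this sketch "ham" never appears in the results for the query "spam",
because the two strings have no trigram in common.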

History
=======

In 2007, Michel Albert (exhuma) wrote the python-ngram module based on Perl's
[String::Trigram][3] module by Tarek Ahmed, and committed the code for 2.0.0b2 to
a SourceForge Subversion repo.

Since late 2008, python-ngram development has been continued by Graham Poulter,
adding features, documentation, performance improvements and Python 3 support.
The repo was first moved to a Mercurial repo on Google Code, but primary
development now takes place here on GitHub.

[1]: http://packages.python.org/ngram/
[2]: http://packages.python.org/ngram/tutorial.html
[3]: http://search.cpan.org/dist/String-Trigram/

