<todo version="0.1.19">
    <title>
        Pyndexter, pronounced 'poindexter', a full-text indexing abstraction layer
    </title>
    <note priority="medium" time="1145722536">
        Add callbacks for index() and discard(); perhaps something similar for Source objects.
    </note>
    <note priority="medium" time="1145802778">
        Finish PyLucene adapter
    </note>
    <note priority="medium" time="1145854608" done="1146296772">
        Finish MetaSource
    </note>
    <note priority="medium" time="1146296806">
        Optimise the on-disk format for DefaultIndexer. Use URI/word "ids" rather than full words.
    </note>
    <note priority="medium" time="1146321654">
        It might need a MIME filter system for translating known content types to plain text for indexing, e.g. extracting just the content of HTML pages. This could get out of hand.
    </note>
    <note priority="medium" time="1146328561" done="1146368244">
        state() is being called, which in the naive implementation simply walks the entire source. Need some way around this. Should the state be accumulated somehow while the source is being walked?
    </note>
    <note priority="medium" time="1146331225" done="1146368238">
        HTTPSource should be able to handle multiple iterations, but self._traversed renders this impossible.
    </note>
</todo>
