  TODO for IMDbPY
  ===============

See the code, and search for XXX, FIXME and TODO.

NOTE: it's always time to clean the code! <g>


[general]
* Searching names and titles with "local" and "sql" data access systems
  is somewhat of a mess: the various functions should return/manage
  information in a more coherent way.
* Write better summary() methods for Movie and Person classes.
* Some portions of code are poorly commented.
* The documentation is written in my funny English.
* Compatibility with Python 2.2 and earlier versions is not assured
  for every data access system (the imdbpy2sql.py script certainly
  requires at least Python 2.3).
* I have a testsuite, but it's not ready to be made public;
  clean and extend it.


[searches]
* Support advanced query for movie titles/person names.


[Movie objects]
* Define invariable names for the sections (the keys you use to access
  info stored in a Movie object).
* Should the __contains__() methods descend into nested lists
  and dictionaries?
* For TV series the list of directors/writers returned by 'local'
  and 'sql' is a long list with every single episode listed in the
  'notes' attribute (i.e.: the same person is listed more than once,
  just with a different note).
  For 'http' and 'mobile' there's a list with one item for every
  person, with a long 'notes' listing every episode.
  It's not easy to split this information since it can contain
  notes ("written by", "as Aka Name", ...)
* The 'laserdisc' information for 'local' and 'sql' is probably
  wrong: I think they merge data from different laserdisc titles.
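
About the __contains__() question above: a minimal sketch of what
"descending" could mean, using a plain dictionary as a stand-in for the
data held by a Movie object.  The contains_nested() helper and the
sample data are hypothetical and are not part of IMDbPY; only values
are searched here, not dictionary keys.

```python
def contains_nested(container, item):
    """Return True if item is found anywhere inside container.

    Hypothetical helper: it descends into nested lists, tuples and
    dictionaries (dictionary values only, keys are ignored).
    """
    if isinstance(container, dict):
        values = container.values()
    elif isinstance(container, (list, tuple)):
        values = container
    else:
        # A leaf value: plain equality test.
        return container == item
    for value in values:
        if value == item or contains_nested(value, item):
            return True
    return False


# Toy data shaped like the information stored in a Movie object.
movie_data = {
    'title': 'Some Movie',
    'cast': ['Person A', 'Person B'],
    'akas': {'Italy': ['Un Film']},
}

found = contains_nested(movie_data, 'Un Film')
```

A recursive search like this is slower than a flat one and may match
values in unexpected places, which is part of why the question is open.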


[Person objects]
* Define invariable names for the sections (the keys you use to access
  info stored in a Person object).
* When fetching data from the web ('http' and 'mobile'), the filmography
  for a given person contains a list named "himself" or "herself" for
  movies/shows where they were not acting.
  In 'local' and 'sql', these movies/shows are listed in the
  "actor" or "actress" list.


[http data access system]
* There's a known bug using IMDbPY with Python 2.0 and 2.1: 
  it fails to retrieve web pages, raising the httplib.UnknownTransferEncoding
  exception, if a proxy is used; unset the HTTP_PROXY environment variable
  or call the imdbObject.set_proxy(None) method.
* If the access through the proxy fails, is it possible to
  automatically try without?  It doesn't seem easy...
* Some (many?) HTML parsers can be interrupted as soon as they've
  parsed all the needed information, as HTMLSearchMovieParser does.
* Access to the "my IMDb" functions for registered users would
  be really cool.
* Gather more movies' data: user comments, laserdisc details, trailers,
  posters, photo gallery, on tv, schedule links, showtimes, message boards.
* Gather more people's data: photo gallery.
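
For the proxy bug listed above, the first workaround (unsetting the
HTTP_PROXY environment variable) can be done from Python before any
page is fetched.  This is only a sketch: disable_env_proxy() is a
hypothetical helper, and also removing the lowercase http_proxy
variable is an assumption, since the item above names only HTTP_PROXY.

```python
import os


def disable_env_proxy(environ=os.environ):
    """Remove proxy settings from the given environment mapping.

    Returns the list of variable names that were actually removed.
    """
    removed = []
    # 'http_proxy' is an assumption; the TODO item names only HTTP_PROXY.
    for name in ('HTTP_PROXY', 'http_proxy'):
        if name in environ:
            del environ[name]
            removed.append(name)
    return removed


# Demonstration on a plain dictionary instead of the real environment.
fake_env = {'HTTP_PROXY': 'http://localhost:8080', 'PATH': '/bin'}
removed = disable_env_proxy(fake_env)

# The other workaround named above is a call on the IMDb object itself:
#     imdbObject.set_proxy(None)
```

Calling disable_env_proxy() with no arguments acts on os.environ, so it
affects only the current process and the processes it starts.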


[httpThin data access system]
* It should be made _much_ faster than 'http'.


[mobile data access system]
* General optimization.
* Make find() methods case insensitive.
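
A minimal sketch of a case insensitive find, assuming the find()
methods in question behave like the find() method of strings; the
find_nocase() name is hypothetical.

```python
def find_nocase(text, sub, start=0):
    """Case insensitive counterpart of the string find() method.

    Returns the index of the first match at or after start, or -1.
    Both strings are lowercased first, so the returned index is
    reliable only when lowercasing does not change the string length
    (always true for ASCII text such as HTML tags).
    """
    return text.lower().find(sub.lower(), start)


page = '<TITLE>ER</title>'
open_tag = find_nocase(page, '<title>')
close_tag = find_nocase(page, '</TITLE>')
```

Lowercasing the whole text at every call is wasteful on large pages;
lowercasing a copy once and searching that copy would be cheaper.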


[local data access system]
* There's probably a bug converting the rating to a float;
  see: ER (1994) (TV), but I suspect this to be a mkdb bug.
* The 'votes' key is not correctly stored for very high values;
  this is for sure a mkdb bug.
* Files like biographies.data are plainly wrong: sometimes
  a person is referred to as 'Name Surname' (qv) and sometimes
  as 'Surname, Name' (qv).
  Names and titles with "'" are not handled properly and so on...
  This is a problem of the Plain Text Data Files, and can't be fixed.
* The convBin() and getFullIndex() functions in the utils
  module _must_ be rewritten (they're crap!).
* Some files are not considered: mpaa-ratings-reasons.list,
  miscellaneous-companies.list (but they're used by the 'sql' data
  access system).
* Should methods like get_imdbMovieID() use the mobile access system?
* You need the mkdb executable from the moviedb program to generate the
  database/index files; moviedb is not open source software (although
  it can be downloaded and used without paying money).
  An open source (pure python?) script must be written.
  NOTE: since release 2.1 I've introduced the imdbpy2sql.py script,
  so a replacement for mkdb is no longer an urgent issue.
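
The mixed 'Name Surname' / 'Surname, Name' references mentioned above
can't be fixed in the data files, but a reader of those files could at
least bring both forms to one shape before comparing them.  The
to_surname_first() helper below is hypothetical and is only a
heuristic: it assumes the last word is the surname, which is wrong for
surnames made of more than one word.

```python
def to_surname_first(name):
    """Return name in 'Surname, Name' form (heuristic sketch).

    Names that already contain ', ' and single-word names are returned
    unchanged.  The last word is assumed to be the surname.
    """
    name = name.strip()
    if ', ' in name or ' ' not in name:
        return name
    first, surname = name.rsplit(' ', 1)
    return '%s, %s' % (surname, first)


same_person = (to_surname_first('Name Surname') ==
               to_surname_first('Surname, Name'))
```

Names containing "'" pass through unchanged, so this does nothing for
the quoting problem mentioned in the same item.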


[sql data access system]
NOTE NOTE NOTE: this is still beta code and I'm not a database guru;
moreover I'm short of time and so I will be happy to fix every bug
you'll find, but if you're about to write me an email like "hey,
the database access should be faster", "the imdbpy2sql.py script must
run with 64 MB of RAM and complete in 2 minutes" or "your database
layout sucks: I've an idea for a better structure...", well, consider
that _these_ kinds of email will probably be discarded immediately.

I _know_ these are important issues, but I've neither the time nor
the ability to fix these problems by myself, sorry.
Obviously if you want to contribute with patches, new code, new
SQL queries and new database structures you're welcome and I will
be very grateful for your work.

Again: if there's something that bothers you, write some code.
It's free software, after all.

Things to do:
* Support for other SQL databases (like PostgreSQL, SQLite, ...), and try
  to be database-independent.
* The imdbpy2sql.py script MUST be run on a database with empty tables;
  unfortunately so far a SQL installation can't be "updated" without
  recreating the database from scratch.
  IMDb also releases "diff" files to keep the plain text files updated;
  it would be wonderful to directly use these diff files to upgrade the
  SQL database, but I think this is a nearly impossible task.
* The database layout sucks (what about indexes?  I'm not sure they are
  ok this way).
* The SQL queries I've written are probably crap.
* Running the imdbpy2sql.py script will still issue some Warnings, from
  the MySQLdb module.  This is especially true for the readMovieList()
  function.
  NOTE: this is due to data truncation in the 'note' field, where the
  information is longer than 255 chars; see mysql-python bug #1295308:
  http://sf.net/tracker/index.php?func=detail&aid=1295308&group_id=22307&atid=374932
* The imdbpy2sql.py script is slow and takes a lot of time, but honestly
  this is a minor issue, and I fear there's little chance of seeing
  substantial improvements.
* Comment the imdbpy2sql.py script.
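
About the data truncation warnings listed above: one possible way to
avoid them is to cut the notes to the column width before they are
sent to the database.  This is a sketch of the idea, not what
imdbpy2sql.py does; split_note() and NOTE_MAX_LEN are hypothetical
names, and the limit is counted in characters here, which may not match
how the database counts it.

```python
NOTE_MAX_LEN = 255  # width of the 'note' field reported above


def split_note(note, max_len=NOTE_MAX_LEN):
    """Return a (stored, overflow) pair for the given note.

    'stored' always fits in max_len characters; 'overflow' holds the
    part that would otherwise be silently truncated by the database.
    """
    if note is None or len(note) <= max_len:
        return note, ''
    return note[:max_len], note[max_len:]


stored, overflow = split_note('a' * 300)
```

The stored data would be the same as today; the difference is that the
cut becomes explicit and the overflow could be logged or counted.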


