.TH %(program)s %(section)s "%(month)s %(day)s %(year)s" "version %(version)s" "USER COMMANDS"
.SH NAME
%(program)s \- Load NetLogger log files into a relational database
.SH SYNOPSIS
.B %(program)s [options]
.SH DESCRIPTION
Load NetLogger log files into a relational database, using a fixed general-purpose
schema. The supported databases are SQLite, PostgreSQL, MySQL, and a "test" database
that simply prints the SQL statements to the console.
.SH OPTIONS
%(options)s 
.SH USAGE
.SS Modes of operation
The program runs in two modes: 'standalone' and 'pipeline'. In standalone mode, 
you provide a list of files to load, and provide the connection URI, database
parameters, etc. via command-line options. In 'pipeline' mode, you are assumed to
be running the 
.B nl_parser
program to create a series of output files that the 
.B nl_loader
then loads into the database. The mode is selected
by whether the
.B -u
option (for standalone mode) or 
.B -c
option (for pipeline mode) is given. These two modes are documented separately below.
.SS Standalone mode
In this mode, the -u option is required. The input file can be read from standard input, given explicitly with
the -i option, or inferred from the restore-state file given with the -r option.
.PP
For the 
.B -p
option, please note that the name=value pairs used as parameters 
to the connection are not standard across
database modules. You will need to consult the documentation for the appropriate
database and/or its Python module. The modules used are as follows:
.TP
SQLite
.B sqlite3
for Python 2.5+,
.B pysqlite2.dbapi2
for Python 2.4 or lower.
.TP
PostgreSQL
.BR psycopg2 ,
or
.B pgdb
if that is not found.
.TP
MySQL
.BR MySQLdb .
.PP
For the 
.B -r/--restore
option, note that the file need not exist, in which case it will be created. Subsequent invocations with the
same file name will start where the last one left off. This makes it safe to load very large
files into the database, since you can interrupt with Control-C and resume (with the same command line)
as many times as needed.
.SS Pipeline mode
Pipeline mode uses a configuration file. 
The general syntax of the configuration file is an 'INI' variant
recognized by the Python ConfigObj module. See the ConfigObj documentation
for details. Basically, the format consists of
sections of keyword, value pairs. 
Keywords and values are separated by an '=', and section markers 
are between square brackets. 
Keywords, values, and section names can be surrounded by single or double
quotes. 
Nested subsections are indicated by increasing the number of square brackets in the
section marker, e.g., "[section]", "[[subsection]]", and "[[[sub-subsection]]]".
You can have list values by separating items with a comma, and values spanning multiple lines by using triple quotes (single or double).
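.PP
The syntax rules above can be illustrated with a short fragment. The section and keyword names here are hypothetical, chosen only to show the syntax; they are not keywords recognized by %(program)s.
.nf
.RS

[section]
keyword = value
"quoted keyword" = 'quoted value'
list_value = one, two, three
long_value = """spans
multiple lines"""
[[subsection]]
keyword = value
[[[sub-subsection]]]
keyword = value

.fi
.RE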
.PP
The top-level sections are: global, input, database, and logging.
.PP
The following keywords are recognized in the 
.B [global]
section:
.TP
.B state_file
Save state to the given file. This preserves the name and offset within the current numbered input file. 
The database connection (and related parameters) is not saved.
.br
Default value =
.I /tmp/netlogger_loader_state
.PP
The following keywords are recognized in the
.B [input]
section:
.TP
.B filename
Input filename or, for numbered files, base filename.
.br
Default value =
.I <required>
.TP
.B numbered_files
Whether files have a '.##' suffix used to determine their order.
This is the convention followed by the nl_parser.
.br
Default value =
.I False
.TP
.B delete_old_files
If true, delete files after loading them into the database.
Overrides move_files_dir and move_files_suffix keywords.
.br
Default value =
.I False
.TP
.B move_files_dir
If set, move files to the given directory after loading them.
Overrides move_files_suffix.
.br
Default value =
.I None
.TP
.B move_files_suffix
If set, rename files by appending the given suffix after loading them.
.br
Default value =
.I None
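.PP
As an illustration of how these keywords combine (the paths shown are hypothetical), an
.B [input]
section that reads numbered files and moves each one aside after loading might look like:
.nf
.RS

[input]
filename = "/var/tmp/parsed.log"
numbered_files = yes
move_files_dir = "/var/tmp/loaded"

.fi
.RE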
.PP
The following keywords are recognized in the 
.B [database]
section:
.TP
.B uri
Database connection URI, in the same format as the -u/--uri option.
.br
Default value =
.I <required>
.TP
.B batch
Load batch size, same as -b/--batch option.
.br
Default value =
.I 100
.TP
.B create
Create database on load, same as -C/--create option.
.br
Default value =
.I True
.TP
.B unique
Whether a 'UNIQUE' constraint should be enforced on all
events. This can add time to the load, but eliminates the 
problem of duplicate events.
.br
Default value =
.I True
.TP
.B schema_file
Absolute or relative path to an alternative schema configuration
file. This file describes the SQL statements to execute when creating
new tables and when loading is finished.
.br
Default value =
.I /path/to/netlogger/analysis/schema.conf 
.TP
.B schema_init
Type of schema initialization to use, encoded as a comma-separated list of
keys. Which keys are available depends on the database engine. By default,
SQLite has unique/nounique initialization and MySQL can combine the
keys index/noindex with unique/nounique. The
.I index
key means to use indexes on relevant columns. The
.I unique
key means to enforce the UNIQUE constraint.
.br
Default value =
.I MySQL: index,unique; SQLite: unique
.TP
.B schema_finalize
Type of schema finalization to use, encoded as a comma-separated list of
keys. The matched statements are executed right before nl_loader exits.
The intent is to allow one-time post-load actions, such as compression of
the database or indexing at the end of the load. 
Which keys are available depends on the database engine.
Currently,
.IR noop ,
with the obvious meaning of "do nothing", is the only available key for MySQL or SQLite.
.br
Default value =
.I MySQL, SQLite: noop
.TP
.B [[parameters]]
Subsection for database keyword = value parameters, like the -p/--param option.
.br
Default value =
.I None
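.PP
A sketch of a
.B [database]
section using the keywords above. The values are illustrative, and the names under
.B [[parameters]]
are assumptions: they are passed to the database module, so the accepted names depend on that module.
.nf
.RS

[database]
uri = "mysql://localhost"
batch = 500
create = yes
unique = yes
[[parameters]]
read_default_file = "~/.my.cnf"
db = nltest

.fi
.RE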
.PP
The
.B [logging]
section is the same as the one for 
.BR nl_parser .
.SH EXAMPLES
Example command-line invocations of the nl_loader program. The two modes of operation, 'standalone' and 'pipeline', have separate examples.
.SS Standalone mode
.TP
Connect to MySQL on localhost, using database 'nltest' as user joe with password 'foobar', and load in the data in "file.log":
.B %(program)s -u mysql://localhost -p user=joe -p passwd=foobar -p db=nltest -i file.log
.TP
Same as above, but use the MySQL configuration in ~/.my.cnf :
.B %(program)s -u mysql://localhost -p read_default_file="~/.my.cnf" -p db=nltest -i file.log
.TP
Connect to the MySQL server on remote.host:3344, use database 'nltest', use the file ~/.my.cnf to get user and password information, and read the log file from standard input:
.B <file.log %(program)s -u mysql://remote.host:3344 -p read_default_file="~/.my.cnf" -p db=nltest
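.TP
As a sketch of a resumable load (the file names "big.log" and "big.state" are hypothetical): load "big.log" and record progress in "big.state", so that rerunning the same command after an interruption resumes where the previous run stopped:
.B %(program)s -u mysql://localhost -p read_default_file="~/.my.cnf" -p db=nltest -i big.log -r big.state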
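.SS Pipeline mode
A sample configuration file that loads numbered files into a MySQL database on localhost, renaming each file with a ".DONE" suffix once it has been loaded: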
.nf
.RS

[global]
state_file = None
[input]
numbered_files = yes
move_files_suffix = ".DONE"
filename = "/tmp/parsed-data.log"
[database]
uri = "mysql://localhost"
[[parameters]]
database = pegasus

.fi
.RE
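.PP
Assuming the configuration above is saved as nl_loader.conf (a hypothetical file name), and assuming the
.B -c
option takes the path to the configuration file as its argument (see OPTIONS for the exact form), the loader might be started with:
.nf
.RS

%(program)s -c nl_loader.conf

.fi
.RE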
.SH EXIT STATUS
%(program)s returns zero on success, non-zero on failure.
.SH BUGS
The PostgreSQL support has not been tested recently, so bugs with that API are likely.
.SH AUTHOR
Dan Gunter (dkgunter (at) lbl.gov)
.SH SEE ALSO
http://dsd.lbl.gov/NetLoggerWiki
