[alpha_api]: javascript:alphaApi()
[alpha_implementation]: javascript:alphaImplementation()

Database Meta Data {@name=metadata}
==================

### Describing Databases with MetaData {@name=tables}    

The core of SQLAlchemy's query and object mapping operations is database metadata, which are Python objects that describe tables and other schema-level objects.  Metadata objects can be created by explicitly naming the various components and their properties, using the Table, Column, ForeignKey, Index, and Sequence objects imported from `sqlalchemy.schema`.  There is also support for *reflection*, which means you only specify the *name* of the entities and they are recreated from the database automatically.

A collection of metadata entities is stored in an object aptly named `MetaData`.  This object takes an optional `name` parameter:

    {python}
    from sqlalchemy import *
    
    metadata = MetaData(name='my metadata')

Then to construct a Table, use the `Table` class:

    {python}
    users = Table('users', metadata, 
        Column('user_id', Integer, primary_key = True),
        Column('user_name', String(16), nullable = False),
        Column('email_address', String(60), key='email'),
        Column('password', String(20), nullable = False)
    )
    
    user_prefs = Table('user_prefs', metadata, 
        Column('pref_id', Integer, primary_key=True),
        Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
        Column('pref_name', String(40), nullable=False),
        Column('pref_value', String(100))
    )

The specific datatypes for each Column, such as Integer, String, etc. are described in [types](rel:types), and exist within the module `sqlalchemy.types` as well as the global `sqlalchemy` namespace.

Foreign keys are most easily specified by the `ForeignKey` object within a `Column` object.  For a composite foreign key, i.e. a foreign key that contains multiple columns referencing multiple columns to a composite primary key, an explicit syntax is provided which allows the correct table CREATE statements to be generated:

    {python}
    # a table with a composite primary key
    invoices = Table('invoices', metadata, 
        Column('invoice_id', Integer, primary_key=True),
        Column('ref_num', Integer, primary_key=True),
        Column('description', String(60), nullable=False)
    )
    
    # a table with a composite foreign key referencing the parent table
    invoice_items = Table('invoice_items', metadata, 
        Column('item_id', Integer, primary_key=True),
        Column('item_name', String(60), nullable=False),
        Column('invoice_id', Integer, nullable=False),
        Column('ref_num', Integer, nullable=False),
        ForeignKeyConstraint(['invoice_id', 'ref_num'], ['invoices.invoice_id', 'invoices.ref_num'])
    )
    
Above, the `invoice_items` table will have `ForeignKey` objects automatically added to the `invoice_id` and `ref_num` `Column` objects as a result of the additional `ForeignKeyConstraint` object.

The `MetaData` object supports some handy methods, such as getting a list of Tables in the order (or reverse) of their dependency:

    {python}
    >>> for t in metadata.table_iterator(reverse=False):
    ...    print t.name
    users
    user_prefs
        
And `Table` provides an interface to the table's properties as well as that of its columns:
        
    {python}
    employees = Table('employees', metadata, 
        Column('employee_id', Integer, primary_key=True),
        Column('employee_name', String(60), nullable=False, key='name'),
        Column('employee_dept', Integer, ForeignKey("departments.department_id"))
    )
    
    # access the column "EMPLOYEE_ID":
    employees.columns.employee_id
    
    # or just
    employees.c.employee_id
    
    # via string
    employees.c['employee_id']
    
    # iterate through all columns
    for c in employees.c:
        # ...
        
    # get the table's primary key columns
    for primary_key in employees.primary_key:
        # ...
    
    # get the table's foreign key objects:
    for fkey in employees.foreign_keys:
        # ...
        
    # access the table's MetaData:
    employees.metadata
    
    # access the table's Engine, if its MetaData is bound:
    employees.engine
    
    # access a column's name, type, nullable, primary key, foreign key
    employees.c.employee_id.name
    employees.c.employee_id.type
    employees.c.employee_id.nullable
    employees.c.employee_id.primary_key
    employees.c.employee_dept.foreign_key
    
    # get the "key" of a column, which defaults to its name, but can 
    # be any user-defined string:
    employees.c.name.key
    
    # access a column's table:
    employees.c.employee_id.table is employees
    >>> True
    
    # get the table related by a foreign key
    fcolumn = employees.c.employee_dept.foreign_key.column.table

#### Binding MetaData to an Engine {@name=binding}

A MetaData object can be associated with one or more Engine instances.  This allows the MetaData and the elements within it to perform operations automatically, using the connection resources of that Engine.  This includes being able to "reflect" the columns of tables, as well as to perform create and drop operations without needing to pass an `Engine` or `Connection` around.  It also allows SQL constructs to be created which know how to execute themselves (called "implicit execution").

To bind `MetaData` to a single `Engine`, use `BoundMetaData`:

    {python}
    engine = create_engine('sqlite://', **kwargs)
    
    # create BoundMetaData from an Engine
    meta = BoundMetaData(engine)
    
    # create the Engine and MetaData in one step
    meta = BoundMetaData('postgres://db/', **kwargs)
    
Another form of `MetaData` exists which allows connecting to any number of engines, within the context of the current thread.  This is `DynamicMetaData`:

    {python}
    meta = DynamicMetaData()
    
    meta.connect(engine)    # connect to an existing Engine
    
    meta.connect('mysql://user@host/dsn')   # create a new Engine and connect

`DynamicMetaData` is ideal for applications that need to use the same set of `Tables` for many different database connections in the same process, such as a CherryPy web application which handles multiple application instances in one process.

#### Using the global Metadata object

Some users prefer to create `Table` objects without specifying a `MetaData` object, having Tables scoped on an application-wide basis.  For them the `default_metadata` object and the `global_connect()` function is supplied.  `default_metadata` is simply an instance of `DynamicMetaData` that exists within the `sqlalchemy` namespace, and `global_connect()` is a synonym for `default_metadata.connect()`.  Defining a `Table` that has no `MetaData` argument will automatically use this default metadata as follows:

    {python}
    from sqlalchemy import *

    # a Table with just a name and its Columns
    mytable = Table('mytable', 
        Column('col1', Integer, primary_key=True),
        Column('col2', String(40))
        )

    # connect all the "anonymous" tables to a postgres uri in the current thread    
    global_connect('postgres://foo:bar@lala/test')

    # create all tables in the default metadata
    default_metadata.create_all()

    # the table is bound
    mytable.insert().execute(col1=5, col2='some value')
    
#### Reflecting Tables

Once you have a `BoundMetaData` or a connected `DynamicMetaData`, you can create `Table` objects without specifying their columns, just their names, using `autoload=True`:

    {python}
    >>> messages = Table('messages', meta, autoload = True)
    >>> [c.name for c in messages.columns]
    ['message_id', 'message_name', 'date']

At the moment the Table is constructed, it will query the database for the columns and constraints of the `messages` table.

Note that if a reflected table has a foreign key referencing another table, then the metadata for the related table will be loaded as well, even if it has not been defined by the application:              
        
    {python}
    >>> shopping_cart_items = Table('shopping_cart_items', meta, autoload = True)
    >>> print shopping_cart_items.c.cart_id.table.name
    shopping_carts
        
To get direct access to 'shopping_carts', simply instantiate it via the Table constructor.  You'll get the same instance of the shopping cart Table as the one that is attached to shopping_cart_items:

    {python}
    >>> shopping_carts = Table('shopping_carts', meta)
    >>> shopping_carts is shopping_cart_items.c.cart_id.table
    True
        
This works because when the Table constructor is called for a particular name and `MetaData` object, if the table has already been created then the instance returned will be the same as the original.  This is a <b>singleton</b> constructor:

    {python}
    >>> news_articles = Table('news', meta, 
    ... Column('article_id', Integer, primary_key = True),
    ... Column('url', String(250), nullable = False)
    ... )
    >>> othertable = Table('news', meta)
    >>> othertable is news_articles
    True

##### Overriding Reflected Columns {@name=overriding}

Individual columns can be overridden with explicit values when reflecting tables; this is handy for specifying custom datatypes, constraints such as primary keys that may not be configured within the database, etc.

    {python}
    >>> mytable = Table('mytable', meta,
    ... Column('id', Integer, primary_key=True),   # override reflected 'id' to have primary key
    ... Column('mydata', Unicode(50)),    # override reflected 'mydata' to be Unicode
    ... autoload=True)
    
#### Specifying the Schema Name {@name=schema}

Some databases support the concept of multiple schemas.  A `Table` can reference this by specifying the `schema` keyword argument:

    {python}
    financial_info = Table('financial_info', meta,
        Column('id', Integer, primary_key=True),
        Column('value', String(100), nullable=False),
        schema='remote_banks'
    )

Within the `MetaData` collection, this table will be identified by the combination of `financial_info` and `remote_banks`.  If another table called `financial_info` is referenced without the `remote_banks` schema, it will refer to a different `Table`.  `ForeignKey` objects can reference columns in this table using the form `remote_banks.financial_info.id`.

#### ON UPDATE and ON DELETE {@name=onupdate}

`ON UPDATE` and `ON DELETE` clauses to a table create are specified within the `ForeignKeyConstraint` object, using the `onupdate` and `ondelete` keyword arguments:

    {python}
    foobar = Table('foobar', meta,
        Column('id', Integer, primary_key=True),
        Column('lala', String(40)),
        ForeignKeyConstraint(['lala'],['hoho.lala'], onupdate="CASCADE", ondelete="CASCADE"))

Note that these clauses are not supported on SQLite, and require `InnoDB` tables when used with MySQL.  They may also not be supported on other databases.

#### Enabling Table / Column Quoting {@name=quoting}

Feature Status: [Alpha Implementation][alpha_implementation] 

Many table, schema, or column names require quoting to be enabled.  Reasons for this include names that are the same as a database reserved word, or for identifiers that use MixedCase, where the database would normally "fold" the case convention into lower or uppercase (such as Postgres).  SQLAlchemy will attempt to automatically determine when quoting should be used.  It will determine a value for every identifier name called `case_sensitive`, which defaults to `False` if the identifer name uses no uppercase letters, or `True` otherwise.  This flag may be explicitly set on any schema item as well (schema items include `Table`, `Column`, `MetaData`, `Sequence`, etc.) to override this default setting, where objects will inherit the setting from an enclosing object if not explicitly overridden.

When `case_sensitive` is `True`, the dialect will do what it has to in order for the database to recognize the casing.  For Postgres and Oracle, this means using quoted identifiers.

Identifiers that match known SQL reserved words (such as "asc", "union", etc.) will also be quoted according to the dialect's quoting convention regardless of the `case_sensitive` setting.

To force quoting for an identifier, set the "quote=True" flag on `Column` or `Table`, as well as the `quote_schema=True` flag for `Table`. 

    {python}
    table2 = Table('WorstCase2', metadata,
        # desc is a reserved word, which will be quoted.
        Column('desc', Integer, primary_key=True),

        # if using a reserved word which SQLAlchemy doesn't know about,
        # specify quote=True
        Column('some_reserved_word', Integer, quote=True, primary_key=True),

        # MixedCase uses a mixed case convention. 
        # it will be automatically quoted since it is case sensitive
        Column('MixedCase', Integer),

        # Union is both a reserved word and mixed case
        Column('Union', Integer),

        # normal_column doesnt require quoting
        Column('normal_column', String(30)))

    # to use tables where case_sensitive is False by default regardless
    # of idenfifier casings, set "case_sensitive" to false at any level
    # (or true to force case sensitive for lowercase identifiers as well)
    lowercase_metadata = MetaData(case_sensitive=False)

#### Other Options {@name=options}

`Tables` may support database-specific options, such as MySQL's `engine` option that can specify "MyISAM", "InnoDB", and other backends for the table:

    {python}
    addresses = Table('engine_email_addresses', meta,
        Column('address_id', Integer, primary_key = True),
        Column('remote_user_id', Integer, ForeignKey(users.c.user_id)),
        Column('email_address', String(20)),
        mysql_engine='InnoDB'
    )
    
### Creating and Dropping Database Tables {@name=creating}    

Creating and dropping individual tables can be done via the `create()` and `drop()` methods of `Table`; these methods take an optional `connectable` parameter which references an `Engine` or a `Connection`.  If not supplied, the `Engine` bound to the `MetaData` will be used, else an error is raised:

    {python}
    meta = BoundMetaData('sqlite:///:memory:')
    employees = Table('employees', meta, 
        Column('employee_id', Integer, primary_key=True),
        Column('employee_name', String(60), nullable=False, key='name'),
        Column('employee_dept', Integer, ForeignKey("departments.department_id"))
    )
    {sql}employees.create()
    CREATE TABLE employees(
    employee_id SERIAL NOT NULL PRIMARY KEY,
    employee_name VARCHAR(60) NOT NULL,
    employee_dept INTEGER REFERENCES departments(department_id)
    )
    {}            

`drop()` method:
    
    {python}
    {sql}employees.drop(connectable=e)
    DROP TABLE employees
    {}            

The `create()` and `drop()` methods also support an optional keyword argument `checkfirst` which will issue the database's appropriate pragma statements to check if the table exists before creating or dropping:

    {python}
    employees.create(connectable=e, checkfirst=True)
    employees.drop(checkfirst=False)
    
Entire groups of Tables can be created and dropped directly from the `MetaData` object with `create_all()` and `drop_all()`.  These methods always check for the existence of each table before creating or dropping.  Each method takes an optional `connectable` keyword argument which can reference an `Engine` or a `Connection`.  If no engine is specified, the underlying bound `Engine`,  if any, is used:

    {python}
    engine = create_engine('sqlite:///:memory:')
    
    metadata = MetaData()
    
    users = Table('users', metadata, 
        Column('user_id', Integer, primary_key = True),
        Column('user_name', String(16), nullable = False),
        Column('email_address', String(60), key='email'),
        Column('password', String(20), nullable = False)
    )
    
    user_prefs = Table('user_prefs', metadata, 
        Column('pref_id', Integer, primary_key=True),
        Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
        Column('pref_name', String(40), nullable=False),
        Column('pref_value', String(100))
    )
    
    {sql}metadata.create_all(connectable=engine)
    PRAGMA table_info(users){}
    CREATE TABLE users(
            user_id INTEGER NOT NULL PRIMARY KEY, 
            user_name VARCHAR(16) NOT NULL, 
            email_address VARCHAR(60), 
            password VARCHAR(20) NOT NULL
    )
    PRAGMA table_info(user_prefs){}
    CREATE TABLE user_prefs(
            pref_id INTEGER NOT NULL PRIMARY KEY, 
            user_id INTEGER NOT NULL REFERENCES users(user_id), 
            pref_name VARCHAR(40) NOT NULL, 
            pref_value VARCHAR(100)
    )

### Column Defaults and OnUpdates {@name=defaults}    

SQLAlchemy includes flexible constructs in which to create default values for columns upon the insertion of rows, as well as upon update.  These defaults can take several forms: a constant, a Python callable to be pre-executed before the SQL is executed, a SQL expression or function to be pre-executed before the SQL is executed, a pre-executed Sequence (for databases that support sequences), or a "passive" default, which is a default function triggered by the database itself upon insert, the value of which can then be post-fetched by the engine, provided the row provides a primary key in which to call upon.

#### Pre-Executed Insert Defaults {@name=oninsert}

A basic default is most easily specified by the "default" keyword argument to Column.  This defines a value, function, or SQL expression that will be pre-executed to produce the new value, before the row is inserted:

    {python}
    # a function to create primary key ids
    i = 0
    def mydefault():
        global i
        i += 1
        return i

    t = Table("mytable", meta, 
        # function-based default
        Column('id', Integer, primary_key=True, default=mydefault),
    
        # a scalar default
        Column('key', String(10), default="default")
    )
            
The "default" keyword can also take SQL expressions, including select statements or direct function calls:

    {python}
    t = Table("mytable", meta, 
        Column('id', Integer, primary_key=True),
    
        # define 'create_date' to default to now()
        Column('create_date', DateTime, default=func.now()),
    
        # define 'key' to pull its default from the 'keyvalues' table
        Column('key', String(20), default=keyvalues.select(keyvalues.c.type='type1', limit=1))
        )
            
The "default" keyword argument is shorthand for using a ColumnDefault object in a column definition.  This syntax is optional, but is required for other types of defaults, futher described below:

    {python}
    Column('mycolumn', String(30), ColumnDefault(func.get_data()))

#### Pre-Executed OnUpdate Defaults {@name=onupdate}

Similar to an on-insert default is an on-update default, which is most easily specified by the "onupdate" keyword to Column, which also can be a constant, plain Python function or SQL expression:

    {python}
    t = Table("mytable", meta, 
        Column('id', Integer, primary_key=True),
        
        # define 'last_updated' to be populated with current_timestamp (the ANSI-SQL version of now())
        Column('last_updated', DateTime, onupdate=func.current_timestamp()),
    )
    

To use an explicit ColumnDefault object to specify an on-update, use the "for_update" keyword argument:

    {python}
    Column('mycolumn', String(30), ColumnDefault(func.get_data(), for_update=True))
        
#### Inline Default Execution: PassiveDefault {@name=passive}    

A PassiveDefault indicates an column default that is executed upon INSERT by the database.  This construct is used to specify a SQL function that will be specified as "DEFAULT" when creating tables.

    {python}
    t = Table('test', meta, 
        Column('mycolumn', DateTime, PassiveDefault(text("sysdate")))
    )
        
A create call for the above table will produce:

    {code}
    CREATE TABLE test (
        mycolumn datetime default sysdate
    )
        
PassiveDefault also sends a message to the `Engine` that data is available after an insert.  The object-relational mapper system uses this information to post-fetch rows after the insert, so that instances can be refreshed with the new data.  Below is a simplified version:

    {python}
    # table with passive defaults
    mytable = Table('mytable', engine, 
        Column('my_id', Integer, primary_key=True),

        # an on-insert database-side default
        Column('data1', Integer, PassiveDefault(text("d1_func()"))),
    )
    # insert a row
    r = mytable.insert().execute(name='fred')

    # check the result: were there defaults fired off on that row ?
    if r.lastrow_has_defaults():
        # postfetch the row based on primary key.
        # this only works for a table with primary key columns defined
        primary_key = r.last_inserted_ids()
        row = table.select(table.c.id == primary_key[0])
        
When Tables are reflected from the database using `autoload=True`, any DEFAULT values set on the columns will be reflected in the Table object as PassiveDefault instances.

##### The Catch: Postgres Primary Key Defaults always Pre-Execute {@name=postgres}

Current Postgres support does not rely upon OID's to determine the identity of a row.  This is because the usage of OIDs has been deprecated with Postgres and they are disabled by default for table creates as of PG version 8.  Pyscopg2's "cursor.lastrowid" function only returns OIDs.  Therefore, when inserting a new row which has passive defaults set on the primary key columns, the default function is <b>still pre-executed</b> since SQLAlchemy would otherwise have no way of retrieving the row just inserted.

#### Defining Sequences {@name=sequences}    

A table with a sequence looks like:

    {python}
    table = Table("cartitems", meta, 
        Column("cart_id", Integer, Sequence('cart_id_seq'), primary_key=True),
        Column("description", String(40)),
        Column("createdate", DateTime())
    )

The Sequence is used with Postgres or Oracle to indicate the name of a database sequence that will be used to create default values for a column.  When a table with a Sequence on a column is created in the database by SQLAlchemy, the database sequence object is also created.   Similarly, the database sequence is dropped when the table is dropped.  Sequences are typically used with primary key columns.  When using Postgres, if an integer primary key column defines no explicit Sequence or other default method, SQLAlchemy will create the column with the SERIAL keyword, and will pre-execute a sequence named "tablename_columnname_seq" in order to retrieve new primary key values, if they were not otherwise explicitly stated.   Oracle, which has no "auto-increment" keyword, requires that a Sequence be created for a table if automatic primary key generation is desired.

A Sequence object can be defined on a Table that is then used for a non-sequence-supporting database.  In that case, the Sequence object is simply ignored.  Note that a Sequence object is **entirely optional for all databases except Oracle**, as other databases offer options for auto-creating primary key values, such as AUTOINCREMENT, SERIAL, etc.  SQLAlchemy will use these default methods for creating primary key values if no Sequence is present on the table metadata.

A sequence can also be specified with `optional=True` which indicates the Sequence should only be used on a database that requires an explicit sequence, and not those that supply some other method of providing integer values.  At the moment, it essentially means "use this sequence only with Oracle and not Postgres".

### Defining Constraints and Indexes {@name=constraints}

#### UNIQUE Constraint

Unique constraints can be created anonymously on a single column using the `unique` keyword on `Column`.  Explicitly named unique constraints and/or those with multiple columns are created via the `UniqueConstraint` table-level construct.

    {python}
    meta = MetaData()
    mytable = Table('mytable', meta,
    
        # per-column anonymous unique constraint
        Column('col1', Integer, unique=True),
        
        Column('col2', Integer),
        Column('col3', Integer),
        
        # explicit/composite unique constraint.  'name' is optional.
        UniqueConstraint('col2', 'col3', name='uix_1')
        )

#### CHECK Constraint

Check constraints can be named or unnamed and can be created at the Column or Table level, using the `CheckConstraint` construct.  The text of the check constraint is passed directly through to the database, so there is limited "database independent" behavior.  Column level check constraints generally should only refer to the column to which they are placed, while table level constraints can refer to any columns in the table.

Note that some databases do not actively support check constraints such as MySQL and sqlite.

    {python}
    meta = MetaData()
    mytable = Table('mytable', meta,
    
        # per-column CHECK constraint
        Column('col1', Integer, CheckConstraint('col1&gt;5')),
        
        Column('col2', Integer),
        Column('col3', Integer),
        
        # table level CHECK constraint.  'name' is optional.
        CheckConstraint('col2 &gt; col3 + 5', name='check1')
        )
    
#### Indexes

Indexes can be created anonymously (using an auto-generated name "ix_&lt;column label&gt;") for a single column using the inline `index` keyword on `Column`, which also modifies the usage of `unique` to apply the uniqueness to the index itself, instead of adding a separate UNIQUE constraint.  For indexes with specific names or which encompass more than one column, use the `Index` construct, which requires a name.  

Note that the `Index` construct is created **externally** to the table which it corresponds, using `Column` objects and not strings.

    {python}
    meta = MetaData()
    mytable = Table('mytable', meta,
        # an indexed column, with index "ix_mytable_col1"
        Column('col1', Integer, index=True),

        # a uniquely indexed column with index "ix_mytable_col2"
        Column('col2', Integer, index=True, unique=True),

        Column('col3', Integer),
        Column('col4', Integer),

        Column('col5', Integer),
        Column('col6', Integer),
        )

    # place an index on col3, col4
    Index('idx_col34', mytable.c.col3, mytable.c.col4)

    # place a unique index on col5, col6
    Index('myindex', mytable.c.col5, mytable.c.col6, unique=True)

The `Index` objects will be created along with the CREATE statements for the table itself.  An index can also be created on its own independently of the table:

    {python}
    # create a table
    sometable.create()

    # define an index
    i = Index('someindex', sometable.c.col5)

    # create the index, will use the table's connectable, or specify the connectable keyword argument
    i.create()

### Adapting Tables to Alternate Metadata {@name=adapting}

A `Table` object created against a specific `MetaData` object can be re-created against a new MetaData using the `tometadata` method:

    {python}
    # create two metadata
    meta1 = BoundMetaData('sqlite:///querytest.db')
    meta2 = MetaData()
                        
    # load 'users' from the sqlite engine
    users_table = Table('users', meta1, autoload=True)
    
    # create the same Table object for the plain metadata
    users_table_2 = users_table.tometadata(meta2)
    
    
