======================
Historical Connections
======================

Usage
=====

A database can be opened with a read-only, historical connection when given
a specific transaction or datetime.  This can enable full-context application
level conflict resolution, historical exploration and preparation for reverts,
or even the use of a historical database revision as "production" while
development continues on a "development" head.

A database can be opened historically ``at`` or ``before`` a given transaction
serial or datetime. Here's a simple example. It should work with any storage
that supports ``loadBefore``. 

We'll begin our example with a fairly standard set up.  We

- make a storage and a database;
- open a normal connection;
- modify the database through the connection;
- commit a transaction, remembering the time in UTC;
- modify the database again; and
- commit a transaction.

    >>> import ZODB.MappingStorage
    >>> db = ZODB.MappingStorage.DB()
    >>> conn = db.open()

    >>> import persistent.mapping
    
    >>> conn.root()['first'] = persistent.mapping.PersistentMapping(count=0)
    
    >>> import transaction
    >>> transaction.commit()

We wait for some time to pass, record he time, and then make some other changes.
    
    >>> import time
    >>> time.sleep(.01)

    >>> import datetime
    >>> now = datetime.datetime.utcnow()
    >>> time.sleep(.01)
    
    >>> root = conn.root()
    >>> root['second'] = persistent.mapping.PersistentMapping()
    >>> root['first']['count'] += 1
    
    >>> transaction.commit()

Now we will show a historical connection. We'll open one using the ``now``
value we generated above, and then demonstrate that the state of the original
connection, at the mutable head of the database, is different than the
historical state.

    >>> transaction1 = transaction.TransactionManager()
    
    >>> historical_conn = db.open(transaction_manager=transaction1, at=now)
    
    >>> sorted(conn.root().keys())
    ['first', 'second']
    >>> conn.root()['first']['count']
    1
    
    >>> historical_conn.root().keys()
    ['first']
    >>> historical_conn.root()['first']['count']
    0

Moreover, the historical connection cannot commit changes.

    >>> historical_conn.root()['first']['count'] += 1
    >>> historical_conn.root()['first']['count']
    1
    >>> transaction1.commit()
    Traceback (most recent call last):
    ...
    ReadOnlyHistoryError
    >>> transaction1.abort()
    >>> historical_conn.root()['first']['count']
    0

(It is because of the mutable behavior outside of transactional semantics that
we must have a separate connection, and associated object cache, per thread,
even though the semantics should be readonly.)

As demonstrated, a timezone-naive datetime will be interpreted as UTC.  You
can also pass a timezone-aware datetime or a serial (transaction id).
Here's opening with a serial--the serial of the root at the time of the first
commit.

    >>> historical_serial = historical_conn.root()._p_serial
    >>> historical_conn.close()
    
    >>> historical_conn = db.open(transaction_manager=transaction1,
    ...                           at=historical_serial)
    >>> historical_conn.root().keys()
    ['first']
    >>> historical_conn.root()['first']['count']
    0
    >>> historical_conn.close()

We've shown the ``at`` argument. You can also ask to look ``before`` a datetime
or serial. (It's an error to pass both [#not_both]_) In this example, we're
looking at the database immediately prior to the most recent change to the
root.

    >>> serial = conn.root()._p_serial
    >>> historical_conn = db.open(
    ...     transaction_manager=transaction1, before=serial)
    >>> historical_conn.root().keys()
    ['first']
    >>> historical_conn.root()['first']['count']
    0

In fact, ``at`` arguments are translated into ``before`` values because the
underlying mechanism is a storage's loadBefore method.  When you look at a
connection's ``before`` attribute, it is normalized into a ``before`` serial,
no matter what you pass into ``db.open``.

    >>> print conn.before
    None
    >>> historical_conn.before == serial
    True

    >>> conn.close()

Configuration
=============

Like normal connections, the database lets you set how many total historical
connections can be active without generating a warning, and
how many objects should be kept in each historical connection's object cache.

    >>> db.getHistoricalPoolSize()
    3
    >>> db.setHistoricalPoolSize(4)
    >>> db.getHistoricalPoolSize()
    4

    >>> db.getHistoricalCacheSize()
    1000
    >>> db.setHistoricalCacheSize(2000)
    >>> db.getHistoricalCacheSize()
    2000

In addition, you can specify the minimum number of seconds that an unused
historical connection should be kept.

    >>> db.getHistoricalTimeout()
    300
    >>> db.setHistoricalTimeout(400)
    >>> db.getHistoricalTimeout()
    400

All three of these values can be specified in a ZConfig file. 

    >>> import ZODB.config
    >>> db2 = ZODB.config.databaseFromString('''
    ...     <zodb>
    ...       <mappingstorage/>
    ...       historical-pool-size 3
    ...       historical-cache-size 1500
    ...       historical-timeout 6m
    ...     </zodb>
    ... ''')
    >>> db2.getHistoricalPoolSize()
    3
    >>> db2.getHistoricalCacheSize()
    1500
    >>> db2.getHistoricalTimeout()
    360


The pool lets us reuse connections.  To see this, we'll open some
connections, close them, and then open them again:

    >>> conns1 = [db2.open(before=serial) for i in range(4)]
    >>> _ = [c.close() for c in conns1]
    >>> conns2 = [db2.open(before=serial) for i in range(4)]

Now let's look at what we got.  The first connection in conns 2 is the
last connection in conns1, because it was the last connection closed.

    >>> conns2[0] is conns1[-1]
    True

Also for the next two:

    >>> (conns2[1] is conns1[-2]), (conns2[2] is conns1[-3])
    (True, True)

But not for the last:

    >>> conns2[3] is conns1[-4]
    False

Because the pool size was set to 3.

Connections are also discarded if they haven't been used in a while.
To see this, let's close two of the connections:

    >>> conns2[0].close(); conns2[1].close()

We'l also set the historical timeout to be very low:

    >>> db2.setHistoricalTimeout(.01)
    >>> time.sleep(.1)
    >>> conns2[2].close(); conns2[3].close()

Now, when we open 4 connections:

    >>> conns1 = [db2.open(before=serial) for i in range(4)]

We'll see that only the last 2 connections from conn2 are in the
result:

    >>> [c in conns1 for c in conns2]
    [False, False, True, True]


If you change the historical cache size, that changes the size of the
persistent cache on our connection.

    >>> historical_conn._cache.cache_size
    2000
    >>> db.setHistoricalCacheSize(1500)
    >>> historical_conn._cache.cache_size
    1500

Invalidations
=============

Invalidations are ignored for historical connections. This is another white box
test.

    >>> historical_conn = db.open(
    ...     transaction_manager=transaction1, at=serial)
    >>> conn = db.open()
    >>> sorted(conn.root().keys())
    ['first', 'second']
    >>> conn.root()['first']['count']
    1
    >>> sorted(historical_conn.root().keys())
    ['first', 'second']
    >>> historical_conn.root()['first']['count']
    1
    >>> conn.root()['first']['count'] += 1
    >>> conn.root()['third'] = persistent.mapping.PersistentMapping()
    >>> transaction.commit()
    >>> len(historical_conn._invalidated)
    0
    >>> historical_conn.close()

Note that if you try to open an historical connection to a time in the future,
you will get an error.

    >>> historical_conn = db.open(
    ...     at=datetime.datetime.utcnow()+datetime.timedelta(1))
    Traceback (most recent call last):
    ...
    ValueError: cannot open an historical connection in the future.

Warnings
========

First, if you use datetimes to get a historical connection, be aware that the
conversion from datetime to transaction id has some pitfalls. Generally, the
transaction ids in the database are only as time-accurate as the system clock
was when the transaction id was created. Moreover, leap seconds are handled
somewhat naively in the ZODB (largely because they are handled naively in Unix/
POSIX time) so any minute that contains a leap second may contain serials that
are a bit off. This is not generally a problem for the ZODB, because serials
are guaranteed to increase, but it does highlight the fact that serials are not
guaranteed to be accurately connected to time. Generally, they are about as
reliable as time.time.

Second, historical connections currently introduce potentially wide variance in
memory requirements for the applications. Since you can open up many
connections to different serials, and each gets their own pool, you may collect
quite a few connections. For now, at least, if you use this feature you need to
be particularly careful of your memory usage. Get rid of pools when you know
you can, and reuse the exact same values for ``at`` or ``before`` when
possible. If historical connections are used for conflict resolution, these
connections will probably be temporary--not saved in a pool--so that the extra
memory usage would also be brief and unlikely to overlap.

.. ......... ..
.. Footnotes ..
.. ......... ..

.. [#not_both] It is an error to try and pass both `at` and `before`.

    >>> historical_conn = db.open(
    ...     transaction_manager=transaction1, at=now, before=historical_serial)
    Traceback (most recent call last):
    ...
    ValueError: can only pass zero or one of `at` and `before`
