

Overview
--------

The keas.pbpersist package provides a way to store ZODB objects using
Google Protocol Buffer encoding rather than pickling.  It requires the
keas.pbstate package.  The Protocol Buffer format is more limited than
pickles, but it is also simpler, better documented, and not dependent
on any particular programming language.

Only Persistent objects that use the ProtobufState metaclass
(from the keas.pbstate.meta module) are eligible
for this kind of serialization, so applications that want to take advantage
of keas.pbpersist need to be refactored at the model level.  However,
applications can be refactored gradually, since protobuf
encoded objects and pickled objects can coexist and hold references
to each other within a single database.

This package is expected to be fully compatible with most ZODB
storages, including FileStorage, ZEO, RelStorage, and any other storage
that treats object records as opaque binary streams.

Note that this package requires a patch to ZODB; see the "-polling-serial"
patched version at the following site:

    http://packages.willowrise.org


How to Use This Package
-----------------------

The documentation below is in doctest format, meaning that it serves
as both documentation and tests.  You can try out the code in a Python
interpreter session.

Register the protobuf serializer with ZODB.  This only needs to happen
once per application run, but it does no harm if it happens more than once.

    >>> from keas.pbpersist.pbformat import register
    >>> register()
    >>> register()

PContact is a simple Persistent class that stores its state in a protobuf
message.  See the source code; it's quite simple.  For this test, create
an instance an fill out the required field.

    >>> from keas.pbpersist.tests import PContact
    >>> bob = PContact()
    >>> bob.name = u'Bob'

Set up an in-memory object database and put the PContact in it.  Of course,
if you already have a database, you can instead attach the object to some
object in your own database.

    >>> from ZODB.DemoStorage import DemoStorage
    >>> from ZODB.DB import DB
    >>> import transaction
    >>> storage = DemoStorage()
    >>> db = DB(storage, database_name='main')
    >>> conn1 = db.open()
    >>> conn1.root()['bob'] = bob
    >>> transaction.commit()

Let's take a peek at what got stored in the database.  Look Ma, no pickles!
The {protobuf} prefix tells ZODB to use code in the keas.pbpersist package
to deserialize this object.

    >>> data, serial = storage.load(bob._p_oid, '')
    >>> data
    '{protobuf}\n \n\x14keas.pbpersist.tests\x12\x08PContact\x12\x07\x08\x01\x12\x03Bob'

Here is a demonstration of how to crack open that record using nothing but
language independent operations.  The ObjectRecord message holds the object's
class, state, and references; the state field holds a class-specific
protobuf message.

    >>> from keas.pbpersist.persistent_pb2 import ObjectRecord
    >>> rec = ObjectRecord()
    >>> rec.MergeFromString(data[10:])  # skip the {protobuf} prefix
    43
    >>> rec.class_meta.module_name
    u'keas.pbpersist.tests'
    >>> rec.class_meta.class_name
    u'PContact'
    >>> len(rec.references)
    0
    >>> pbuf = PContact.protobuf_type()
    >>> pbuf.MergeFromString(rec.state)
    7
    >>> pbuf.name
    u'Bob'

You could easily emulate that code in C++ or Java, the other two languages
supported by Protocol Buffers at this time.  However, that was a
noisy, complex operation.  ZODB makes reading objects dramatically simpler.

    >>> conn2 = db.open()
    >>> bob2 = conn2.root()['bob']
    >>> bob2.name
    u'Bob'
    >>> conn2.close()


References
----------

Create another PContact and assign that contact to be a guardian for Bob.
The add() method below is part of Google's protobuf API.  The
protobuf_refs attribute is provided by the ProtobufState metaclass
in keas.pbstate.

    >>> alice = PContact()
    >>> alice.name = u'Alice'
    >>> ref = bob.guardians.add()
    >>> bob.protobuf_refs.set(ref, alice)
    >>> transaction.commit()

[Ed: I sure don't like the way we manage references right now.  I'd much
rather see "bob.guardians.append(alice)", but that is not yet possible.]

Now let's see what happens if we get Alice through Bob in a different
ZODB connection.

    >>> conn2 = db.open()
    >>> bob2 = conn2.root()['bob']
    >>> alice2 = bob2.protobuf_refs.get(bob2.guardians[0])
    >>> alice2.name
    u'Alice'

Note that alice2 is a duplicate of alice, not the same object, because
the two come from different ZODB connections.

    >>> alice is alice2
    False

We also have the option of referring to objects by OID, but that
is rarely necessary.

    >>> conn3 = db.open()
    >>> bob3 = conn3[bob._p_oid]

ZODB loads all objects lazily, including objects serialized using
this package.  When a persistent object's _p_changed
attribute is None, the object is in the "ghost" state; it does not
yet contain data.  Accessing an attribute transparently loads the object
state.

    >>> bob3._p_changed is None
    True
    >>> bob3.name
    u'Bob'
    >>> bob3._p_changed
    False

Clean up the extra connections.
    
    >>> conn3.close()
    >>> conn2.close()


Packing
-------

Storages need to be able to follow chains of references in order to pack,
so let's make sure the referencesf() function still does what it should.

    >>> from ZODB.serialize import referencesf
    >>> data, serial = storage.load(conn1.root()._p_oid, '')
    >>> referencesf(data) == [bob._p_oid]
    True
    >>> data, serial = storage.load(bob._p_oid, '')
    >>> referencesf(data) == [alice._p_oid]
    True
    >>> data, serial = storage.load(alice._p_oid, '')
    >>> referencesf(data)
    []


Reference Capabilities And Limitations
---------------------------------------

Objects serialized using this package can refer to any other Persistent
objects.  (Note that we already demonstrated that pickled objects
can refer to protobuf serialized objects when we added "bob" to the
root object in the database.)

    >>> from persistent.mapping import PersistentMapping
    >>> len(bob.guardians)
    1
    >>> ref = bob.guardians.add()
    >>> bob.protobuf_refs.set(ref, PersistentMapping())
    >>> transaction.commit()
    >>> len(bob.guardians)
    2

However, objects serialized using this package can not hold a reference
to non-Persistent objects.  (This is a limitation of keas.pbpersist, but
not a limitation of keas.pbstate, which does not depend on ZODB.)  We don't
detect bad references until the first phase of transaction commit.

    >>> len(bob.guardians)
    2
    >>> ref = bob.guardians.add()
    >>> bob.protobuf_refs.set(ref, {})
    >>> len(bob.guardians)
    3
    >>> transaction.commit()
    Traceback (most recent call last):
    ...
    POSError: Protobuf reference target is not a Persistent object: {}

Aborting the transaction undoes the damage.

    >>> transaction.abort()
    >>> len(bob.guardians)
    2

We can make weak references, though!

    >>> from persistent.wref import WeakRef
    >>> ref = bob.guardians.add()
    >>> bob.protobuf_refs.set(ref, WeakRef(PersistentMapping()))
    >>> transaction.commit()
    >>> len(bob.guardians)
    3

We can make cross-database references.

    >>> storage_beta = DemoStorage()
    >>> db_beta = DB(storage_beta, database_name='beta', databases=db.databases)
    >>> root_beta = conn1.get_connection('beta').root()
    >>> root_beta['obj'] = PersistentMapping()
    >>> transaction.commit()
    >>> ref = bob.guardians.add()
    >>> bob.protobuf_refs.set(ref, root_beta['obj'])
    >>> transaction.commit()

We can make a reference to an instance of a class with a __getnewargs__
method.  ZODB does not store the class metadata in references to instances
of such classes.

    >>> from keas.pbpersist.tests import Adder
    >>> adder = Adder(7)
    >>> conn1.root()['adder'] = adder
    >>> ref = bob.guardians.add()
    >>> bob.protobuf_refs.set(ref, adder)
    >>> transaction.commit()

We can make a reference to an instance of a class with a __getnewargs__
method, even when that instance is in another database.

    >>> adder = Adder(6)
    >>> root_beta['adder'] = adder
    >>> transaction.commit()
    >>> ref = bob.guardians.add()
    >>> bob.protobuf_refs.set(ref, adder)
    >>> transaction.commit()

Just to be sure all those references work, access all of them in another
connection.

    >>> conn2 = db.open()
    >>> bob2 = conn2.root()['bob']
    >>> len(bob2.guardians)
    6

    >>> bob2.protobuf_refs.get(bob2.guardians[0])
    <keas.pbpersist.tests.PContact object at ...>

    >>> bob2.protobuf_refs.get(bob2.guardians[1])
    {}
    >>> bob2.protobuf_refs.get(bob2.guardians[1])._p_jar is conn2
    True

    >>> bob2.protobuf_refs.get(bob2.guardians[2])
    <persistent.wref.WeakRef object at ...>
    >>> bob2.protobuf_refs.get(bob2.guardians[2])()
    {}

    >>> bob2.protobuf_refs.get(bob2.guardians[3])
    {}
    >>> bob2.protobuf_refs.get(bob2.guardians[3])._p_jar is conn2
    False

    >>> bob2.protobuf_refs.get(bob2.guardians[4])
    <keas.pbpersist.tests.Adder object at ...>
    >>> bob2.protobuf_refs.get(bob2.guardians[4]).add(3)
    10

    >>> bob2.protobuf_refs.get(bob2.guardians[5])
    <keas.pbpersist.tests.Adder object at ...>
    >>> bob2.protobuf_refs.get(bob2.guardians[5]).add(3)
    9

    >>> conn2.close()


Edge Cases
----------

We can't currently store instances of a ProtobufState Persistent class that
implements __getnewargs__().

    >>> from keas.pbpersist.tests import PContactWithGetNewArgs
    >>> fail_contact = PContactWithGetNewArgs()
    >>> fail_contact.name = u'Loser'
    >>> conn1.root()['fail_contact'] = fail_contact
    >>> transaction.commit()
    Traceback (most recent call last):
    ...
    POSError: ProtobufSerializer can not serialize classes using __getnewargs__ or __getinitargs__


Clean Up
--------

The tests are finished.  Close the object database.

    >>> transaction.abort()
    >>> conn1.close()
    >>> db.close()
