.. -*-restructuredtext-*-

========
ConCoord
========

Overview
========

ConCoord is a novel coordination service that provides replication and
synchronization support for large-scale distributed systems. ConCoord
employs an object-oriented approach, in which the system creates
and maintains live replicas for Python objects written by the user.
ConCoord converts these Python objects into Paxos Replicated State
Machines (RSM) and enables clients to do method invocations on them
transparently as if they are local objects. ConCoord uses these
replicated objects to implement coordination and synchronization
constructs in large-scale distributed systems, in effect establishing
a transparent way of providing a coordination service.

:Authors:
    - Deniz Altinbuken (deniz@systems.cs.cornell.edu)
    - Emin Gun Sirer (egs@systems.cs.cornell.edu)
:Version: 1.0.1
:Date: 2013-10-05

Requirements
============

The minimum requirements for ConCoord are::

  - python 2.7.2 or later
  - dnspython-1.9.4
  - msgpack-python

Installation
============

ConCoord can be installed from source with::

  $ python setup.py install

ConCoord is also available for install through PyPI::

  $ pip install concoord

Tutorial
========

Getting Started
---------------

To use ConCoord you need a Python object that can be used for the
coordination of your distributed system. In the ConCoord distribution,
we offer ready-to-use objects that cover the most common coordination
needs. So first, let's start a ConCoord instance with an object in
the distribution, namely Counter under concoord/object/counter.py.

Starting Nodes
--------------

To distribute the local object you should start at least one replica
and one acceptor.

Starting Replica Nodes
~~~~~~~~~~~~~~~~~~~~~~

To start a bootstrap replica node that doesn't need to be connected to
another replica, use the following command::

  $ concoord replica -o concoord.object.counter.Counter -a 127.0.0.1 -p 14000

To start replica nodes to join an active ConCoord instance, use the
following command to connect to another replica::

  $ concoord replica -o concoord.object.counter.Counter -b 127.0.0.1:14000 -a 127.0.0.1 -p 14001

Starting Acceptor Nodes
~~~~~~~~~~~~~~~~~~~~~~~

To start an acceptor node that connects to the bootstrap replica at
127.0.0.1:14000, use the following command::

  $ concoord acceptor -b 127.0.0.1:14000

To run ConCoord in durable mode, where acceptors write to disk, you
can set the -w option::

  $ concoord acceptor -b 127.0.0.1:14000 -w

Note that you can specify the port and the address of any node with
options -p and -a respectively. The nodes can also be run in the debug
mode or with a logger with the commands shown below::

    Usage: concoord [-h] [-a ADDR] [-p PORT] [-b BOOTSTRAP] [-o OBJECTNAME]
                    [-l LOGGER] [-w] [-d]

where,

  -h, --help            show this help message and exit
  -a ADDR, --addr ADDR  addr for the node
  -p PORT, --port PORT  port for the node
  -b BOOTSTRAP, --boot BOOTSTRAP
                        address:port:type triple for the bootstrap peer
  -o OBJECTNAME, --objectname OBJECTNAME
                        client object dotted name
  -l LOGGER, --logger LOGGER
                        logger address
  -w, --writetodisk     writing to disk on/off
  -d, --debug           debug on/off

Starting Nameserver Nodes
~~~~~~~~~~~~~~~~~~~~~~~~~

You can dynamically locate nodes in a given ConCoord instance using
DNS queries if the instance includes nameserver nodes. There are three
ways you can run a ConCoord Nameserver.

* **Standalone Nameserver** Keeps track of the view and responds to DNS
  queries itself. Requires su privileges to bind to Port 53.

* **Slave Nameserver** Keeps track of the view and updates a master
  nameserver that answers to DNS queries on behalf of the slave
  nameserver. Requires an active master nameserver.

* **Route53 Nameserver** Keeps track of the view and updates an Amazon
  Route53 account. Amazon Route53 answers to DNS queries on behalf of
  the slave nameserver. Requires a ready-to-use Amazon Route53
  account.

Standalone Nameserver
+++++++++++++++++++++

Before starting a standalone nameserver node manually, first make sure
that you have at least one replica and one acceptor running. Once your
replica and acceptor nodes are set up, you can start the nameserver to
answer queries for counterdomain.com as follows::

  $ sudo concoord nameserver -n counterdomain.com -o concoord.object.counter.Counter -b 127.0.0.1:14000 -t 1

When you set up the nameserver delegations, you can send queries for
counterdomain.com and see the most current set of nodes as follows::

  $ dig -t a counterdomain.com                   # returns set of Replicas

  $ dig -t srv _concoord._tcp.counterdomain.com  # returns set of Replicas with ports

  $ dig -t txt counterdomain.com                 # returns set of all nodes

  $ dig -t ns counterdomain.com                  # returns set of nameservers

If you want to run the nameserver without proper delegation setup, you
can query the nameserver bound to 127.0.0.1 specifically as follows::

  $ dig -t txt counterdomain.com @127.0.0.1      # returns set of all nodes

Slave Nameserver
++++++++++++++++

Before starting a slave nameserver node manually, you should have a
master nameserver set up and running. The master nameserver should be
set up to answer the queries for its slave nameservers. We provide
OpenReplica Nameserver (concoord/openreplica/openreplicanameserver.py)
as a ready to deploy master nameserver and a Nameserver Coordination Object
(concoord/object/nameservercoord.py) in our example objects to keep
track of slave nameserver information. Using this coordination object,
the master nameserver can keep track of its slave nameserver
delegations and the slave nameserver can update the master every time
the view of its system changes.

Once your master nameserver is set up, you can start the slave
nameserver as follows::

  $ concoord nameserver -n counterdomain.com -o concoord.object.counter.Counter -b 127.0.0.1:14000 -t 2 -m masterdomain.com

When the slave nameserver starts running, you can send queries for
counterdomain.com and see the most current set of nodes as follows::

  $ dig -t a counterdomain.com                   # returns set of Replicas

  $ dig -t srv _concoord._tcp.counterdomain.com  # returns set of Replicas with ports

  $ dig -t txt counterdomain.com                 # returns set of all nodes

  $ dig -t ns counterdomain.com                  # returns set of nameservers

Amazon Route 53 Nameserver
++++++++++++++++++++++++++

Before starting a nameserver connected to Amazon Route 53, you should
have a Route 53 account set up and ready to receive requests. After
your Route 53 account is ready, the nameserver can update the master
every time the view of its system changes automatically.

To use Amazon Route 53 you can pass your credentials into the methods
that create connections or edit them in the configuration file.

* AWS_ACCESS_KEY_ID - Your AWS Access Key ID
* AWS_SECRET_ACCESS_KEY - Your AWS Secret Access Key

Once you make sure that your Route53 account is set up and your
credentials are updated, you can start the nameserver as follows::

  $ concoord nameserver -n counterdomain.com -o concoord.object.counter.Counter -b 127.0.0.1:14000 -t 3 -o configfilepath

Connecting to ConCoord Objects
------------------------------

Once you have a ConCoord instance running with your object, it is easy
to access your object.

The proxy for the Counter object is also included in the distribution.
You can import and use this proxy object in your code. Depending on
how you set your nameserver node up, you can access your object with
the ipaddr:port pair or the domainname. In the example below, the
ipaddr:port of both replica nodes are used. As a result, the client
will be able to do method invocations on the object as long as one of
the replicas is alive::

  >>> from concoord.proxy.counter import Counter
  >>> c = Counter("127.0.0.1:14000, 127.0.0.1:14001")
  >>> c.increment()
  >>> c.increment()
  >>> c.getvalue()
  2

At any point to reinitialize an object after it is deployed on
replicas, you should call __concoordinit__ function::

  >>> from concoord.proxy.counter import Counter
  >>> c = Counter("127.0.0.1:14000, 127.0.0.1:14001")
  >>> c.increment()
  >>> c.__concoordinit__()
  >>> c.increment()
  >>> c.getvalue()
  1

ADVANCED TUTORIAL
=================

ConCoordifying Python Objects
-----------------------------
For cases when the objects included in the ConCoord distribution do
not meet your coordination needs, ConCoord lets you convert your
local Python objects into distributable objects very easily.

To walk through the ConCoord approach, you will use a different
Counter object that increments and decrements by ten, namely
tencounter.py. Once you install ConCoord, you can create coordination
objects and save them anywhere in your filesystem. After you create
tencounter.py, you can save tencounter.py under /foo/tencounter.py::

  class TenCounter:
    def __init__(self, value=0):
     self.value = value

    def decrement(self):
      self.value -= 10

    def increment(self):
      self.value += 10

    def getvalue(self):
      return self.value

    def __str__(self):
      return "The counter value is %d" % self.value

Once you have created the object, update your PYTHONPATH accordingly,
so that the objects can be found and imported::

  $ export PYTHONPATH=$PYTHONPATH:/foo/

Clients will use a proxy object to do method calls on the object.
To create the proxy object automatically, you can use the following
command::

  $ concoord object -o tencounter.TenCounter


  Usage: concoord object [-h] [-o OBJECTNAME] [-t SECURITYTOKEN] [-p PROXYTYPE]
                         [-s] [-v]

where,
  -h, --help					show this help message and exit
  -o OBJECTNAME, --objectname OBJECTNAME	client object dotted name module.Class
  -t SECURITYTOKEN, --token SECURITYTOKEN	security token
  -p PROXYTYPE, --proxytype PROXYTYPE		0:BASIC, 1:BLOCKING,
     			    			2:CLIENT-SIDE BATCHING, 3:SERVER-SIDE BATCHING
  -s, --safe            			safety checking on/off
  -v, --verbose         			verbose mode on/off

This script will create a proxy file under the directory that the
object resides (i.e. /foo/)::

/foo/tencounterproxy.py := the proxy that can be used by the client

IMPORTANT NOTE: ConCoord objects treat __init__ functions specially in
two ways:

1) When Replicas go live, the object is instantiated calling the
__init__ without any arguments. Therefore, while implementing
coordination objects, the __init__ method should be implemented to be
able to run without explicit arguments. You can use defaults to
implement an __init__ method that accepts arguments.

2) In the proxy created, the __init__ function is used to initialize
the Client-Replica connection. This way, multiple clients can connect
to a ConCoord instance without reinitializing the object. During proxy
generation, the original __init__ function is renamed as
__concoordinit__, to reinitialize the object the user can call
__concoordinit__ at any point.

After this point on, you can use TenCounter just like we use Counter before.

Creating Source Bundles
-----------------------

You can create bundles to use at the server and client sides using the
Makefile provided under concoord/

Remember to add the objects you have created in these bundles.

Creating A Server Bundle
~~~~~~~~~~~~~~~~~~~~~~~~

To create a bundle that can run replica, acceptor and nameserver
nodes::

  $ make server

Creating A Client Bundle
~~~~~~~~~~~~~~~~~~~~~~~~

To create a bundle that can run a client and connect to an existing
ConCoord instance:

  $ make client

Logging
-------

We have two kinds of loggers for ConCoord::
* Console Logger
* Network Logger

Both of these loggers are included under utils.py. To start the
NetworkLogger, use the logdaemon.py on the host you want to keep the
Logger.

Synchronization & Threading
---------------------------

ConCoord provides a distributed and fault-tolerant threading
library. The library includes:

*  Lock
*  RLock
*  Semaphore
*  BoundedSemaphore
*  Barrier
*  Condition

The implementations of distributed synchronization objects follow the
implementations in the Python threading library. We will walk through
an example below using the Semaphore object under
concoord/object/semaphore.py

In the blocking object implementation, the method invocations that use
an object from the threading library requires an extra argument
_concoord_command. This argument is used by the calling Replica node
to relate any blocking/unblocking method invocation to a specific
client. This way, even if the client disconnects and reconnects, the
ConCoord instance will remain in a safe state::

  from concoord.threadingobject.dsemaphore import DSemaphore

  class Semaphore:
    def __init__(self, count=1):
      self.semaphore = DSemaphore(count)

    def __repr__(self):
      return repr(self.semaphore)

    def acquire(self, _concoord_command):
      try:
	return self.semaphore.acquire(_concoord_command)
      except Exception as e:
        raise e

    def release(self, _concoord_command):
      try:
        return self.semaphore.release(_concoord_command)
      except Exception as e:
        raise e

    def __str__(self):
      return str(self.semaphore)

To create the proxy for this blocking object we will use the following command::

  $ concoord object -o semaphore.Semaphore -p 1

This command creates the proxy that supports blocking operations. Now
you can use blocking objects just like basic ConCoord objects. First,
we start the replica, acceptor and nameserver nodes the same way we
did before as follows::

  $ concoord replica -o semaphore.Semaphore -a 127.0.0.1 -p 14000

  $ concoord acceptor -b 127.0.0.1:14000

  $ sudo concoord nameserver -n semaphoredomain -o semaphore.Semaphore -b 127.0.0.1:14000 -t 1

To test the functionality, you can use multiple clients or print out the ``Semaphore`` object as follows::

  >>> from semaphoreproxy import Semaphore
  >>> s = Semaphore("127.0.0.1:14000")
  >>> s.acquire()
  True
  >>> i = 10
  >>> i += 5
  >>> s
  <DSemaphore count=0>
  >>> s.release()
  >>> s
  <DSemaphore count=1>
  >>>

HOMEPAGE
========

Visit http://openreplica.org to see concoord in action and to get more
information on concoord.

CONTACT
=======

If you believe you have found a bug or have a problem you need
assistance with, you can get in touch with us by emailing
concoord@systems.cs.cornell.edu