.. JSON-delta documentation master file, created by
   sphinx-quickstart on Sun Apr 13 20:14:28 2014.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

JSON-delta: a diff/patch pair for JSON-serialized data structures
=================================================================

JSON-delta is a multi-language software suite for computing deltas
between JSON-serialized data structures, and applying those deltas as
patches.  It enables separate programs at either end of a
communications channel (e.g. client and server over HTTP, or two
processes talking to one another using bidirectional IPC) to
manipulate a data structure while minimizing communications overhead.

By example:
-----------
Consider the example JSON-LD entry for John Lennon
from http://json-ld.org/::

   {
    "@context": "http://json-ld.org/contexts/person.jsonld",
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "born": "1940-10-09",
    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
   }

Suppose we have a piece of software that updates this record to
show his date of death, like so::

   {
    "@context": "http://json-ld.org/contexts/person.jsonld",
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "born": "1940-10-09",
    
    "died": "1980-12-07",

    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
   }

Further suppose that we wish to communicate this update to another
piece of software whose only job is to store information about John
Lennon in JSON-LD format.  (Yes, I know this is getting unlikely, but
stay with me.)  If this Lennon-record-keeper accepts updates in
json-delta format, all we have to do is send the following over the
wire::

   [[["died"], "1980-12-07"]]

This is a complete diff in json-delta format.  It is itself a
JSON-serializable data structure: specifically, it is a sequence of
what I refer to as **diff stanzas** for some reason.  The
format for a diff stanza is ``[<key path>, (<update>)]``.  (The
parentheses mean that the ``<update>`` part is optional; I'll get
to that in a minute.)  A key path is a sequence of keys specifying
where in the data structure the node you want to alter is found, much
like those emitted by JSON.sh_.  The stanza may be thought of as an
instruction to update the node found at that path so that its content
is equal to ``<update>``.

.. _JSON.sh: https://github.com/dominictarr/JSON.sh
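
To get a feel for key paths, here is a minimal Python 3 sketch that
lists the key path of every leaf in a structure, roughly in the spirit
of what JSON.sh prints.  The function name ``key_paths`` is made up
for this illustration and is not part of any json-delta
implementation.

.. code-block:: python

   def key_paths(node, prefix=()):
       """Yield (key_path, leaf_value) pairs for every leaf under node.

       Hypothetical helper for illustration only.  Empty objects and
       arrays have no leaves, so they yield nothing.
       """
       if isinstance(node, dict):
           for key, child in node.items():
               yield from key_paths(child, prefix + (key,))
       elif isinstance(node, list):
           for index, child in enumerate(node):
               yield from key_paths(child, prefix + (index,))
       else:
           # A scalar: report the path that led here as a list of keys.
           yield list(prefix), node


   record = {"name": "John Lennon", "born": "1940-10-09"}
   for path, value in key_paths(record):
       print(path, value)

Object members are addressed by their string keys and array elements
by their integer indices, so a path into a nested structure is a mixed
list such as ``[0, "name"]``.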

Now, let's do some more supposing.  Suppose the software we're
communicating with is dedicated to storing information about the
Beatles in general.  Also, suppose we've remembered that it was
actually on the 8th of December 1980 that John Lennon died, not the
7th.  Finally, suppose we live in an Orwellian dystopia, and Cynthia
Lennon has been declared a non-person who must be expunged from all
records.  Unfortunately, json-delta is incapable of overthrowing
corrupt and despotic governments, so let's make one last supposition,
that what we're interested in is updating the record kept by the
software on the other end of the wire, which looks like this::

 [
  {
   "@context": "http://json-ld.org/contexts/person.jsonld",
   "@id": "http://dbpedia.org/resource/John_Lennon",
   "name": "John Lennon",
   "born": "1940-10-09", 

   "died": "1980-12-07",
   
   "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
  },
  {"name": "Paul McCartney"},
  {"name": "George Harrison"},
  {"name": "Ringo Starr"}
 ]

(Allegations of bias in favor of specific Beatles on the part of
the maintainer of this record are punished by the aforementioned
despotic government.  `All glory to
Arstotzka! <http://papersplea.se/>`_)

To make the changes we’ve decided on (correcting John's date of death,
and expunging Cynthia Lennon from the record), we need to send the
following sequence::

 [
  [[0, "died"], "1980-12-08"],
  [[0, "spouse"]]
 ]

Now you can see why the ``<update>`` part is optional: if a stanza
includes no update material, it is interpreted as an instruction to
delete the node the key path points to.

Note also that there is no difference between a stanza that adds a
node and one that changes an existing node: both take the form
``[<key path>, <update>]``.
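
The behaviour described so far can be sketched in a few lines of
Python.  This is an illustration only, not the code of any actual
json-delta implementation: the name ``apply_diff`` is hypothetical,
and treating an array index equal to the array's length as an append
is an assumption of this sketch.

.. code-block:: python

   import copy


   def apply_diff(structure, diff):
       """Apply a sequence of diff stanzas; return the patched copy.

       Hypothetical sketch, not the json-delta library's own code.
       """
       structure = copy.deepcopy(structure)  # leave the caller's data alone
       for stanza in diff:
           key_path = stanza[0]
           if not key_path:
               # An empty key path addresses the whole structure.
               structure = stanza[1]
               continue
           # Walk down to the parent of the node being altered.
           parent = structure
           for key in key_path[:-1]:
               parent = parent[key]
           last = key_path[-1]
           if len(stanza) == 1:
               del parent[last]              # no update: delete the node
           elif isinstance(parent, list) and last == len(parent):
               parent.append(stanza[1])      # assumption: index == length appends
           else:
               parent[last] = stanza[1]      # add or change: same operation
       return structure


   beatles = [
       {"name": "John Lennon", "died": "1980-12-07",
        "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"},
       {"name": "Paul McCartney"},
   ]
   diff = [[[0, "died"], "1980-12-08"], [[0, "spouse"]]]
   patched = apply_diff(beatles, diff)
   print(patched[0])

Adding and changing a node both end up at the same assignment, which
is why the format needs no separate "add" instruction.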

The intention is to save as much communications bandwidth as possible
without sacrificing the ability to communicate arbitrary modifications
to the data structure (this format can be used to describe a change
from any JSON-serialized object into any other).  The worst-case
scenario, where there is no commonality between the two structures, is
that the protocol adds seven octets of overhead, because a diff can
always be expressed as ``[[[],<target>]]``, meaning “substitute
``<target>`` for the data structure that is to be modified”.
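
The seven-octet figure is easy to check.  The sketch below builds the
worst-case diff for an arbitrary target and compares serialized sizes;
the helper name ``worst_case_diff`` is invented for this example, and
the count assumes both texts are serialized without optional
whitespace.

.. code-block:: python

   import json


   def worst_case_diff(target):
       """Return the diff that replaces the whole structure with target."""
       return [[[], target]]


   def serialized_size(structure):
       """Size in octets of the whitespace-free JSON serialization."""
       text = json.dumps(structure, separators=(",", ":"))
       return len(text.encode("utf-8"))


   target = {"name": "Ringo Starr"}
   overhead = serialized_size(worst_case_diff(target)) - serialized_size(target)
   print(overhead)  # 7

The seven octets are the five characters of ``[[[],`` before the
target and the two of ``]]`` after it.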

Implementations
---------------

JSON-delta is my language learning project: whenever I decide to learn
a new programming language, I do so by implementing JSON-delta in it.
As such, there are five implementations, of varying degrees of
fullness, available:

========== ========= ==== ======= ======= ======
Language   Patch     Diff Compact U-patch U-diff
========== ========= ==== ======= ======= ======
Python 2     ✓       ✓      ✓       ✓     ✓
Python 3     ✓       ✓      ✓       ✓     ✓
Javascript   ✓       ✓      ✗       ✗     ✗
Racket       ✓       ✓      ✗       ✗     ✗
Perl       (sort of) ✗      ✗       ✗     ✗
========== ========= ==== ======= ======= ======

As you may be able to guess, I’m most comfortable programming in
Python… Anyway, here’s what those enigmatic column headers mean:

Patch
  The implementation can manipulate data structures according to
  a diff in the format specified above.

Diff
  The implementation can calculate deltas between two data
  structures in the format specified above.

Compact
  Diffs produced by the implementation are as small as I can
  possibly make them, using Needleman-Wunsch sequence alignment to
  optimize stanzas modifying JSON arrays.

U-diff
  The implementation is capable of emitting diffs in a format
  reminiscent of the output of ``diff -u``, which is designed to be
  more human-readable than the JSON format, to facilitate debugging.

U-patch
  The implementation can apply U-format patches.
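
For readers unfamiliar with the Needleman-Wunsch alignment mentioned
under *Compact*, here is a textbook sketch of the algorithm applied to
two arrays.  It is not the code used by any of the implementations:
the function name, the ``GAP`` marker and the scoring values (``+1``
for a match, ``-1`` for a mismatch or a gap) are all assumptions
chosen for illustration.

.. code-block:: python

   GAP = object()  # marks "nothing here"; None would clash with JSON null


   def needleman_wunsch(left, right, match=1, mismatch=-1, gap=-1):
       """Globally align two sequences; return a list of (left, right) pairs."""
       rows, cols = len(left) + 1, len(right) + 1
       score = [[0] * cols for _ in range(rows)]
       for i in range(1, rows):
           score[i][0] = i * gap
       for j in range(1, cols):
           score[0][j] = j * gap
       # Fill the table: each cell is the best score for the two prefixes.
       for i in range(1, rows):
           for j in range(1, cols):
               step = match if left[i - 1] == right[j - 1] else mismatch
               score[i][j] = max(score[i - 1][j - 1] + step,
                                 score[i - 1][j] + gap,
                                 score[i][j - 1] + gap)
       # Trace back from the bottom-right corner to recover one alignment.
       aligned = []
       i, j = len(left), len(right)
       while i > 0 or j > 0:
           if i > 0 and j > 0:
               step = match if left[i - 1] == right[j - 1] else mismatch
               if score[i][j] == score[i - 1][j - 1] + step:
                   aligned.append((left[i - 1], right[j - 1]))
                   i, j = i - 1, j - 1
                   continue
           if i > 0 and score[i][j] == score[i - 1][j] + gap:
               aligned.append((left[i - 1], GAP))
               i -= 1
           else:
               aligned.append((GAP, right[j - 1]))
               j -= 1
       aligned.reverse()
       return aligned


   before = ["John", "Paul", "George", "Ringo"]
   after = ["John", "George", "Ringo"]
   for old, new in needleman_wunsch(before, after):
       print("-" if old is GAP else old, "-" if new is GAP else new)

In an alignment like this, an item paired with a gap is a natural
candidate for a single insertion or deletion, while matched items need
no stanza at all.  How the actual implementations turn an alignment
into stanzas is not covered by this sketch.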

-------

I initially developed JSON-delta to work with a `Django
<https://www.djangoproject.com/>`_ web-app, allowing client-side
Javascript to manipulate a data structure exposed by the server, so
the Javascript and Python implementations are the most robust.

Development of the Perl version stalled when I discovered that, due to
Perl not having as ramified a type ontology as Javascript, `numeric
values do not round-trip cleanly
<http://search.cpan.org/~makamaka/JSON-2.53/lib/JSON.pm#MAPPING>`_.
This meant that there was nothing I could do to make my test suite
pass.  This made me sad.

Downloads
---------

The `Python implementation
<https://pypi.python.org/pypi/json-delta/>`__ (compatible with version
2.7 and later, including 3) is available from PyPI.

You can download the Javascript versions here
(`full <http://phil-roberts.name/json_delta/json_delta_0.1.js>`__,
`minified <http://phil-roberts.name/json_delta/json_delta_minified_0.1.js>`__).

(Please, for the sake of my bandwidth bills, serve your own copy
rather than hot-linking mine!)

The Racket and Perl implementations are very alpha (remember, they
were written by a beginner!).  If you want to check them out, I
recommend looking at the source repo (no pun intended): development
of json-delta takes place against `a master repository
<https://pikacode.com/phijaro/json_delta/>`__ containing all
implementations.

Further Reading
===============

.. toctree::
   :maxdepth: 2

   json_delta
   json_diff.1
   json_patch.1
   json_cat.1
   license

Indices and tables
==================

* :ref:`genindex`
* :ref:`search`

