This example text file contains explanatory text in reStructuredText format,
plus doctests for the obfuscate module. To run the tests, you can import
doctest and then call ``doctest.testfile('examples.txt')`` in the current
directory, or you can run the obfuscate test suite which will automatically
run these tests.

::

    **********************************************************
    **  DISCLAIMER:                                         **
    **  These routines are not cryptographically secure     **
    **  and should not be used for applications requiring   **
    **  security against adversarial attacks.               **
    **********************************************************

First import the ciphers from the obfuscate module::

    >>> from obfuscate import *



Brief introduction to classical cryptographic terms
===================================================

Cipher
    A method of secret writing which encrypts or obfuscates a message by
    replacing or shuffling letters, or both. "Letters" generally refers to
    the ordinary letters ABC...Z but can sometimes include a wider range
    of characters (e.g. digits, punctuation). In some cases, the cipher
    operates on digraphs (pairs of characters) or trigraphs (three
    characters) at a time.

Code
    A method of secret writing that uses special words to refer to secret
    meanings. For instance, the message ``Send troops to Bavaria`` might be
    re-written as ``Send flowers to Aunt May``, where *Aunt May* is code for
    Bavaria and *flowers* for troops.

Cryptography
    The study of secret or hidden writings, that is, of codes and ciphers.

Key
    A piece of secret information which unlocks the cipher, making it simple
    to decrypt. It is a fundamental principle of cryptography that the
    strength of a cipher must not rely on the nature of the cipher remaining
    secret from the attacker.

Monoalphabetic cipher
    A substitution cipher where (for any given key) each letter is always
    changed to the same alternative letter. It contrasts with
    *polyalphabetic* ciphers.

Polyalphabetic cipher
    A substitution cipher where (for any given key) each letter may be
    changed to different letters each time it is seen within the one
    message. It contrasts with *monoalphabetic* ciphers.

Steganography
    Disguising a secret message in an innocuous or hidden form so that it
    is not recognised as a message.

Substitution cipher
    A cipher which replaces letters with other letters. E.g. ``escape`` might
    be written as ``ftdbqf``, that is, each letter is encrypted by shifting
    it to the next letter.

Transposition cipher
    A cipher which shuffles letters around without changing them, so that the
    word ``escape`` might become ``epacse``.



Self-inverting monoalphabetic substitution ciphers
==================================================

rot13 and family
----------------

The rot* family of ciphers are self-reversing, keyless monoalphabetic
ciphers suitable for obfuscating messages such as punchlines to jokes and
spoilers, where readers want to avoid reading the message by accident.

Use ``rot13`` where the message you are obfuscating is made up of letters.
``rot13`` shifts each letter by thirteen places:

    A <-> N, B <-> O, ..., Z <-> M.

You can pass either a string or a non-string iterable as the message. When
passed a string, ``rot13`` returns a string:

    >>> rot13("That's what she said!")
    "Gung'f jung fur fnvq!"

Calling ``rot13`` again reverses the obfuscation:

    >>> rot13("Gung'f jung fur fnvq!")
    "That's what she said!"

When passed a non-string iterable, it returns an iterator:

    >>> it = rot13(iter('ATTACK AT DAWN!'))
    >>> it.next()
    'N'
    >>> it.next()
    'G'
    >>> ''.join(it)
    'GNPX NG QNJA!'

Iterables need not be finite:

    >>> import itertools
    >>> stream = itertools.cycle("an infinite string of characters")
    >>> it = rot13(stream)
    >>> _ = [stream.next() for _ in range(76)]  # Advance the stream.
    >>> [it.next() for _ in range(6)]
    ['f', 'g', 'e', 'v', 'a', 't']
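
The two calling styles described above (a string in gives a string out, any
other iterable gives a lazy iterator) can be sketched without the module. This
is a standalone Python 3 illustration only; ``rot13_sketch`` and ``_ROT13`` are
hypothetical names, and this is not the ``obfuscate`` implementation:

```python
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase
# Translation table moving A-Z and a-z thirteen places along, wrapping round.
_ROT13 = str.maketrans(
    _LOWER + _UPPER,
    _LOWER[13:] + _LOWER[:13] + _UPPER[13:] + _UPPER[:13],
)


def rot13_sketch(message):
    """Return a string for string input, else a lazy iterator of strings."""
    if isinstance(message, str):
        return message.translate(_ROT13)
    # A generator expression consumes the input one item at a time, so it
    # also works on endless iterables such as itertools.cycle().
    return (item.translate(_ROT13) for item in message)
```

Because the generator only pulls items on demand, wrapping an infinite stream
returns immediately, which is the property the ``itertools.cycle`` example
relies on.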


The other rot* ciphers operate in the same fashion. Use ``rot5`` for
obfuscating digits with a shift of five:

    >>> rot5('5283-4891')
    '0738-9346'

    >>> it = rot5(['1a', '2b', '34c'])
    >>> it.next()
    '6a'
    >>> it.next()
    '7b'
    >>> it.next()
    '89c'

``rot18`` is equivalent to a ``rot5`` on digits combined with a ``rot13``
on letters, which makes the name somewhat of a misnomer -- it is not a shift
of 18.

    >>> rot18('abc678')
    'nop123'
    >>> list(rot18(list('nop123')))
    ['a', 'b', 'c', '6', '7', '8']

``rot47`` operates over a wider range of letters, digits and other characters,
shifting characters between ASCII 33 and 126 by forty-seven places:

    >>> rot47('4582 strings')
    'cdga DEC:?8D'
    >>> it = rot47(['a', 'b', 'c', '1', '2', '3'])
    >>> [it.next() for _ in range(6)]
    ['2', '3', '4', '`', 'a', 'b']
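
The shift over ASCII 33 to 126 can be written out directly. The following is a
standalone Python 3 sketch, not the module's code; ``rot47_sketch`` is a
hypothetical name:

```python
def rot47_sketch(text):
    """Shift printable ASCII characters (codes 33 to 126) by 47 places."""
    out = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:
            # The range holds 94 characters, so shifting by 47 twice
            # returns every character to where it started.
            code = 33 + (code - 33 + 47) % 94
        out.append(chr(code))
    return ''.join(out)
```

Since 47 is exactly half of 94, applying the function twice restores the
original text, which is why no separate decrypt step is needed.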

atbash
------

``atbash`` is a cipher used to hide esoteric knowledge in Jewish and Hebrew
mysticism going back at least to the Book of Jeremiah (approximately 580 BC).
Like the rot* family, it is keyless, self-reversing and monoalphabetic. The
original atbash cipher reversed the Hebrew alphabet (aleph, beth, ... shin,
tav). In English, the letters ABC...Z are reversed to ZYX...A, which gives
the mapping A <-> Z, B <-> Y, C <-> X, ... M <-> N:

    >>> atbash('BABYLON THE GREAT')
    'YZYBOLM GSV TIVZG'

``atbash`` also accepts iterables:

    >>> it = atbash(('Adam', 'Eve'))
    >>> it.next()
    'Zwzn'
    >>> it.next()
    'Vev'
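
The English mapping A <-> Z, B <-> Y, ... amounts to translating against a
reversed alphabet. A minimal standalone Python 3 sketch, with the hypothetical
name ``atbash_sketch`` and handling strings only:

```python
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase
# Map each letter to its mirror image in the alphabet, keeping its case.
_ATBASH = str.maketrans(_LOWER + _UPPER, _LOWER[::-1] + _UPPER[::-1])


def atbash_sketch(text):
    """Reverse the alphabet for letters; leave other characters unchanged."""
    return text.translate(_ATBASH)
```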



Other monoalphabetic substitution ciphers
=========================================

These ciphers are not self-reversing, and consequently have separate
encrypt and decrypt methods. They require a key to operate.


Caesar cipher
-------------

The Caesar cipher is so called because it was used by Julius Caesar. It uses
an integer key between 0 and 25. (Keys outside of this range are taken modulo
26.) ``Caesar`` operates on letters of the alphabet, leaving other characters
untouched. Letters are shifted by the number of positions given by the key,
hence a key of 13 is equivalent to rot13, and a key of 0 (or 26, 52, ...) is
the null cipher.

    >>> Caesar.encrypt('Friends, Romans, Countrymen...', 6)
    'Lxoktjy, Xusgty, Iuatzxeskt...'
    >>> Caesar.decrypt('Lxoktjy, Xusgty, Iuatzxeskt...', 6)
    'Friends, Romans, Countrymen...'

If you have many messages to obfuscate with the same key, you can instantiate
the ``Caesar`` class with a key, then call the method on the instance:

    >>> caesar6 = Caesar(6)
    >>> caesar6.encrypt('attack at dawn')
    'gzzgiq gz jgct'
    >>> caesar6.decrypt('gzzgiq gz jgct')
    'attack at dawn'

This is marginally more efficient, as the internal encryption and decryption
tables are only calculated once.

Instances also accept an alternate key, overriding the original:

    >>> caesar6.encrypt('attack at dawn', 7)
    'haahjr ha khdu'

Naturally, decrypting with the wrong key gives nonsense:

    >>> caesar6.decrypt('haahjr ha khdu')  # Encrypted with key=7.
    'buubdl bu ebxo'

Instead of a single string, ``encrypt`` and ``decrypt`` can be given an
arbitrary iterable containing strings as argument, and return an iterator:

    >>> stream = ['Beware ', 'the 3 ', 'Goths!']
    >>> it = Caesar.encrypt(stream, 5)
    >>> it.next()
    'Gjbfwj '
    >>> it.next()
    'ymj 3 '
    >>> it.next()
    'Ltymx!'
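
The shift itself is simple modular arithmetic on letter positions. The sketch
below is a standalone Python 3 illustration for strings only;
``caesar_sketch`` and its ``decrypt`` flag are hypothetical and do not mirror
the ``Caesar`` class API:

```python
def caesar_sketch(text, key, decrypt=False):
    """Shift letters by ``key`` places; other characters pass through."""
    shift = (-key if decrypt else key) % 26  # keys are taken modulo 26
    out = []
    for ch in text:
        if 'a' <= ch <= 'z':
            base = ord('a')
        elif 'A' <= ch <= 'Z':
            base = ord('A')
        else:
            out.append(ch)  # digits, spaces and punctuation are untouched
            continue
        out.append(chr(base + (ord(ch) - base + shift) % 26))
    return ''.join(out)
```

Decryption is just a shift in the opposite direction, so a key of 13 is its
own inverse and a key of 26 changes nothing.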


Keyword cipher
--------------

The Keyword cipher takes a password or passphrase as key, removes duplicate
letters and pads it with the rest of the alphabet, and uses that to define a
monoalphabetic mapping.

    >>> Keyword.encrypt('Send help immediately', 'key')
    'Qalz dain fjjazfkraiw'
    >>> Keyword.decrypt('Qalz dain fjjazfkraiw', 'key')
    'Send help immediately'

Optionally, you can instantiate the class with a key:

    >>> instance = Keyword('aardvark')
    >>> instance.encrypt('ATTACK AT DAWN')
    'ACCADQ AC VAGU'
    >>> instance.decrypt('ACCADQ AC VAGU')
    'ATTACK AT DAWN'

Instances also accept an alternate key, which overrides the original:

    >>> instance.encrypt('ATTACK AT DAWN', 'parrot')
    'PJJPRZ PJ OPMD'

The ``encrypt`` and ``decrypt`` methods accept arbitrary iterables containing
strings, and return an iterator:

    >>> stream = ['Spanish ambassador ', 'sending gold.']
    >>> it = Keyword('mary queen of scots').encrypt(stream)
    >>> it.next()
    'Dxmvodn mtamddmywb '
    >>> it.next()
    'dqvyove ewcy.'
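
The construction of the keyed alphabet can be sketched as follows. This is a
standalone Python 3 illustration with hypothetical names (``keyed_alphabet``,
``keyword_sketch``). One detail is an assumption inferred from the outputs
above rather than taken from the module: the padding appears to start just
after the last key letter and wrap round, instead of starting at ``a``:

```python
import string

ALPHABET = string.ascii_lowercase


def keyed_alphabet(key):
    """Build a cipher alphabet from ``key`` (assumed to contain a letter)."""
    seen = []
    for ch in key.lower():
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)  # keep only the first occurrence of each letter
    # Assumption inferred from the examples: padding starts just after the
    # last key letter and wraps round the alphabet.
    start = ALPHABET.index(seen[-1]) + 1
    rest = ALPHABET[start:] + ALPHABET[:start]
    return ''.join(seen) + ''.join(ch for ch in rest if ch not in seen)


def keyword_sketch(text, key, decrypt=False):
    """Substitute letters using the keyed alphabet, preserving case."""
    plain, cipher = ALPHABET, keyed_alphabet(key)
    if decrypt:
        plain, cipher = cipher, plain  # reverse the mapping
    table = str.maketrans(plain + plain.upper(), cipher + cipher.upper())
    return text.translate(table)
```

Under that assumption the key ``'key'`` gives the cipher alphabet
``keyzabcdfghijlmnopqrstuvwx``, which reproduces the first example above.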


Affine cipher
-------------

The Affine cipher is based on a mathematical generalisation of monoalphabetic
substitution ciphers. It requires a set of three integers ``(a, b, m)`` as
key, although ``m`` takes the default value of 26 if not given.

The value for ``m`` controls which characters are obfuscated:

+------+-----------------------------------------------+
|  m   |  characters obfuscated                        |
+======+===============================================+
|  10  |  digits 0...9 only                            |
|  26  |  case-preserving letters A...Z (the default)  |
|  52  |  case-sensitive letters                       |
|  62  |  case-sensitive letters plus digits           |
|  94  |  ASCII bytes 33 to 126                        |
| 256  |  all extended ASCII bytes 0 to 255            |
+------+-----------------------------------------------+

Other values for ``m`` are not permitted, and characters out of those ranges
are left untouched.

``Affine`` can obfuscate both strings and iterables:

    >>> Affine.encrypt('Better than rot13!', (3, 2, 26))
    'Fohhob hxcp bsh13!'
    >>> stream = ['Fohhob hxcp ', 'bsh13!']
    >>> it = Affine.decrypt(stream, (3, 2, 26))
    >>> it.next()
    'Better than '
    >>> it.next()
    'rot13!'

If you are reusing the same key many times, the ``Affine`` class can also be
instantiated with the key:

    >>> inst = Affine(3, 21, 52)
    >>> inst.encrypt('Send James to Paris.')
    'XHiE wvfHx Al OvuTx.'
    >>> inst.decrypt('XHiE wvfHx Al OvuTx.')
    'Send James to Paris.'

Instances accept an alternate key:

    >>> inst.encrypt('Agents 86 and 99 arriving on Thursday.', (11, 9, 62))
    'Vn1CGv XB jCQ 88 jkkJ2JCn NC iyRkvQjz.'
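
For the default ``m`` of 26, the standard affine formula
``E(x) = (a*x + b) mod m`` reproduces the first example above. The sketch
below is a standalone Python 3 illustration covering only that case;
``affine_sketch`` and its ``decrypt`` flag are hypothetical, and the other
values of ``m`` are not attempted:

```python
M = 26  # this sketch covers only the default alphabet size


def affine_sketch(text, a, b, decrypt=False):
    """Apply the affine map to letters A...Z, preserving case."""
    if decrypt:
        # Brute-force search for the modular inverse of a; one exists only
        # when a and 26 share no common factor.
        inverse = next((i for i in range(1, M) if (a * i) % M == 1), None)
        if inverse is None:
            raise ValueError('a must be coprime with 26 to be reversible')
    out = []
    for ch in text:
        if 'a' <= ch <= 'z':
            base = ord('a')
        elif 'A' <= ch <= 'Z':
            base = ord('A')
        else:
            out.append(ch)  # characters outside the range are untouched
            continue
        x = ord(ch) - base
        if decrypt:
            y = (inverse * (x - b)) % M  # D(y) = a^-1 * (y - b) mod m
        else:
            y = (a * x + b) % M          # E(x) = (a*x + b) mod m
        out.append(chr(base + y))
    return ''.join(out)
```

With ``a = 1`` the formula reduces to a Caesar shift by ``b``, which is the
sense in which the Affine cipher generalises the simpler ciphers.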


Playfair family
---------------

The Playfair family of ciphers operate on digraphs (pairs of characters)
rather than single characters. This makes them more resistant to frequency
analysis, as individual characters may be mapped to different letters each
time they are seen. However, Playfair remains monoalphabetic as each digraph
is always mapped to the same alternative.

The Playfair ciphers are all lossy: characters which cannot be obfuscated are
deleted or changed, and the output may be padded to avoid double characters
and an odd number of letters.

The original Playfair cipher uses a 5*5 character table to obfuscate letters
A...Z excluding J. Double letters may be split by a padding character, and
the entire message is padded to an even length:

    >>> Playfair.encrypt('Flee city, J.P. returned!', 'parrot')
    'LudyfdlrxKAofoqtscir'
    >>> Playfair.decrypt('LudyfdlrxKAofoqtscir', 'parrot')
    'FlexecityIPreturnedx'
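
The lossy preparation step can be sketched separately from the table lookup.
The following standalone Python 3 illustration only splits a message into
digraphs; it does not perform the Playfair substitution. The name
``playfair_prepare``, the ``x`` padding character and the case-insensitive
comparison of doubled letters are assumptions inferred from the decrypted
output above:

```python
def playfair_prepare(text, pad='x'):
    """Reduce text to letters, fold J into I, and split it into digraphs."""
    letters = []
    for ch in text:
        if ch in 'jJ':
            ch = 'I' if ch == 'J' else 'i'  # the 5*5 table has no J
        if ('a' <= ch <= 'z') or ('A' <= ch <= 'Z'):
            letters.append(ch)              # everything else is dropped
    digraphs = []
    i = 0
    while i < len(letters):
        first = letters[i]
        second = letters[i + 1] if i + 1 < len(letters) else None
        if second is None or first.lower() == second.lower():
            # Lone final letter, or a doubled letter: pair it with padding.
            digraphs.append(first + pad)
            i += 1
        else:
            digraphs.append(first + second)
            i += 2
    return digraphs
```

Joining the digraphs for ``'Flee city, J.P. returned!'`` gives
``FlexecityIPreturnedx``, the same text that ``Playfair.decrypt`` returns
above. The sketch makes no special provision for a doubled padding character.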

Optionally, you can instantiate the class with a key before calling the
``encrypt`` and ``decrypt`` methods without a key:

    >>> instance = Playfair('playfair')
    >>> instance.encrypt('Flee at once!')
    'Pakuhpnqsiku'
    >>> instance.decrypt('Pakuhpnqsiku')
    'Flexeatoncex'

Or pass an alternative key to the instance.

Only unique letters are significant in Playfair keys:

    >>> instance.encrypt('Flee at once!', 'king george')
    'Hmnzncpbeanz'
    >>> instance.decrypt('Hmnzncpbeanz', 'kingeor')
    'Flexeatoncex'

The ``encrypt`` and ``decrypt`` methods also accept iterables containing
strings. Each item of the iterable is encrypted in isolation:

    >>> stream = ['send reinforcements', "we're going to advance"]
    >>> it = Playfair.encrypt(stream, 'King George')
    >>> it.next()
    'unkhcingdraocukgut'
    >>> it.next()
    'zicikbngbyrbpkhamc'

The ``Playfair6`` cipher uses a 6*6 character table to obfuscate A...Z
(including J) plus 0...9. Padding occurs as for ``Playfair``, and digits are
significant in keys:

    >>> Playfair6.encrypt('Flee city, J.P. returned!', 'parrot51')
    'Lvb0fdkrzIAokeqtubbz'
    >>> Playfair6.decrypt('Lvb0fdkrzIAokeqtubbz', 'parrot51')
    'FlexecityJPreturnedx'

The ``Playfair16`` cipher uses a 16*16 table, and can obfuscate the full
range of all 256 extended ASCII bytes. As above, padding still takes place,
but otherwise all characters are obfuscated:

    >>> inst = Playfair16('5omeTh1nG+@wful')
    >>> s = inst.encrypt('Flee city, J.P. returned!')
    >>> s
    'Uo5\xa3\x00\x14dWvz2\x1bR%R,\x15\x7f1qT}GTW/'
    >>> inst.decrypt(s)
    'Fle\xa0e city, J.P. returned!'



Polyalphabetic substitution ciphers
===================================

frob
----

``frob`` is a self-reversing cipher that operates by XORing the message with
a string key. If the key is a single character (possibly repeated) then
``frob`` is effectively monoalphabetic, otherwise it is polyalphabetic.

    >>> frob('Secret Message', '+!*^')
    'xDI,NU\n\x13NRY?LD'
    >>> frob('xDI,NU\n\x13NRY?LD', '+!*^')
    'Secret Message'

The key is optional: by default, frob uses a single asterisk '*' as the key,
similar to the GNU memfrob utility:

    >>> it = frob(list('magic'))
    >>> it.next()
    'G'
    >>> ''.join(it)
    'KMCI'
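
XORing against a repeating key takes only a few lines. This is a standalone
Python 3 sketch for strings only; ``frob_sketch`` is a hypothetical name and
not the module's function:

```python
from itertools import cycle


def frob_sketch(message, key='*'):
    """XOR each character of ``message`` with the repeating ``key``."""
    return ''.join(
        chr(ord(ch) ^ ord(k)) for ch, k in zip(message, cycle(key))
    )
```

XOR is its own inverse, so calling the function a second time with the same
key restores the message.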


Vigenere cipher
---------------

The Vigenere cipher is also polyalphabetic. It operates on letters in strings
and iterables containing strings:

    >>> Vigenere.encrypt('Secret Message', 'python')
    'Idwzth Cdmapuu'
    >>> Vigenere.decrypt('Idwzth Cdmapuu', 'python')
    'Secret Message'

Note that when operating on an iterable, each element is encrypted
independently, resetting the key for each call. This may lead to results
different from those when operating on a single string:

    >>> Vigenere.encrypt('advance immediately', 'parrot')
    'qenscwu jeetxyblwas'
    >>> it = Vigenere.encrypt(['advance', 'immediately'], 'parrot')
    >>> it.next()
    'qenscwu'
    >>> it.next()
    'ynewscquwdn'

If you mix results from one with the other, you may get an invalid answer:

    >>> Vigenere.decrypt('qenscwu ynewscquwdn', 'parrot')
    'advance xvmhympceot'

For repeated use, you can optionally instantiate the class with the key:

    >>> instance = Vigenere('swordfish')
    >>> instance.encrypt('attack at dawn')
    'tqisgq jm lttc'
    >>> instance.decrypt('tqisgq jm lttc')
    'attack at dawn'

Instances also accept an alternate key:

    >>> instance.encrypt('attack at dawn', 'aardvark')
    'buleyl se ebor'
    >>> instance.decrypt('buleyl se ebor', 'aardvark')
    'attack at dawn'
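
The per-letter shifting can be sketched as below. This is a standalone
Python 3 illustration for strings only, with the hypothetical name
``vigenere_sketch``. Two details are assumptions inferred from the outputs in
this section, not taken from the module's source: key letter ``a`` shifts by
one place (not zero), and non-letters in the message do not use up a key
letter:

```python
def vigenere_sketch(text, key, decrypt=False):
    """Shift each letter by the matching key letter, cycling through the key."""
    # Assumption inferred from the examples: key letter a shifts by 1,
    # b by 2, and so on. The key is assumed to contain at least one letter.
    shifts = [ord(k) - ord('a') + 1 for k in key.lower() if 'a' <= k <= 'z']
    out = []
    used = 0  # number of key letters consumed so far
    for ch in text:
        if 'a' <= ch <= 'z':
            base = ord('a')
        elif 'A' <= ch <= 'Z':
            base = ord('A')
        else:
            out.append(ch)  # non-letters do not consume a key letter
            continue
        shift = shifts[used % len(shifts)]
        if decrypt:
            shift = -shift
        out.append(chr(base + (ord(ch) - base + shift) % 26))
        used += 1
    return ''.join(out)
```

Because ``used`` starts at zero on every call, encrypting ``'advance'`` and
``'immediately'`` separately restarts the key, which is why the results differ
from encrypting the two words as one string.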



Variant Beaufort cipher
-----------------------

Calling the ``Vigenere`` cipher in reverse is known as **the variant Beaufort
cipher**. Use the ``Vigenere.decrypt`` method to encrypt the plaintext, and
``Vigenere.encrypt`` to decrypt:

    >>> Vigenere.encrypt('Send help', 'key')  # standard Vigenere
    'Djmo mdwu'
    >>> Vigenere.decrypt('Send help', 'key')  # variant Beaufort
    'Hzos cfak'
    >>> Vigenere.encrypt('Hzos cfak', 'key')
    'Send help'



Transposition ciphers
=====================

RowTranspose cipher
-------------------

``RowTranspose`` is a transposition cipher. Rather than modifying each
character, it reversibly shuffles characters around. It takes a single
integer, the number of rows, as a key:

    >>> RowTranspose.encrypt('secret message', 2)
    'smeecsrseatg e'
    >>> RowTranspose.decrypt('smeecsrseatg e', 2)
    'secret message'

If the length of the message is not a multiple of the number of rows, it is
padded at the end (with spaces by default):

    >>> RowTranspose.encrypt('meet at the dance tomorrow', 4)
    'm noetcreherte o  twado tam '
    >>> RowTranspose.decrypt('m noetcreherte o  twado tam ', 4)
    'meet at the dance tomorrow  '
    >>> RowTranspose.encrypt('meet at the dance tomorrow', 3)
    'mhteeoe mtdo aranrtco ewt  '
    >>> RowTranspose.decrypt('mhteeoe mtdo aranrtco ewt  ', 3)
    'meet at the dance tomorrow '

Optionally, you can instantiate the class with a key, and then either use the
default key, or provide an alternative:

    >>> instance = RowTranspose(7)
    >>> message = 'Meet behind railway station at 3pm.'
    >>> instance.encrypt(message)
    'Mbdlso ee wtn3ehraa ptiaytam ni it.'
    >>> instance.encrypt(message, 5)
    'Mhitaeilatenwt tdai3  yopbr nmeas .'

    >>> instance.decrypt('Mbdlso ee wtn3ehraa ptiaytam ni it.')
    'Meet behind railway station at 3pm.'
    >>> instance.decrypt('Mhitaeilatenwt tdai3  yopbr nmeas .', 5)
    'Meet behind railway station at 3pm.'

Both ``encrypt`` and ``decrypt`` methods accept iterables of strings:

    >>> stream = ['attack at dawn', 'send flowers']
    >>> it = RowTranspose(3).encrypt(stream)
    >>> it.next()
    'akdt atawatnc  '
    >>> it.next()
    's wefenlrdos'
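
The shuffle can be pictured as writing the message into a grid row by row and
reading it off column by column. A standalone Python 3 sketch for strings
only, with hypothetical function names and assuming a non-empty message:

```python
def row_transpose_encrypt(text, rows, pad=' '):
    """Write ``text`` into ``rows`` rows, then read it off column by column."""
    width = -(-len(text) // rows)          # ceiling division
    text = text.ljust(rows * width, pad)   # pad so the grid is full
    grid = [text[r * width:(r + 1) * width] for r in range(rows)]
    return ''.join(''.join(column) for column in zip(*grid))


def row_transpose_decrypt(text, rows):
    """Undo the transposition; assumes len(text) is a multiple of rows."""
    width = len(text) // rows
    # The ciphertext is the same grid read the other way round, so
    # transposing it again with ``width`` rows restores the original order.
    return row_transpose_encrypt(text, width)
```

Decryption cannot tell padding from genuine trailing spaces, which is why the
decrypted messages above keep their padding.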


RailFence cipher
----------------

The RailFence cipher is a transposition cipher in use during the American
Civil War. It is based on the idea of writing out a message on a series of
rail fences, moving down and up across the rails. This takes an integer, the
number of rails, as the primary key, and either a string or iterable as
message:

    >>> message = "Advance to the east ridge"
    >>> RailFence.encrypt(message, 4)
    'Aehtedc tes gvnt  ardaoei'

    >>> stream = ['Hello world', 'Goodbye']
    >>> it = RailFence.encrypt(stream, rails=4)
    >>> it.next()
    'Hwe olordll'
    >>> it.next()
    'Geoyobd'

``RailFence`` accepts a second key, which further transposes the message.
This second key can either be a word of the same number of characters as
the number of rails, or a permutation of the list ``[0, 1, 2, ..., rails-1]``.

    >>> RailFence.encrypt(message, rails=4, key='EAST')
    'dc tes gAehtevnt  ardaoei'
    >>> RailFence.encrypt('nothing is what it appears', 3, 'spy')
    'ohn swa tapasniihiprtg t e'

Optionally, you can instantiate the class with the rails and optional key:

    >>> instance = RailFence(3)
    >>> instance.encrypt('WE ARE DISCOVERED FLEE AT ONCE')
    'WRIVDETCEAEDSOEE LEA NE  CRF O'

FIX ME: This test fails. It is written without a doctest prompt, so it is not
run::

    instance.encrypt('WE ARE DISCOVERED FLEE AT ONCE', key=[2, 0, 1])
    '  CRF OWRIVDETCEAEDSOEE LEA NE'

Instances also accept an alternate number of rails:

    >>> instance.encrypt('WE ARE DISCOVERED FLEE AT ONCE', 5)
    'WIDTEDSE A   CRF OAEOELENERVEC'
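
The zigzag across the rails can be sketched as follows. This is a standalone
Python 3 illustration for strings only, with the hypothetical name
``rail_fence_encrypt``. It handles word keys only, and the rule that the rails
are read in the alphabetical order of the key's letters is an assumption
inferred from the ``'EAST'`` and ``'spy'`` examples above:

```python
def rail_fence_encrypt(text, rails, key=None):
    """Zigzag ``text`` across ``rails`` rails and read the rails in order."""
    if rails < 2:
        return text  # a single rail leaves the message unchanged
    fence = [[] for _ in range(rails)]
    rail, step = 0, 1
    for ch in text:
        fence[rail].append(ch)
        # Reverse direction on reaching the top or bottom rail.
        if rail == 0:
            step = 1
        elif rail == rails - 1:
            step = -1
        rail += step
    order = range(rails)
    if key is not None:
        # Assumption: a word key reads the rails in the alphabetical order
        # of its letters, e.g. 'EAST' -> rails 1, 0, 2, 3.
        order = sorted(range(rails), key=lambda i: key[i])
    return ''.join(''.join(fence[i]) for i in order)
```

The sketch does not attempt keys given as a list of integers.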



Steganography and padding
=========================

Chaff
-----

Messages can be padded with junk characters, to hide the true message, using
the ``Chaff`` class. Instantiate a ``Chaff`` instance with an integer width
(the average separation between characters after padding) and an optional
source of junk characters. Then call the ``pad`` method with a message and a key
to add the chaff to the message. The key controls exactly how much chaff
surrounds each message character:

    >>> chaff = Chaff(3, '?'*1000)
    >>> chaff.pad('MESSAGE', 'aardvark')
    '???M?????E?????S????S?????A??G?????E?????'

The message can also be an iterable of strings:

    >>> it = chaff.pad(['HELLO', 'GOODBYE'], 'swordfish')
    >>> it.next()
    '???H???E?L?L????O'
    >>> it.next()
    '?????G?????O????O?D??B????Y???E'

Use the ``unpad`` method to remove chaff from the message:

    >>> chaff.unpad('***M*****E*****S****S*****A**G*****E*****', 'aardvark')
    'MESSAGE'

    >>> blocks = ['???H???E?L?L????O', '?????G?????O????O?D??B????Y???E']
    >>> it = chaff.unpad(blocks, 'swordfish')
    >>> it.next()
    'HELLO'
    >>> it.next()
    'GOODBYE'

Naturally, in practice you want to use chaff which looks like the message,
otherwise an attacker can visually pick out the letters of the message:

    >>> chaff = Chaff(3, 'RNHGSWOUMMSQZBGDSERHPIMYAEBQUIVGSL')
    >>> chaff.pad('MESSAGE', 'aardvark')
    'RNHMGSWOUEMMSQZSBGDSSERHPIAMYGAEBQUEIVGSL'
    >>> chaff.unpad('RNHMGSWOUEMMSQZSBGDSSERHPIAMYGAEBQUEIVGSL', 'aardvark')
    'MESSAGE'


