Metadata-Version: 1.0
Name: s3
Version: 0.1.0
Summary: Python module which connects to Amazon's S3 REST API
Home-page: https://bitbucket.org/prometheus/s3/
Author: Paul Wexler
Author-email: paul@prometheusresearch.com
License: MIT
Description: 
        =========
        s3
        =========
        
        .. contents::
        
        Overview
        ========
        
        s3 is a connector to S3, Amazon's Simple Storage Service REST API.
        
        Use it to upload, download, delete, or copy files in S3, to test whether they 
        exist, or to update their metadata.
        
        S3 files may have metadata in addition to their content.  Metadata is a set 
        of key/value pairs.  Metadata may be set when the file is uploaded or it can be 
        updated subsequently.
        
        Installation
        ============
        
        From PyPI
        ::
        
            $ pip install s3 
        
        From source
        ::
        
            $ hg clone ssh://hg@bitbucket.org/prometheus/s3
            $ pip install -e s3 
        
        The installation is successful if you can import s3.  The following command 
        must produce no errors:
        ::
        
            $ python -c 'import s3'
        
        API to remote storage
        =====================
        
        S3 Filenames
        ------------
        
        An S3 file name consists of a bucket and a key.  This pair of 
        strings uniquely identifies the file within S3.  
        
        The S3Name class is instantiated with a key and a bucket; the key 
        is required and the bucket defaults to None.
        
        The Storage class methods take a *remote_name* argument, which 
        can be either a string (the key) or an instance of the 
        S3Name class.  When no bucket is given (or the bucket is None), 
        the default_bucket established when the connection is instantiated 
        is used.  If there is no default bucket either, a ValueError is 
        raised.
        
        In other words, the S3Name class provides a means of using a bucket 
        other than the default_bucket.
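        
        The bucket-resolution rule above can be sketched in a few lines.  This is a 
        standalone illustration, not the library's code: *Name* is a hypothetical 
        stand-in for S3Name and *resolve* is a hypothetical helper.
        ::
        
            from collections import namedtuple
        
            # Hypothetical stand-in for s3.S3Name: a key plus an optional bucket.
            Name = namedtuple('Name', ['key', 'bucket'])
        
            def resolve(remote_name, default_bucket=None):
                """Return (bucket, key) for a plain key string or a Name."""
                if isinstance(remote_name, Name):
                    key, bucket = remote_name.key, remote_name.bucket
                else:
                    key, bucket = remote_name, None
                if bucket is None:
                    # Fall back to the connection's default bucket.
                    bucket = default_bucket
                if bucket is None:
                    raise ValueError('no bucket given and no default bucket')
                return bucket, key
        
            print(resolve('report.csv', default_bucket='main'))
            print(resolve(Name('report.csv', 'other'), default_bucket='main'))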
        
        Headers and Metadata
        --------------------
        
        Additional HTTP headers may be sent using the methods which write 
        data.  These methods accept an optional *headers* argument which 
        is a Python dict.  The headers control various aspects of how the 
        file may be handled.  S3 supports a variety of headers, which are 
        not discussed here; see Amazon's S3 documentation for more 
        information.  Headers whose key begins with the special prefix 
        'x-amz-meta-' are considered to be metadata headers and are
        used to set the metadata attributes of the file.
        
        The methods which read files also return metadata which consists of 
        only those response headers which begin with 'x-amz-meta-'.
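        
        As a rough sketch of that filtering step (not the library's actual code), 
        the hypothetical helper *metadata_headers* below keeps only the metadata 
        entries of a headers dict.  It assumes the header keys are already 
        lower case.
        ::
        
            META_PREFIX = 'x-amz-meta-'
        
            def metadata_headers(headers):
                """Keep only the headers whose key starts with the metadata prefix."""
                return dict((key, value) for key, value in headers.items()
                            if key.startswith(META_PREFIX))
        
            # Example response headers; only the second one is metadata.
            response_headers = {
                'content-type': 'text/plain',
                'x-amz-meta-state': 'unprocessed',
            }
            print(metadata_headers(response_headers))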
        
        Storage Methods
        ---------------
        
        The arguments *remote_source*, *remote_destination*, and 
        *remote_name* may be either a string, or an S3Name instance.
        
        *local_name* is a string and is the name of the file on the local 
        system.  This string is passed directly to open().
        
        *headers* is a Python dict used to encode additional request 
        headers.
        
        All methods return on success or raise StorageError on failure.
        
        **storage.copy(remote_source, remote_destination, headers={})**
            Copy *remote_source* to *remote_destination*.  The destination 
            metadata is copied from *headers* when it contains metadata; 
            otherwise it is copied from the source metadata.
        **storage.delete(remote_name)**
            Delete *remote_name* from storage.
        **exists, metadata = storage.exists(remote_name)**
            Test if *remote_name* exists in storage, retrieve its metadata 
            if it does.
            exists - boolean, metadata - dict.
        **metadata = storage.read(remote_name, local_name)**
            Download *remote_name* from storage, save it locally as 
            *local_name* and retrieve its metadata.
            metadata - dict.
        **storage.update_metadata(remote_name, headers)**
            Update the metadata associated with *remote_name* with the 
            metadata headers in *headers*.
        **storage.write(local_name, remote_name, headers={})**
            Upload *local_name* to storage as *remote_name*, and set its 
            metadata if any metadata headers are in *headers*.
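        
        The metadata rule described for copy can be illustrated with a small 
        standalone function.  This is only a sketch of the documented behaviour; 
        *destination_metadata* is a hypothetical name and is not part of the s3 
        module.
        ::
        
            def destination_metadata(source_metadata, headers=None):
                """Choose the metadata a copied file ends up with."""
                supplied = dict((key, value)
                                for key, value in (headers or {}).items()
                                if key.startswith('x-amz-meta-'))
                # Metadata in *headers* wins; otherwise the source metadata is kept.
                return supplied if supplied else dict(source_metadata)
        
            source = {'x-amz-meta-state': 'unprocessed'}
            print(destination_metadata(source))
            print(destination_metadata(source, {'x-amz-meta-state': 'processed'}))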
        
        Usage
        =====
        
        Configuration
        -------------
        
        First, configure your YAML file.
        
        - **access_key_id** and **secret_access_key** are generated by the S3 
          account manager.  They are effectively the username and password for the 
          account.
        
        - **default_bucket** is the name of the default bucket to use when 
          referencing S3 files.  Bucket names must be globally unique, so by 
          convention we use a prefix on all our bucket names: com.prometheus.
          
        - **endpoint** is the Amazon server url to connect to.  See 
          http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region for a list
          of the available endpoints.
        
        - **tls** selects the URL scheme: True uses https://, False uses http://.  
          The default is True.
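        
        To make the relationship between **endpoint** and **tls** concrete, here is 
        a minimal sketch of how the two settings could combine into a base URL.  
        The helper *base_url* is hypothetical and only illustrates the rule stated 
        above; it is not the connection's actual code.
        ::
        
            def base_url(endpoint, tls=True):
                """Build the base URL from the endpoint host name and the tls flag."""
                scheme = 'https' if tls else 'http'
                return '%s://%s' % (scheme, endpoint)
        
            print(base_url('s3-us-west-2.amazonaws.com'))
            print(base_url('s3-us-west-2.amazonaws.com', tls=False))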
        
        Here is an example s3.yaml
        ::
        
            ---
            s3: 
                access_key_id: "XXXXX"
                secret_access_key: "YYYYYYY"
                default_bucket: "ZZZZZZZ"
                endpoint: "s3-us-west-2.amazonaws.com"
        
        Next configure your S3 bucket permissions.  Eventually, s3 will support bucket 
        management.  Until then use Amazon's web interface:
        
        - Log onto your Amazon account.
        - Create a bucket or click on an existing bucket.
        - Click on Properties.
        - Click on Permissions.
        - Click on Edit Bucket Policy.
        
        Here is an example policy with the required permissions:
        ::
        
            {
                "Version": "2008-10-17",
                "Id": "Policyxxxxxxxxxxxxx",
                "Statement": [
                    {
                        "Sid": "Stmtxxxxxxxxxxxxx",
                        "Effect": "Allow",
                        "Principal": {
                            "AWS": "arn:aws:iam::xxxxxxxxxxxx:user/XXXXXXX"
                        },
                        "Action": [
                            "s3:AbortMultipartUpload",
                            "s3:GetObjectAcl",
                            "s3:GetObjectVersion",
                            "s3:DeleteObject",
                            "s3:DeleteObjectVersion",
                            "s3:GetObject",
                            "s3:PutObjectAcl",
                            "s3:PutObjectVersionAcl",
                            "s3:ListMultipartUploadParts",
                            "s3:PutObject",
                            "s3:GetObjectVersionAcl"
                        ],
                        "Resource": [
                            "arn:aws:s3:::com.prometheus.cgtest-1/*",
                            "arn:aws:s3:::com.prometheus.cgtest-1"
                        ]
                    }
                ]
            }
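        
        If you manage several buckets, a policy of this shape can be generated 
        rather than edited by hand.  The sketch below is one possible way to do 
        so; *bucket_policy* is a hypothetical helper, the action list is copied 
        from the example, and the optional Id and Sid fields are left out on the 
        assumption that they are not needed.
        ::
        
            import json
        
            # Actions copied from the example policy.
            ACTIONS = [
                's3:AbortMultipartUpload',
                's3:GetObjectAcl',
                's3:GetObjectVersion',
                's3:DeleteObject',
                's3:DeleteObjectVersion',
                's3:GetObject',
                's3:PutObjectAcl',
                's3:PutObjectVersionAcl',
                's3:ListMultipartUploadParts',
                's3:PutObject',
                's3:GetObjectVersionAcl',
            ]
        
            def bucket_policy(bucket, principal_arn):
                """Return the example policy as JSON for one bucket and one user."""
                policy = {
                    'Version': '2008-10-17',
                    'Statement': [{
                        'Effect': 'Allow',
                        'Principal': {'AWS': principal_arn},
                        'Action': ACTIONS,
                        # Both the bucket's contents and the bucket itself.
                        'Resource': [
                            'arn:aws:s3:::%s/*' % bucket,
                            'arn:aws:s3:::%s' % bucket,
                        ],
                    }],
                }
                return json.dumps(policy, indent=4, sort_keys=True)
        
            print(bucket_policy('com.prometheus.cgtest-1',
                                'arn:aws:iam::xxxxxxxxxxxx:user/XXXXXXX'))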
        
        Examples
        --------
        
        Once the YAML file is configured and the bucket policy is set, you can 
        instantiate an S3Connection and use that connection to instantiate a 
        Storage instance.
        ::
        
            import s3
            import yaml
            
            # safe_load parses plain YAML without constructing arbitrary objects
            with open('s3.yaml', 'r') as fi:
                config = yaml.safe_load(fi)
        
            connection = s3.S3Connection(**config['s3'])
            storage = s3.Storage(connection)
        
        Then you call methods on the Storage instance.  
        
        The following code uploads a file named "example" from the local filesystem as 
        "example-in-s3" in s3.  It then checks that "example-in-s3" exists in storage, 
        downloads the file as "example-from-s3", compares the original with the 
        downloaded copy to ensure they are the same, deletes "example-in-s3", and 
        finally checks that it is no longer in storage.
        ::
        
            import subprocess
            try:
                storage.write("example", "example-in-s3")
                exists, metadata = storage.exists("example-in-s3")
                assert exists
                metadata = storage.read("example-in-s3", "example-from-s3")
                assert 0 == subprocess.call(['diff', "example", "example-from-s3"])
                storage.delete("example-in-s3")
                exists, metadata = storage.exists("example-in-s3")
                assert not exists
            # assumes StorageError is exposed by the s3 package, like Storage
            except s3.StorageError as e:
                print('failed: %s' % e)
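        
        The comparison step above shells out to diff, which may not be available 
        on every platform.  One alternative, shown here as a sketch rather than as 
        part of the s3 module, is the standard library's filecmp module.  The 
        helper name *same_content* and the temporary files are only for 
        illustration.
        ::
        
            import filecmp
            import os
            import tempfile
        
            def same_content(path_a, path_b):
                """Compare two files by content (shallow=False reads the bytes)."""
                return filecmp.cmp(path_a, path_b, shallow=False)
        
            # Self-contained demonstration with two temporary files.
            workdir = tempfile.mkdtemp()
            original = os.path.join(workdir, 'example')
            downloaded = os.path.join(workdir, 'example-from-s3')
            for path in (original, downloaded):
                with open(path, 'wb') as fo:
                    fo.write(b'hello s3\n')
        
            print(same_content(original, downloaded))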
                
        The following code again uploads "example" as "example-in-s3".  This time it 
        uses the bucket "my_other_bucket" explicitly, and it sets some metadata and 
        checks that the metadata is set correctly.  Then it changes the metadata 
        and checks that as well.
        ::
        
            headers = {
                'x-amz-meta-state': 'unprocessed',
                }
            remote_name = s3.S3Name("example-in-s3", bucket="my_other_bucket")
            try:
                storage.write("example", remote_name, headers=headers)
                exists, metadata = storage.exists(remote_name)
                assert exists
                assert metadata == headers
                headers['x-amz-meta-state'] = 'processed'
                storage.update_metadata(remote_name, headers)
                metadata = storage.read(remote_name, "example-from-s3")
                assert metadata == headers
            # assumes StorageError is exposed by the s3 package, like Storage
            except s3.StorageError as e:
                print('failed: %s' % e)
        
        
Keywords: amazon,aws,s3,upload,download
Platform: Any
Classifier: Programming Language :: Python
Classifier: Intended Audience :: Developers
