Metadata-Version: 1.0
Name: parse2plone
Version: 0.6
Summary: Easily import static HTML files into Plone.
Home-page: http://aclark4life.github.com/Parse2Plone
Author: Alex Clark
Author-email: aclark@aclark.net
License: UNKNOWN
Description: Introduction
        ============
        
        ``Parse2Plone`` is an lxml/soup parser, packaged as a Buildout recipe that
        creates a script for you, for easily getting content from static HTML files
        into Plone.
        
        Warning
        -------
        
        This is a **Buildout recipe**! By itself it does nothing. If you do not know what
        Buildout is, please see: http://www.buildout.org/.
        
        Getting started
        ---------------
        
        Because it always drives me nuts when you have to dig for a recipe's options,
        here they are::
        
            [import]
            recipe = parse2plone
            #path = Plone
            #html_file_ext = html
            #image_file_ext = gif jpg jpeg png
            #target_tags = a div h1 h2 p
            #illegal_chars = _
        
        Everything but the ``recipe`` parameter is commented out; the parameters
        listed are shown with their default values. Uncomment and edit these if you
        would like to change the default behavior. They are (hopefully) self-explanatory.
        Now you can just cut and paste to get started, or keep reading if you would like
        to know more.
        
        Installation
        ------------
        
        You can install ``Parse2Plone`` by editing your *buildout.cfg* file like
        so:
        
        - First, add an ``import`` section::
        
              [import]
              recipe = parse2plone
        
        - Then, add the ``import`` section to the list of parts::
        
              [buildout]
              ...
              parts =
                  ...
                  import
        
        - Now run ``bin/buildout`` as usual.
        
        Execution
        ---------
        
        You can run ``Parse2Plone`` like so::
        
            $ bin/plone run bin/import /path/to/files
        
        Demonstration
        -------------
        
        If you have a site in /var/www/html that contains the following::
        
            /var/www/html/index.html
            /var/www/html/about/index.html
        
        You should run::
        
            $ bin/plone run bin/import /var/www/html
        
        And the following will be created:
        
        - http://localhost:8080/Plone/index.html
        - http://localhost:8080/Plone/about/index.html
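        
        The mapping above can be sketched in a few lines of Python. This only
        illustrates the path arithmetic and is not ``Parse2Plone``'s actual code;
        the helper name ``plone_url`` and the ``base_url`` default are assumptions
        made for the example::
        
            import posixpath
        
        
            def plone_url(file_path, import_root, site_path="Plone",
                          base_url="http://localhost:8080"):
                """Return the URL an imported file would get (hypothetical helper)."""
                # Path of the file relative to the directory given on the command line.
                relative = posixpath.relpath(file_path, import_root)
                # Join the pieces with single slashes.
                return "/".join([base_url.rstrip("/"), site_path.strip("/"), relative])
        
        
            print(plone_url("/var/www/html/index.html", "/var/www/html"))
            print(plone_url("/var/www/html/about/index.html", "/var/www/html"))
        
        Running it prints the two URLs listed above.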
        
        Explanation
        -----------
        
        Why did you create ``Parse2Plone`` when the following packages (and probably many
        more) already exist?
        
        - http://pypi.python.org/pypi/collective.transmogrifier
        - http://pypi.python.org/pypi/transmogrify.filesystem
        - http://pypi.python.org/pypi/transmogrify.htmlcontentextractor
        
        Here are a few reasons:
        
        - ``Parse2Plone`` aims to lower the bar for folks who don't already know
          (or want to know) what a "transmogrifier blueprint" is, but who can update
          their *buildout.cfg* file, run ``Buildout``, and then run a single import
          command to bring in static content from the file system, all without
          having to think very much.
        
        - ``collective.transmogrifier`` provides a framework for creating reusable
          pipes (whose definitions are called blueprints). ``Parse2Plone`` provides
          a single, non-reusable "pipe/blueprint".
        
        - The author had an itch to scratch; it will be nice for him to be able to say
          "just go write a script" and then point to an example.
        
        Consternation
        -------------
        
        Here are some troubleshooting tips.
        
        lxml
        ~~~~
        
        ``Parse2Plone`` requires ``lxml`` which in turn requires ``libxml2`` and
        ``libxslt``. If you do not have ``lxml`` installed "globally" (i.e. in your
        system Python's site-packages directory) then Buildout will try to install it
        for you.
        
        At this point ``lxml`` will look for the ``libxml2`` and ``libxslt``
        development libraries to build against. If they are not already installed on
        your system, *your mileage may vary* (i.e. Buildout will fail).
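        
        One quick way to find out which situation you are in is to ask Python
        whether it can locate ``lxml``. The snippet below is a minimal sketch using
        only the standard library; ``lxml_available`` is a hypothetical helper name,
        and the check is only meaningful when run with the same Python interpreter
        that Buildout uses::
        
            import importlib.util
        
        
            def lxml_available():
                """Return True if this Python can locate an lxml package."""
                return importlib.util.find_spec("lxml") is not None
        
        
            if lxml_available():
                print("lxml is importable")
            else:
                print("lxml missing; install the libxml2/libxslt development packages")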
        
        Database access
        ~~~~~~~~~~~~~~~
        
        Before running ``parse2plone``, you must either stop your Plone site or
        use ZEO. Otherwise ``parse2plone`` will not be able to access the
        database.
        
        Modification
        ------------
        
        Modifying the default behavior of ``parse2plone`` is easy; use the command
        line options or add parameters to your ``buildout.cfg`` file.
        
        Both approaches allow customization of the same set of options, but the
        command line arguments will trump any settings found in your ``buildout.cfg`` file.
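        
        This precedence can be pictured as a three-layer merge: built-in defaults,
        then ``buildout.cfg`` values, then command line values. The sketch below is
        an assumption about how such a merge could look, not the recipe's
        implementation; ``merge_options`` is a hypothetical name, and the defaults
        are copied from the Getting started section::
        
            DEFAULTS = {
                "path": "Plone",
                "html_file_ext": "html",
                "image_file_ext": "gif jpg jpeg png",
                "target_tags": "a div h1 h2 p",
                "illegal_chars": "_",
            }
        
        
            def merge_options(buildout_options, command_line_options):
                """Later layers win: defaults < buildout.cfg < command line."""
                merged = dict(DEFAULTS)
                merged.update(buildout_options)
                # Skip options the user did not pass on the command line (None).
                merged.update({key: value
                               for key, value in command_line_options.items()
                               if value is not None})
                return merged
        
        
            options = merge_options({"path": "Plone2"}, {"path": "MyPloneSite"})
            print(options["path"])  # MyPloneSite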
        
        Command line
        ~~~~~~~~~~~~
        
        The following ``parse2plone`` command line options are available.
        
        Path (``--path``, ``-p``)
        '''''''''''''''''''''''''
        
        You can specify an alternate path to the Plone site object located within
        the database ('/Plone' by default) with ``--path`` or ``-p``::
        
            $ bin/plone run bin/import /path/to/files --path=/path/to/Plone
            $ bin/plone run bin/import /path/to/files -p MyPloneSite
        
        Buildout
        ~~~~~~~~
        
        The following ``parse2plone`` recipe options are available.
        
        Parameters
        ''''''''''
        
        You can configure the following parameters in your ``buildout.cfg`` file:
        
        - ``path`` - Specify an alternate location for the Plone site object in the
          database.
        - ``html_file_ext`` - Specify HTML file extensions. ``parse2plone`` will
          import HTML files with these extensions.
        - ``illegal_chars`` - Specify illegal characters. ``parse2plone`` will ignore
          files that contain these characters.
        - ``image_file_ext`` - Specify image file extensions. ``parse2plone`` will
          import image files with these extensions.
        - ``target_tags`` - Specify target tags. ``parse2plone`` will parse the
          contents of HTML tags listed.
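        
        As a rough illustration of how ``html_file_ext``, ``image_file_ext``, and
        ``illegal_chars`` could work together, the sketch below sorts file names
        into HTML files, images, and ignored files. ``classify`` is a hypothetical
        helper, and checking only the file name (rather than the whole path) for
        illegal characters is an assumption, not documented behavior::
        
            import os
        
        
            def classify(filename, html_file_ext="html",
                         image_file_ext="gif jpg jpeg png", illegal_chars="_"):
                """Return 'html', 'image', or None (ignored) for a file name."""
                name = os.path.basename(filename)
                # Ignore files whose name contains any illegal character.
                if any(char in name for char in illegal_chars.split()):
                    return None
                extension = os.path.splitext(name)[1].lstrip(".")
                if extension in html_file_ext.split():
                    return "html"
                if extension in image_file_ext.split():
                    return "image"
                return None
        
        
            print(classify("about/index.html"))  # html
            print(classify("logo.png"))          # image
            print(classify("_notes.html"))       # None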
        
        Example
        '''''''
        
        Instead of accepting the default behavior, you may specify the following
        configuration in your ``buildout.cfg`` file::
        
            [import]
            recipe = parse2plone
            path = Plone2
            html_file_ext = htm
            image_file_ext = png
            target_tags = p
        
        This configures ``parse2plone`` to import only images ending in ``.png`` and
        content in ``p`` tags from files ending in ``.htm``, into a Plone site object
        named ``Plone2``.
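        
        To give a feel for what restricting ``target_tags`` to ``p`` means, here is
        a small sketch that keeps only the text found inside the listed tags. It
        uses the standard library's ``html.parser`` rather than the lxml/soup parser
        that ``parse2plone`` itself relies on, and the class name ``TargetTagText``
        is hypothetical::
        
            from html.parser import HTMLParser
        
        
            class TargetTagText(HTMLParser):
                """Collect text found inside the target tags (illustrative only)."""
        
                def __init__(self, target_tags):
                    super().__init__()
                    self.target_tags = set(target_tags.split())
                    self.depth = 0  # number of target tags currently open
                    self.chunks = []
        
                def handle_starttag(self, tag, attrs):
                    if tag in self.target_tags:
                        self.depth += 1
        
                def handle_endtag(self, tag):
                    if tag in self.target_tags and self.depth:
                        self.depth -= 1
        
                def handle_data(self, data):
                    # Keep non-empty text only while inside a target tag.
                    if self.depth and data.strip():
                        self.chunks.append(data.strip())
        
        
            parser = TargetTagText("p")
            parser.feed("<h1>Title</h1><p>Hello</p><div>skipped</div><p>World</p>")
            parser.close()
            print(parser.chunks)  # ['Hello', 'World']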
        
        Communication
        -------------
        
        Questions, comments, or concerns? Email: aclark@aclark.net
        
        History
        -------
        
        0.6 - 10/25/2010
        ~~~~~~~~~~~~~~~~
        
        - No really, revert 'add Plone to install_requires'
        - Add configurable options for: ``path``, ``illegal_chars``,
          ``html_extensions``, ``image_extensions``, and ``target_tags``
        - Allow user to set all configurable options via both ``buildout.cfg``
          and command line arguments
        - Refactor utility functions
        - Add ``tests.py``
        
        0.5 - 10/22/2010
        ~~~~~~~~~~~~~~~~
        
        - Revert 'add Plone to install_requires'
        
        0.4 - 10/22/2010
        ~~~~~~~~~~~~~~~~
        
        - Add 'Plone' to install_requires
        
        0.3 - 10/22/2010
        ~~~~~~~~~~~~~~~~
        
        - Another setuptools fix
        
        0.2 - 10/22/2010
        ~~~~~~~~~~~~~~~~
        
        - Setuptools fix
        
        0.1 - 10/21/2010
        ~~~~~~~~~~~~~~~~
        
        - Initial release
        
Platform: UNKNOWN
Classifier: Framework :: Buildout
Classifier: Framework :: Plone
