==============================
Installation and configuration
==============================


.. contents:: Installation and Configuration
   :local:
   :depth: 1


One process application on a multiple core machine
==================================================

See :ref:`quick start on multiple core machine <quick_start>`.

.. _light_mode_config:

One process application on clusters
===================================

This mode is called light mode. It is the installation you need if you are
not interested in the remote access and disconnection features (see :ref:`client_server_application_concept`).

Requirements
------------

The requirements are identical to the server installation requirements.
If you do not intend to access another computing resource with the remote
access feature, the installation of Pyro can be skipped.

The requirements are thus:

* A distributed resource management system (DRMS) such as Grid Engine, Condor,
  Torque/PBS, LSF, etc.
* An implementation of `DRMAA <http://www.drmaa.org/>`_ 1.0 for the DRMS in C
* Python *version 2.5 or more*
* `SIP <http://wiki.python.org/moin/SIP>`_ *version 4.10 or more*
* SQLite *version 3 or more*
* For the GUI: Qt *version 4.6.2 or more*, `PyQt <http://www.riverbankcomputing.co.uk/software/pyqt/intro>`_ *version 4.7.2 or more* and `matplotlib <http://matplotlib.sourceforge.net/>`_ *version 0.99 or more*

More details about the implementation of DRMAA can be found in the server 
installation section (see :ref:`server_requirements`).


Installation
------------


1. *install the soma-workflow python module and compile the sip module => under construction*

2. Choose a resource identifier for the computing resource, ex: "My laptop light mode"

3. Create a configuration file (see :ref:`light_mode_configuration`) at the location $HOME/.soma-workflow.cfg. You can also choose your own path for the configuration file and set the "SOMA_WORKFLOW_CONFIG" environment variable with this path or put it in the /etc/ directory.  
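
The lookup described in step 3 can be sketched in Python. This is only an
illustration: the function name ``resolve_config_path`` is hypothetical, the
precedence (environment variable first, then the home directory) is an
assumption, and the ``/etc/`` location is left out because the file name
expected there is not specified in this section.

.. code-block:: python

   import os

   def resolve_config_path(environ=None):
       """Return the configuration file path a user would likely expect.

       Assumed order: SOMA_WORKFLOW_CONFIG if set, else $HOME/.soma-workflow.cfg.
       """
       environ = os.environ if environ is None else environ
       custom = environ.get("SOMA_WORKFLOW_CONFIG")
       if custom:
           return custom
       home = environ.get("HOME", os.path.expanduser("~"))
       return os.path.join(home, ".soma-workflow.cfg")

   print(resolve_config_path())

Passing a plain dictionary instead of ``os.environ`` makes it easy to check
both cases without touching the real environment.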


.. _light_mode_configuration:

Configuration
-------------

This section defines the required and optional configuration items for the light 
mode. 

The configuration file syntax is the `ConfigParser 
<http://docs.python.org/library/configparser.html>`_  syntax. All the 
configuration items needed are defined in one section. You can choose the 
section name, for example "My laptop light mode".

Configuration file example: ::

  [My laptop light mode]

  LIGHT_MODE    = True

  DATABASE_FILE       = path/soma_workflow.db
  TRANSFERED_FILE_DIR = path/transfered_files
  

If you want to use the same application as a client to other computing resources, 
make sure that Pyro was installed and add the configuration items required in the 
configuration file as described here: :ref:`client_configuration`.

Configuration file example: ::
 
  [My laptop light mode]

  LIGHT_MODE    = True

  DATABASE_FILE       = path/soma_workflow.db
  TRANSFERED_FILE_DIR = path/transfered_files
  
 
  [Titan]

  CLUSTER_ADDRESS     = titan.mylab.fr
  SUBMITTING_MACHINES = titan0
  
  QUEUES = test long 

  
  [LabDesktops]

  CLUSTER_ADDRESS     = mydesktop.mylab.fr
  SUBMITTING_MACHINES = mydesktop



Required configuration items
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  **LIGHT_MODE**
    This item can be set to any value; however, it must be defined to use
    soma-workflow in the light mode.

  **DATABASE_FILE**
    Path of the SQLite database file. The file will be created the first time
    the application is launched.

  **TRANSFERED_FILE_DIR**
    Path of the directory where the transferred files will be copied. The
    directory must be empty and will be managed entirely by soma-workflow.
    This item is required to let you run workflows with file transfers for
    test purposes in the light mode.

    .. warning::
      Do not copy any file in this directory. Soma_workflow manages the 
      entire directory and might delete any external file.
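
A quick way to check that a light mode section defines the three required
items is sketched below. It relies only on Python 3's standard ``configparser``
module; the helper ``missing_items`` and the constant names are hypothetical
and are not part of soma-workflow, which may validate its configuration
differently.

.. code-block:: python

   import configparser

   REQUIRED_ITEMS = ("LIGHT_MODE", "DATABASE_FILE", "TRANSFERED_FILE_DIR")

   EXAMPLE = """
   [My laptop light mode]
   LIGHT_MODE          = True
   DATABASE_FILE       = path/soma_workflow.db
   TRANSFERED_FILE_DIR = path/transfered_files
   """

   def missing_items(config_text, section):
       """Return the required light mode items absent from ``section``."""
       parser = configparser.RawConfigParser()
       parser.read_string(config_text)
       # configparser matches option names case-insensitively by default.
       return [item for item in REQUIRED_ITEMS
               if not parser.has_option(section, item)]

   print(missing_items(EXAMPLE, "My laptop light mode"))

An empty list means that every required item is present in the section.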

Optional configuration items
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Many optional configuration items can be added to customize the installation, see
:ref:`server_configuration` for a full list of the items and their description.
  

Start the GUI
-------------

The command *soma_workflow_gui* starts the GUI.

.. seealso:: :ref:`gui` for the GUI documentation.


Use the Python API
-------------------

Import the soma.workflow.client module to use the Python API.

.. seealso:: :ref:`client_api` for soma-workflow Python API documentation or :ref:`examples` for a fast start.

.. _client_intall_config:


Client-server application: Client
=================================

This page describes the installation and configuration of the soma-workflow
client. The installation of a soma-workflow client assumes that at least one
soma-workflow database server is installed on a remote or a local computing
resource (see :ref:`server`).


Requirements
------------

* Python *version 2.5 or more*
* `Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_ *version 3.10 or more*
* `Paramiko <http://www.lag.net/paramiko/>`_ *version 1.7 or more*. Paramiko is
  only required if the computing resource is remote.
* For the GUI: Qt *version 4.6.2 or more*, `PyQt <http://www.riverbankcomputing.co.uk/software/pyqt/intro>`_ *version 4.7.2 or more* and `matplotlib <http://matplotlib.sourceforge.net/>`_ *version 0.99 or more*

Installation
------------

1. *Install the soma-workflow python modules and command. => under construction*

2. Create a configuration file (see :ref:`client_configuration`) at the location $HOME/.soma-workflow.cfg. You can also choose your own path for the configuration file and set the "SOMA_WORKFLOW_CONFIG" environment variable or put it in the /etc/ directory.

..
  The client only need the soma-workflow modules to be installed and the
  soma_workflow_gui file to be located in the PATH


.. _client_configuration:

Configuration
-------------

The configuration syntax is the `ConfigParser <http://docs.python.org/library/configparser.html>`_  syntax.

There is one section for each computing resource (that is for each soma-workflow
database server). 

Only three items are used:

  * CLUSTER_ADDRESS
  * SUBMITTING_MACHINES
  * QUEUES

The first two items are mandatory and the last one is
optional. The values of these configuration items are set at each
soma-workflow server installation (see :ref:`server_configuration`). Ask the
soma-workflow administrator for these values if you did not install the server yourself.

Configuration file example: ::

  [Titan]

  CLUSTER_ADDRESS     = titan.mylab.fr
  SUBMITTING_MACHINES = titan0
  
  QUEUES = test long 


  [LabDesktops]

  CLUSTER_ADDRESS     = mydesktop.mylab.fr
  SUBMITTING_MACHINES = mydesktop
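
As an illustration, the example above can be read with Python 3's standard
``configparser`` module. The sketch below is not how soma-workflow itself
loads the file; the helper ``load_resources`` and the dictionary keys are
hypothetical. It shows that each section is one computing resource, that the
list-valued items are separated by spaces, and that QUEUES may be absent.

.. code-block:: python

   import configparser

   CLIENT_CONFIG = """
   [Titan]
   CLUSTER_ADDRESS     = titan.mylab.fr
   SUBMITTING_MACHINES = titan0
   QUEUES = test long

   [LabDesktops]
   CLUSTER_ADDRESS     = mydesktop.mylab.fr
   SUBMITTING_MACHINES = mydesktop
   """

   def load_resources(config_text):
       """Map each resource identifier (section name) to its client items."""
       parser = configparser.RawConfigParser()
       parser.read_string(config_text)
       resources = {}
       for name in parser.sections():
           resources[name] = {
               "cluster_address": parser.get(name, "CLUSTER_ADDRESS"),
               # Space-separated list of hosts, e.g. "host1 host2 host3".
               "submitting_machines": parser.get(name, "SUBMITTING_MACHINES").split(),
               # QUEUES is optional, so fall back to an empty list.
               "queues": parser.get(name, "QUEUES", fallback="").split(),
           }
       return resources

   for name, items in load_resources(CLIENT_CONFIG).items():
       print(name, items)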


Start the GUI
-------------

The command *soma_workflow_gui* starts the GUI.

.. seealso:: :ref:`gui` for the GUI documentation.


Use the Python API
------------------

Import the soma.workflow.client module to use the Python API.

.. seealso:: :ref:`client_api` for soma-workflow Python API documentation or :ref:`examples` for a fast start.




.. _server:


Client-server application: Server 
=================================

This page explains how to configure, install and run the soma-workflow
database server. 

In the client-server mode, the communication between the processes is done using
`Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_. The server is registered on a Pyro
name server. The Workflow Engines query the Pyro name server for the
location of the database server.

.. _server_requirements:

Requirements
------------

.. image:: images/third_parties.*
  :scale: 50

Here is the list of the server dependencies:

* A distributed resource management system (DRMS) such as Grid Engine, Condor,
  Torque/PBS, LSF, etc.
* An implementation of `DRMAA <http://www.drmaa.org/>`_ 1.0 for the DRMS in C
* Python *version 2.5 or more*
* `SIP <http://wiki.python.org/moin/SIP>`_ *version 4.10 or more*
* `Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_ *version 3.10 or more*
* `SQLite <http://docs.python.org/library/sqlite3.html>`_ *version 3 or more*

The implementations of DRMAA tested successfully with soma-workflow:
  
  ===================  ==========================
  DRMS                 DRMAA implementation
  ===================  ==========================
  Torque 2.0.0         FedStage PBS DRMAA 1.0*
  LSF 7.0              FedStage LSF DRMAA 1.0.3
  Grid Engine 6.2u5&6  Embedded implementation
  Condor 7.4.0         Embedded implementation
  ===================  ==========================

\* set soma-workflow DRMAA_IMPLEMENTATION configuration item to 'PBS' when 
using this implementation
 
.. _server_installation:

Server installation
-------------------

1. *install the soma-workflow python module and compile the sip module => under construction*

2. Choose a resource identifier for the computing resource, ex: "Titan"

3. Create a configuration file (see :ref:`server_configuration`) at the location $HOME/.soma-workflow.cfg. You can also choose your own path for the configuration file and set the "SOMA_WORKFLOW_CONFIG" environment variable with this path or put it in the /etc/ directory.  

4. Start a Pyro name server with the command pyro-ns -m

5. Run the command python -m soma.workflow.start_database_server "Titan". The command will:

  * start the database server
  * create the SQLite database if the database file does not exist 
  * register the database server to the Pyro name server


.. _server_configuration:

Server configuration
--------------------

This section defines the required and optional configuration items. 

The configuration file syntax is the `ConfigParser <http://docs.python.org/library/configparser.html>`_  syntax. All the configuration items are defined in one section. The name of 
the section is the resource identifier (ex: "Titan").


Configuration file example: ::

  [Titan]

  DATABASE_FILE        = path/soma_workflow.db
  TRANSFERED_FILES_DIR = path/transfered_files
  NAME_SERVER_HOST     = titan0
  SERVER_NAME          = soma_workflow_server_for_titan
  
  # optional limitation of the jobs in various queues
  MAX_JOB_IN_QUEUE = {10} test{50} long{3}

  # optional logging
  SERVER_LOG_FILE   = path/logs/log_server
  SERVER_LOG_FORMAT = %(asctime)s => line %(lineno)s: %(message)s
  SERVER_LOG_LEVEL  = ERROR
  ENGINE_LOG_DIR    = path/logs/
  ENGINE_LOG_FORMAT = %(asctime)s => line %(lineno)s: %(message)s
  ENGINE_LOG_LEVEL  = ERROR

  # remote access information
  CLUSTER_ADDRESS     = titan.mylab.fr
  SUBMITTING_MACHINES = titan0
  


Configuration items required on the server side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  
  **DATABASE_FILE**
    Path of the SQLite database file. The file will be created the first time
    the database server is started.

  **TRANSFERED_FILES_DIR**
    Path of the directory where the transferred files will be copied. The
    directory must be empty and will be managed entirely by soma-workflow.

    .. warning::
      Do not copy any file in this directory. Soma_workflow manages the 
      entire directory and might delete any external file.

  **NAME_SERVER_HOST**
    Host where the Pyro name server runs.
    
  **SERVER_NAME**
    Name of the database server registered on the Pyro name server.

.. _conf_server_option:

Configuration items optional on the server side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  **MAX_JOB_IN_QUEUE**
    Maximum number of jobs in each queue. If a queue does not appear here,
    soma-workflow considers that there is no limitation.
    The syntax is "{default_queue_max_nb_jobs} queue_name1{max_nb_jobs_1} queue_name_2{max_nb_job_2}". Example: "{5} test{20}"

  **PATH_TRANSLATION_FILES**
    Specify here the shared resource path translation files, mandatory to use 
    the SharedResourcePath objects (see :ref:`shared-resource-path-concept`). 
    Each translation file is associated with a namespace. That way, several
    applications can use the same identifiers without risk of conflict.
    The syntax is "namespace1{translation_file_path_11}
    namespace1{translation_file_path_12} namespace2{translation_file_path_2}"

  **DRMAA_IMPLEMENTATION**
    Set this item to "PBS" if you use the FedStage PBS DRMAA 1.0 implementation;
    otherwise it does not have to be set.
    Soma-workflow is designed to be independent of the DRMS and the DRMAA
    implementation. However, we found two bugs in the FedStage PBS DRMAA 1.0
    implementation and worked around them temporarily by writing specific code
    for this implementation in soma-workflow at 2 locations (soma.workflow.engine
    Drmaa class: __init__ and submit_job method).
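
The ``name{value}`` syntax shared by MAX_JOB_IN_QUEUE and
PATH_TRANSLATION_FILES can be illustrated with a small parser. This is a
sketch written for this documentation, not the parser used by soma-workflow;
all function names are hypothetical, and representing the default queue by an
empty name is a choice made only for the example.

.. code-block:: python

   import re

   # An optional name immediately followed by a value in braces.
   _ITEM = re.compile(r"([^\s{}]*)\{([^{}]*)\}")

   def parse_braced_items(value):
       """Split 'name1{v1} name2{v2}' into a list of (name, value) pairs."""
       return _ITEM.findall(value)

   def parse_max_job_in_queue(value):
       """Return {queue_name: limit}; the default queue has an empty name."""
       return {name: int(limit) for name, limit in parse_braced_items(value)}

   def parse_path_translation_files(value):
       """Return {namespace: [paths]}; a namespace may appear several times."""
       files = {}
       for namespace, path in parse_braced_items(value):
           files.setdefault(namespace, []).append(path)
       return files

   print(parse_max_job_in_queue("{10} test{50} long{3}"))

With the value from the configuration file example, the default limit is 10
jobs, the ``test`` queue accepts 50 jobs and the ``long`` queue accepts 3.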

Logging configuration:

  **SERVER_LOG_FILE** 
    Server log file path.

  **SERVER_LOG_LEVEL** 
    Server logging level as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.

  **SERVER_LOG_FORMAT** 
    Server logging format as defined in the `logging <http://docs.python.org/library/logging.html>`_ module.

  **ENGINE_LOG_DIR** 
    Directory path where to store Workflow Engine log files.

  **ENGINE_LOG_LEVEL** 
    Workflow Engine logging level as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.

  **ENGINE_LOG_FORMAT** 
    Workflow Engine logging format as defined in the `logging
    <http://docs.python.org/library/logging.html>`_ module.
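
To see what a given format and level produce, the values can be tried
directly with the standard ``logging`` module. The sketch below assumes
Python 3 and reads the items with ``RawConfigParser`` so that the ``%(...)s``
placeholders are not interpreted as ConfigParser interpolation; how
soma-workflow itself reads these items is not shown here, and the record
values (``demo.py``, line 42) are made up for the example.

.. code-block:: python

   import configparser
   import logging

   SERVER_CONFIG = """
   [Titan]
   SERVER_LOG_FORMAT = %(asctime)s => line %(lineno)s: %(message)s
   SERVER_LOG_LEVEL  = ERROR
   """

   # RawConfigParser performs no '%' interpolation, so the logging
   # placeholders are returned unchanged.
   parser = configparser.RawConfigParser()
   parser.read_string(SERVER_CONFIG)

   log_format = parser.get("Titan", "SERVER_LOG_FORMAT")
   # Level names such as "ERROR" are attributes of the logging module.
   log_level = getattr(logging, parser.get("Titan", "SERVER_LOG_LEVEL"))

   # Format one hand-made record to preview the resulting log line.
   formatter = logging.Formatter(log_format)
   record = logging.LogRecord("demo", log_level, "demo.py", 42, "hello", None, None)
   print(formatter.format(record))

The printed line starts with the current timestamp and ends with
``=> line 42: hello``.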

Parallel job configuration:
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The items described here concern the parallel job configuration. A parallel job
uses several CPUs and involves parallel code: MPI, OpenMP for example. 

.. warning:: 
  The documentation is under construction.


..
  OCFG_PARALLEL_COMMAND = "drmaa_native_specification"
  OCFG_PARALLEL_JOB_CATEGORY = "drmaa_job_category"

  .. 
    PARALLEL_DRMAA_ATTRIBUTES = [OCFG_PARALLEL_COMMAND, OCFG_PARALLEL_JOB_CATEGORY]

  OCFG_PARALLEL_PC_MPI="MPI"
  OCFG_PARALLEL_PC_OPEN_MP="OpenMP"

  ..
    PARALLEL_CONFIGURATIONS = [OCFG_PARALLEL_PC_MPI, OCFG_PARALLEL_PC_OPEN_MP]

  OCFG_PARALLEL_ENV_MPI_BIN = 'SOMA_JOB_MPI_BIN'
  OCFG_PARALLEL_ENV_NODE_FILE = 'SOMA_JOB_NODE_FILE'

  ..
    PARALLEL_JOB_ENV = [OCFG_PARALLEL_ENV_MPI_BIN, OCFG_PARALLEL_ENV_NODE_FILE]
 

Configuration items required on the client side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  **CLUSTER_ADDRESS** 
    Address of the host to log on to, that is, the host which has to be used
    to access the cluster remotely.

  **SUBMITTING_MACHINES** 
    Addresses of the submitting hosts, that is, the hosts from which the jobs
    are supposed to be submitted. In most cases, there is only one
    submitting host. The addresses are local to the cluster.
    Syntax: "host1 host2 host3"


.. _conf_client_option:

Configuration items optional on the client side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  **QUEUES**
    List of the available queues. This item is only used in the GUI to make
    selecting the queue easier when submitting a workflow.
    Syntax: "queue1 queue2 queue3"
    

