==============================
Installation and configuration
==============================


.. contents:: Installation and Configuration
   :local:
   :depth: 1


Mono process application on a multiple core machine
===================================================

**No configuration is needed to use Soma-workflow on a multiple core machine.**

See `Soma-workflow main page <http://brainvisa.info/soma-workflow>`_ for installation. 


.. _client_intall_config:


Client-server application: Client
=================================

This page describes the installation and configuration of the Soma-workflow
client. The installation of a Soma-workflow client supposes that at least one 
Soma-workflow database servers is installed on a remote or a local computing 
resources (see :ref:`server`).


Requirements
------------

* Python *version 2.5 or more*
* `Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_ *version 3.10 or more*
* `Paramiko <http://www.lag.net/paramiko/>`_ *version 1.7 or more*. Paramiko in   
  only required if the computing resource is remote. 
* For the GUI: Qt *version 4.6.2 or more*, `PyQt <http://www.riverbankcomputing.co.uk/software/pyqt/intro>`_ *version 4.7.2 or more*, or `PySide <http://www.pyside.org>`_ *version 1.1.1 or more* and optionally `matplotlib <http://matplotlib.sourceforge.net/>`_ *version 0.99 or more*

Installation
------------

1. Download: `soma-workflow 2.2 <ftp://ftp.cea.fr/pub/dsv/anatomist/binary/soma-workflow-2.2.tar.bz2>`_

2. Installation:

      :: 

        $ python setup.py build

        $ python setup.py install --prefix my_directory
        or 
        $ sudo python setup.py install

3. Create a configuration file (see :ref:`client_configuration`) at the location $HOME/.soma-workflow.cfg. You can also choose your own path for the configuration file and set the "SOMA_WORKFLOW_CONFIG" environment variable or put it in the /etc/ directory.

Start the GUI with the command *soma_workflow_gui* and/or import the soma.workflow.client module to use the Python API.

.. seealso:: :ref:`gui` for the GUI documentation, :ref:`client_api` for Soma-workflow Python API documentation and :ref:`examples`.


..
  The client only need the Soma-workflow modules to be installed and the
  soma_workflow_gui file to be located in the PATH


.. _client_configuration:

Configuration
-------------

The configuration syntax is the `ConfigParser <http://docs.python.org/library/configparser.html>`_  syntax.

There is one section for each computing resource (that is for each Soma-workflow
database server). 

Only four items are required:
 
  * CLUSTER_ADDRESS
  * SUBMITTING_MACHINES
  * QUEUES 
  * LOGIN

The two first items are mandatory and the last ones are 
optional. The values of these configuration items are set up at each 
Soma-workflow server installation (see :ref:`server_configuration`). Ask these item 
values to the Soma-workflow administrator if you did not install the server yourself.  

Configuration file example (2 computing resources are configured): ::

  [Titan]

  CLUSTER_ADDRESS     = titan.mylab.fr
  SUBMITTING_MACHINES = titan0
  
  QUEUES = test long 
  LOGIN = my_login_on_titan


  [LabDesktops]

  CLUSTER_ADDRESS     = mydesktop.mylab.fr
  SUBMITTING_MACHINES = mydesktop

*Configuration items required on the client side*

  **CLUSTER_ADDRESS** 
    Address of the host computing resource host to log on.
    Address of the host which has to be used to access the cluster remotely.

  **SUBMITTING_MACHINES** 
    Address of the submitting hosts, that is the hosts from which the jobs 
    are supposed to be submitted. In most of the cases, there is only one 
    submitting host. The addresses are local on the cluster.
    Syntax: "host1 host2 host3"


.. _conf_client_option:

*Configuration items optional on the client side:*

  **QUEUES**
    List of the available queues. This item is only used in the GUI to make 
    easier the selection of the queue when submitting a workflow.
    Syntax: "queue1 queue2 queue3"

  **LOGIN**
    To pre-fill the login field in the GUI when login to the resource. 



.. _server:


Client-server application: Server 
=================================

This page explains how to configure, install and run the Soma-workflow
database server. 

In the client-server mode the communication between the processes is done using 
`Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_. The server is registered on a Pyro
name server. The Workflow Engines query the Pyro name server for the
location the database server. 

.. _server_requirements:

Requirements
------------

.. image:: images/third_parties.*
  :scale: 40

Here is the list of the server dependencies:

* A distributed resource management system (DRMS) such as Grid Engine, Condor, 
  Torque/PBS, LSF..
* A implementation of `DRMAA <http://www.drmaa.org/>`_ 1.0 for the DRMS in C 
* Python *version 2.5 or more*
* `SIP <http://wiki.python.org/moin/SIP>`_ *version 4.10 or more*
* `Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_ *version 3.10 or more*
* `SQLite <http://docs.python.org/library/sqlite3.html>`_ *version 3 or more*

The implementations of DRMAA tested successfully with Soma-workflow:
  
  ===================  ==========================
  DRMS                 DRMAA implementation
  ===================  ==========================
  Torque 2.0.0         FedStage PBS DRMAA 1.0*
  LSF 7.0              FedStage LSF DRMAA 1.0.3
  Grid Engine 6.2u5&6  Embeded implementation
  Condor 7.4.0         Embeded implementation
  ===================  ==========================

\* set Soma-workflow DRMAA_IMPLEMENTATION configuration item to 'PBS' when 
using this implementation

.. _sip_drmaa:

Installation to use with DRMAA
------------------------------

The use of Soma-workflow with DRMAA requires the generation of a SIP Python 
binding for the DRMAA C library. This section describes a way to compile and 
install this binding together with Soma-workflow:

1. Download bv_maker (a CMake customization of the `BrainVISA project <http://brainvisa.info>`_)
    
    ..
         svn --username brainvisa --password Soma2009 export https://bioproj.extra.cea.fr/neurosvn/brainvisa/development/brainvisa-cmake/branches/1.0 /tmp/brainvisa-cmake
        cd /tmp/brainvisa-cmake
        cmake -DCMAKE_INSTALL_PREFIX=. .
        make install


  ::
    
    svn --username brainvisa --password Soma2009 export https://bioproj.extra.cea.fr/neurosvn/brainvisa/source_views/brainvisa-cmake /tmp/brainvisa-cmake
    cd /tmp/brainvisa-cmake
    cmake -DCMAKE_INSTALL_PREFIX=. latest_release
    make install

2. Create the file: $HOME/.brainvisa/bv_maker.cfg with the content: 

  ::

    [source <source_directory>]
      + soma-workflow latest_release

    [build <build_directory>]
      soma-workflow latest_release <source_directory>

  replacing <source_directory> and <build_directory> by the paths you choose
  for Soma-workflow. For example:

  ::

    [source $HOME/soma-workflow/source]
      + soma-workflow latest_release

    [build $HOME/soma-workflow/build]
      soma-workflow latest_release $HOME/soma-workflow/source

  Note that you can also replace "latest_release" to obtain another version: for 
  example: "2.1.0" or trunk" for the version in development or "bug_fix" 
  for the bug fixes of the latest release.


3. Download Soma-workflow sources (use username "brainvisa" and password "Soma2009" if needed), configure and build with CMake and make:

  ::

    /tmp/brainvisa-cmake/bin/bv_maker sources 

    /tmp/brainvisa-cmake/bin/bv_maker configure

    /tmp/brainvisa-cmake/bin/bv_maker build

4. You can remove the temporary directory

  ::
    
    rm -rf /tmp/brainvisa-cmake

5. Set the environment variables:
  
  ::

    . <build_directory>/bin/bv_env.sh <build_directory>

  
 
.. _server_installation:

Server creation
---------------

1. Install the Soma-workflow python modules and generate the SIP Python binding for the DRMAA C library if needed. (see :ref:`sip_drmaa`)

2. Choose a resource identifier for the computing resource, ex: "Titan"

3. Create a configuration file (see :ref:`server_configuration`) at the location $HOME/.soma-workflow.cfg. You can also choose your own path for the configuration file and set the "SOMA_WORKFLOW_CONFIG" environment variable with this path or put it in the /etc/ directory.  

2. Start a Pyro name server with the command pyro-ns -m 

4. Run the command python -m soma.workflow.start_database_server "Titan". The command will:

  * start the database server
  * create the SQLite database if the database file does not exist 
  * register the database server to the Pyro name server


.. _server_configuration:

Server configuration
--------------------

This section defines the required and optional configuration items. 

The configuration file syntax is the `ConfigParser <http://docs.python.org/library/configparser.html>`_  syntax. All the configuration items are defined in one section. The name of 
the section is the resource identifier (ex: "Titan").


Configuration file example: ::

  [Titan]

  DATABASE_FILE        = path/soma_workflow.db
  TRANSFERED_FILES_DIR = path/transfered_files
  NAME_SERVER_HOST     = titan0
  SERVER_NAME          = soma_workflow_server_for_titan
  
  # optional limitation of the jobs in various queues
  MAX_JOB_IN_QUEUE = {10} test{50} long{3}

  # optional logging
  SERVER_LOG_FILE   = path/logs/log_server
  SERVER_LOG_FORMAT = %(asctime)s => line %(lineno)s: %(message)s
  SERVER_LOG_LEVEL  = ERROR
  ENGINE_LOG_DIR    = path/logs/
  ENGINE_LOG_FORMAT = %(asctime)s => line %(lineno)s: %(message)s
  ENGINE_LOG_LEVEL  = ERROR

  # remote access information
  CLUSTER_ADDRESS     = titan.mylab.fr
  SUBMITTING_MACHINES = titan0
  


Configuration items required on the server side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  
  **DATABASE_FILE**
    Path of the SQLite database file. The file will be created the first time 
    the database server will be started. 

  **TRANSFERED_FILES_DIR**
    Path of the directory where the transfered files will be copied. The
    directory must be empty and will be managed entirely by Soma-workflow. 

    .. warning::
      Do not copy any file in this directory. Soma_workflow manages the 
      entire directory and might delete any external file.

  **NAME_SERVER_HOST**
    Host where the Pyro name server runs.
    
  **SERVER_NAME**
    Name of the database server regitered on the Pyro name server.

.. _conf_server_option:

Configuration items optional on the server side:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  **MAX_JOB_IN_QUEUE**
    Maximum number of job in each queue. If a queue does not appear here, 
    Soma-workflow considers that there is no limitation.
    The syntax is "{default_queue_max_nb_jobs} queue_name1{max_nb_jobs_1} queue_name_2{max_nb_job_2}". Example: "{5} test{20}"

  **PATH_TRANSLATION_FILES**
    Specify here the shared resource path translation files, mandatory to use 
    the SharedResourcePath objects (see :ref:`shared-resource-path-concept`). 
    Each translation file is associated with a namespace. That way several 
    applications can use the same  identifiers without risk.
    The syntax is "namespace_1{translation_file_path_11} 
    namespace1{translation_file_path_12} namespace2{translation_file_path_2}"

  **DRMAA_IMPLEMENTATION**
    Set this item to "PBS" if you use FedStage PBS DRMAA 1.0 implementation,
    otherwise it does not has to be set.
    Soma-workflow is designed to be independent of the DRMS and the DRMAA 
    implementation. However, we found two bugs in the FedStage PBS DRMAA 1.0 
    implementation, and correct it temporarily writing specific code for this 
    implementation in Soma-workflow at 2 locations (soma.workflow.engine Drmaa 
    class: __init__ and submit_job method). 

  **NATIVE_SPECIFICATION**
    Some specific option/function of the computing resource you want to use
    might not be available among the list of Soma-workflow Job attributes. 
    Use the native specification attribute to use these specific functionalities.
    Once configured it will be applied to every jobs submitted to the resource
    unless a different value is specified in the Job attribute native_specification.

    *Example:* Specification of a job walltime and more:
      * using a PBS cluster: NATIVE_SPECIFICATION= -l walltime=10:00:00,pmem=16gb
      * using a SGE cluster: NATIVE_SPECIFICATION= -l h_rt=10:00:00


Logging configuration:

  **SERVER_LOG_FILE** 
    Server log file path.

  **SERVER_LOG_LEVEL** 
    Server logging level as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.

  **SERVER_LOG_FORMAT** 
    Server logging format as defined in the `logging <http://docs.python.org/library/logging.html>`_ module.

  **ENGINE_LOG_DIR** 
    Directory path where to store Workflow Engine log files.

  **ENGINE_LOG_LEVEL** 
    Workflow Engine logging level as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.

  **ENGINE_LOG_FORMAT** 
    Workflow Engine logging level as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.

Parallel job configuration:
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The items described here concern the parallel job configuration. A parallel job
uses several CPUs and involves parallel code: MPI, OpenMP for example. 

.. warning:: 
  The documentation is under construction.


..
  OCFG_PARALLEL_COMMAND = "drmaa_native_specification"
  OCFG_PARALLEL_JOB_CATEGORY = "drmaa_job_category"

  .. 
    PARALLEL_DRMAA_ATTRIBUTES = [OCFG_PARALLEL_COMMAND, OCFG_PARALLEL_JOB_CATEGORY]

  OCFG_PARALLEL_PC_MPI="MPI"
  OCFG_PARALLEL_PC_OPEN_MP="OpenMP"

  ..
    PARALLEL_CONFIGURATIONS = [OCFG_PARALLEL_PC_MPI, OCFG_PARALLEL_PC_OPEN_MP]

  OCFG_PARALLEL_ENV_MPI_BIN = 'SOMA_JOB_MPI_BIN'
  OCFG_PARALLEL_ENV_NODE_FILE = 'SOMA_JOB_NODE_FILE'

  ..
    PARALLEL_JOB_ENV = [OCFG_PARALLEL_ENV_MPI_BIN, OCFG_PARALLEL_ENV_NODE_FILE]
 


    
.. _light_mode_config:

Mono process application on clusters
====================================

This mode is called light mode, it is the installation you need if you are 
not interested in the remote access and disconnection features (see :ref:`client_server_application_concept`)

Requirements
------------

The requirements are identical to the server installation requirements. 
If you do not intend to access an other computing resource with the remote 
access feature, the installation of Pyro can be skipped.

The requirements are thus:

* A distributed resource management system (DRMS) such as Grid Engine, Condor, 
  Torque/PBS, LSF..
* A implementation of `DRMAA <http://www.drmaa.org/>`_ 1.0 for the DRMS in C
* Python *version 2.5 or more*
* `SIP <http://wiki.python.org/moin/SIP>`_ *version 4.10 or more*
* SQLite *version 3 or more*
* For the GUI: Qt *version 4.6.2 or more*, `PyQt <http://www.riverbankcomputing.co.uk/software/pyqt/intro>`_ *version 4.7.2 or more*, or `PySide <http://www.pyside.org>`_ *version 1.1.1* or more, and optionally `matplotlib <http://matplotlib.sourceforge.net/>`_ *version 0.99 or more*

More details about the implementation of DRMAA can be found in the server 
installation section (see :ref:`server_requirements`).


Installation
------------


1. Install the Soma-workflow python modules and generate the SIP Python binding for the DRMAA C library if needed. (see :ref:`sip_drmaa`)

2. Choose a resource identifier for the computing resource, ex: "Local Titan"

3. Create a configuration file (see :ref:`light_mode_configuration`) at the location $HOME/.soma-workflow.cfg. You can also choose your own path for the configuration file and set the "SOMA_WORKFLOW_CONFIG" environment variable with this path or put it in the /etc/ directory.  

Start the GUI with the command *soma_workflow_gui* and/or import the soma.workflow.client module to use the Python API.

.. seealso:: :ref:`gui` for the GUI documentation, :ref:`client_api` for Soma-workflow Python API documentation and :ref:`examples`.

.. _light_mode_configuration:

Configuration
-------------

This section defines the required and optional configuration items for the light 
mode. 

The configuration file syntax is the `ConfigParser 
<http://docs.python.org/library/configparser.html>`_  syntax. All the 
configuration items are defined in one section. The section name corresponds to
the name you chose for the resource: "Local Titan" in the example.

Configuration file example: ::

  [Local Titan]

  LIGHT_MODE    = True

  TRANSFERED_FILE_DIR = path/soma_workflow.db
  DATABASE_FILE       = path/transfered_files
  
..
  If you want to use the same application as a client to other computing resources, 
  make sure that Pyro was installed and add the configuration items required in the 
  configuration file as described here: :ref:`client_configuration`.

  Configuration file example: ::
  
    [Local Titan]

    LIGHT_MODE    = True

    TRANSFERED_FILE_DIR = path/soma_workflow.db
    DATABASE_FILE       = path/transfered_files

    
    [LabDesktops]

    CLUSTER_ADDRESS     = mydesktop.mylab.fr
    SUBMITTING_MACHINES = mydesktop



*Required configuration items:*

  **LIGHT_MODE**
    This item can be set up with any value, however it must be defined to use 
    soma_workflow in the light mode.
  
  **DATABASE_FILE**
    Path of the SQLite database file. The file will be created the first time 
    the application is launch.  

  **TRANSFERED_FILE_DIR**
    Path of the directory where the transfered files will be copied. The
    directory must be empty and will be managed entirely by Soma-workflow.
    This item is required to let you run workflow with file transfer for 
    test purposes in the light mode. 

    .. warning::
      Do not copy any file in this directory. Soma_workflow manages the 
      entire directory and might delete any external file.

*Optional configuration items:*

  Many optional configuration item can be added to customize the installation, see
  :ref:`server_configuration` for a full list of the items and their description.
