
.. _server:

======================================
Installation and configuration: Server 
======================================

This page explains how to configure, install and run the soma-workflow
database server. 

In client-server mode, the processes communicate using 
`Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_. The server is registered on a Pyro
name server. The Workflow Engines query the Pyro name server for the
location of the database server. 

.. _server_requirements:

Requirements
============

.. image:: images/third_parties.*
  :scale: 50

Here is the list of the server dependencies:

* A distributed resource management system (DRMS) such as Grid Engine, Condor, 
  Torque/PBS or LSF
* An implementation of `DRMAA <http://www.drmaa.org/>`_ 1.0 for the DRMS in C 
* Python *version 2.5 or later*
* `SIP <http://wiki.python.org/moin/SIP>`_ *version 4.10 or later*
* `Pyro <http://www.xs4all.nl/~irmen/pyro3/>`_ *version 3.10 or later*
* `SQLite <http://docs.python.org/library/sqlite3.html>`_ *version 3 or later*

The following DRMAA implementations were tested successfully with soma-workflow:
  
  ===================  ==========================
  DRMS                 DRMAA implementation
  ===================  ==========================
  Torque 2.0.0         FedStage PBS DRMAA 1.0*
  LSF 7.0              FedStage LSF DRMAA 1.0.3
  Grid Engine 6.2u5&6  Embedded implementation
  Condor 7.4.0         Embedded implementation
  ===================  ==========================

\* Set the soma-workflow DRMAA_IMPLEMENTATION configuration item to "PBS" when 
using this implementation.
 
.. _server_installation:

Server installation
===================

1. *Install the soma-workflow Python module and compile the SIP module (this step is under construction).*

2. Choose a resource identifier for the computing resource, e.g. "Titan".

3. Create a configuration file (see :ref:`server_configuration`) at ``$HOME/.soma-workflow.cfg``. Alternatively, store the file at a path of your choice and set the ``SOMA_WORKFLOW_CONFIG`` environment variable to that path, or put the file in the ``/etc/`` directory.

4. Start a Pyro name server with the command ``pyro-ns -m``.

5. Run the command ``python -m soma.workflow.start_database_server "Titan"``. The command will:

   * start the database server
   * create the SQLite database if the database file does not exist
   * register the database server with the Pyro name server
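
As an illustration of the configuration file locations mentioned above, the
following sketch looks for the file in each of them. It is not soma-workflow's
own lookup code: the function name ``find_config_file``, the search order and
the file name used under ``/etc/`` are assumptions made for the example.

.. code-block:: python

  import os


  def find_config_file(environ=None, etc_name="soma-workflow.cfg"):
      """Return the first existing candidate path, or None.

      Assumed search order: $SOMA_WORKFLOW_CONFIG, then
      $HOME/.soma-workflow.cfg, then /etc/<etc_name> (hypothetical name).
      """
      environ = os.environ if environ is None else environ
      candidates = []
      if environ.get("SOMA_WORKFLOW_CONFIG"):
          candidates.append(environ["SOMA_WORKFLOW_CONFIG"])
      if environ.get("HOME"):
          candidates.append(
              os.path.join(environ["HOME"], ".soma-workflow.cfg"))
      candidates.append(os.path.join("/etc", etc_name))
      for path in candidates:
          if os.path.isfile(path):
              return path
      return None


  print(find_config_file())

Passing a dictionary as ``environ`` makes it possible to try the lookup
without changing the real environment.
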


.. _server_configuration:

Server configuration
====================

This section defines the required and optional configuration items. 

The configuration file uses the `ConfigParser <http://docs.python.org/library/configparser.html>`_ syntax. All the configuration items are defined in one section. The name of 
the section is the resource identifier (e.g. "Titan").


Configuration file example: ::

  [Titan]

  DATABASE_FILE        = path/soma_workflow.db
  TRANSFERED_FILES_DIR = path/transfered_files
  NAME_SERVER_HOST     = titan0
  SERVER_NAME          = soma_workflow_server_for_titan
  
  # optional limitation of the jobs in various queues
  MAX_JOB_IN_QUEUE = {10} test{50} long{3}

  # optional logging
  SERVER_LOG_FILE   = path/logs/log_server
  SERVER_LOG_FORMAT = %(asctime)s => line %(lineno)s: %(message)s
  SERVER_LOG_LEVEL  = ERROR
  ENGINE_LOG_DIR    = path/logs/
  ENGINE_LOG_FORMAT = %(asctime)s => line %(lineno)s: %(message)s
  ENGINE_LOG_LEVEL  = ERROR

  # remote access information
  CLUSTER_ADDRESS     = titan.mylab.fr
  SUBMITTING_MACHINES = titan0
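
A file such as the example above can be read with Python's standard
``configparser`` module (named ``ConfigParser`` in Python 2). The sketch below
is only an illustration and not the way soma-workflow itself loads its
configuration; ``read_resource_config`` and the ``SAMPLE`` text are
hypothetical names. Interpolation is disabled because the log format items
contain ``%(...)s`` fields that must be kept literally.

.. code-block:: python

  import configparser

  SAMPLE = """
  [Titan]
  DATABASE_FILE        = path/soma_workflow.db
  TRANSFERED_FILES_DIR = path/transfered_files
  NAME_SERVER_HOST     = titan0
  SERVER_NAME          = soma_workflow_server_for_titan
  SERVER_LOG_FORMAT    = %(asctime)s => line %(lineno)s: %(message)s
  """


  def read_resource_config(text, resource_id):
      """Return the items of the section named after the resource identifier."""
      # interpolation=None keeps "%(asctime)s" literal instead of treating
      # it as a ConfigParser substitution.
      parser = configparser.ConfigParser(interpolation=None)
      # Preserve the upper-case item names (keys are lower-cased by default).
      parser.optionxform = str
      parser.read_string(text)
      if not parser.has_section(resource_id):
          raise KeyError("no section for resource %r" % resource_id)
      return dict(parser.items(resource_id))


  config = read_resource_config(SAMPLE, "Titan")
  print(config["NAME_SERVER_HOST"])  # prints "titan0"
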
  


Configuration items required on the server side:
------------------------------------------------
  
  **DATABASE_FILE**
    Path of the SQLite database file. The file is created the first time 
    the database server is started. 

  **TRANSFERED_FILES_DIR**
    Path of the directory where the transferred files will be copied. The
    directory must be empty and will be managed entirely by soma-workflow. 

    .. warning::
      Do not copy any file into this directory. Soma-workflow manages the 
      entire directory and might delete any external file.

  **NAME_SERVER_HOST**
    Host where the Pyro name server runs.
    
  **SERVER_NAME**
    Name under which the database server is registered on the Pyro name server.
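
Before starting the server, it can be useful to check that the four required
items are present. The following is a minimal sketch of such a check, assuming
the configuration has already been loaded into a dictionary; the names
``REQUIRED_SERVER_ITEMS`` and ``missing_server_items`` are hypothetical and do
not belong to soma-workflow.

.. code-block:: python

  # Item names copied from the list of required server-side items.
  REQUIRED_SERVER_ITEMS = (
      "DATABASE_FILE",
      "TRANSFERED_FILES_DIR",
      "NAME_SERVER_HOST",
      "SERVER_NAME",
  )


  def missing_server_items(config):
      """Return the required item names that are absent or empty."""
      return [name for name in REQUIRED_SERVER_ITEMS
              if not str(config.get(name, "")).strip()]


  example = {"DATABASE_FILE": "path/soma_workflow.db",
             "NAME_SERVER_HOST": "titan0"}
  print(missing_server_items(example))
  # prints ['TRANSFERED_FILES_DIR', 'SERVER_NAME']
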

.. _conf_server_option:

Configuration items optional on the server side:
------------------------------------------------

  **MAX_JOB_IN_QUEUE**
    Maximum number of jobs in each queue. If a queue does not appear here, 
    soma-workflow considers that there is no limitation.
    The syntax is "{default_queue_max_nb_jobs} queue_name1{max_nb_jobs_1} queue_name2{max_nb_jobs_2}". Example: "{5} test{20}"

  **PATH_TRANSLATION_FILES**
    Specifies the shared resource path translation files, which are mandatory to use 
    the SharedResourcePath objects (see :ref:`shared-resource-path-concept`). 
    Each translation file is associated with a namespace, so that several 
    applications can use the same identifiers without conflict.
    The syntax is "namespace1{translation_file_path_11} 
    namespace1{translation_file_path_12} namespace2{translation_file_path_2}"

  **DRMAA_IMPLEMENTATION**
    Set this item to "PBS" if you use the FedStage PBS DRMAA 1.0 implementation;
    otherwise it does not have to be set.
    Soma-workflow is designed to be independent of the DRMS and the DRMAA 
    implementation. However, we found two bugs in the FedStage PBS DRMAA 1.0 
    implementation and worked around them temporarily by writing specific code for this 
    implementation in soma-workflow at two locations (soma.workflow.engine Drmaa 
    class: ``__init__`` and ``submit_job`` methods). 
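
Both ``MAX_JOB_IN_QUEUE`` and ``PATH_TRANSLATION_FILES`` use a ``name{value}``
syntax. The following sketch shows one possible way to parse it; the function
names are hypothetical, and soma-workflow's own parser may behave differently,
for example on malformed input.

.. code-block:: python

  import re

  # One entry: an optional name immediately followed by a value in braces.
  _ENTRY = re.compile(r"(?P<name>[^\s{}]*)\{(?P<value>[^{}]*)\}")


  def parse_braced_items(text):
      """Split 'name1{v1} name2{v2}' into a list of (name, value) pairs."""
      return [(m.group("name"), m.group("value").strip())
              for m in _ENTRY.finditer(text)]


  def parse_max_job_in_queue(text):
      """Map each queue name to its limit; '' stands for the default queue."""
      return dict((name, int(value))
                  for name, value in parse_braced_items(text))


  def parse_path_translation_files(text):
      """Map each namespace to the list of its translation file paths."""
      files = {}
      for namespace, path in parse_braced_items(text):
          files.setdefault(namespace, []).append(path)
      return files


  print(parse_max_job_in_queue("{10} test{50} long{3}"))

In this sketch the default queue is stored under the empty string, and a
namespace that appears several times accumulates all of its translation files.
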

Logging configuration:

  **SERVER_LOG_FILE** 
    Server log file path.

  **SERVER_LOG_LEVEL** 
    Server logging level as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.

  **SERVER_LOG_FORMAT** 
    Server logging format as defined in the `logging <http://docs.python.org/library/logging.html>`_ module.

  **ENGINE_LOG_DIR** 
    Path of the directory where the Workflow Engine log files are stored.

  **ENGINE_LOG_LEVEL** 
    Workflow Engine logging level as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.

  **ENGINE_LOG_FORMAT** 
    Workflow Engine logging format as defined in the `logging 
    <http://docs.python.org/library/logging.html>`_ module.
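
The logging items map naturally onto Python's ``logging`` module. The sketch
below shows how the three server items could be applied; it is an assumption
about their use rather than soma-workflow's actual code, and the function and
logger names are hypothetical.

.. code-block:: python

  import logging


  def configure_server_logging(config):
      """Configure a logger from the SERVER_LOG_* items, if a file is set."""
      log_file = config.get("SERVER_LOG_FILE")
      if not log_file:
          return None  # logging is optional
      logger = logging.getLogger("soma_workflow_server_sketch")
      handler = logging.FileHandler(log_file)
      # A missing format falls back to the logging module's default.
      handler.setFormatter(logging.Formatter(config.get("SERVER_LOG_FORMAT")))
      # Level names such as "ERROR" or "DEBUG" are accepted directly.
      logger.setLevel(config.get("SERVER_LOG_LEVEL", "WARNING"))
      logger.addHandler(handler)
      return logger

With ``SERVER_LOG_LEVEL = ERROR``, as in the configuration example, messages
below the ``ERROR`` level are not written to the log file.
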

Parallel job configuration:
---------------------------

The items described here concern the parallel job configuration. A parallel job
uses several CPUs and involves parallel code, for example MPI or OpenMP. 

.. warning:: 
  The documentation is under construction.


..
  OCFG_PARALLEL_COMMAND = "drmaa_native_specification"
  OCFG_PARALLEL_JOB_CATEGORY = "drmaa_job_category"

  .. 
    PARALLEL_DRMAA_ATTRIBUTES = [OCFG_PARALLEL_COMMAND, OCFG_PARALLEL_JOB_CATEGORY]

  OCFG_PARALLEL_PC_MPI="MPI"
  OCFG_PARALLEL_PC_OPEN_MP="OpenMP"

  ..
    PARALLEL_CONFIGURATIONS = [OCFG_PARALLEL_PC_MPI, OCFG_PARALLEL_PC_OPEN_MP]

  OCFG_PARALLEL_ENV_MPI_BIN = 'SOMA_JOB_MPI_BIN'
  OCFG_PARALLEL_ENV_NODE_FILE = 'SOMA_JOB_NODE_FILE'

  ..
    PARALLEL_JOB_ENV = [OCFG_PARALLEL_ENV_MPI_BIN, OCFG_PARALLEL_ENV_NODE_FILE]
 

Configuration items required on the client side:
------------------------------------------------

  **CLUSTER_ADDRESS** 
    Address of the host used to access the cluster remotely, that is the 
    host to log on to.

  **SUBMITTING_MACHINES** 
    Addresses of the submitting hosts, that is the hosts from which the jobs 
    are submitted. In most cases, there is only one submitting host. 
    The addresses are local to the cluster.
    Syntax: "host1 host2 host3"


.. _conf_client_option:

Configuration items optional on the client side:
------------------------------------------------

  **QUEUES**
    List of the available queues. This item is only used in the GUI to make 
    the selection of the queue easier when submitting a workflow.
    Syntax: "queue1 queue2 queue3"
    
