BDMPI - Big Data Message Passing Interface  Release 0.1
Setting up BDMPI

System requirements

BDMPI has been developed to run on Linux systems. It can potentially run on non-Linux systems as long as they support POSIX inter-process communication constructs, but it has not been tested on anything other than Linux. Besides the OS, the following software packages are required to build and use BDMPI:

  1. GCC 4.x or higher (http://gcc.gnu.org).
  2. CMake 2.8 or higher (http://www.cmake.org).
  3. MPICH 3.0.4 or higher (http://www.mpich.org).
  4. Perl 5 or higher (http://www.perl.org).

All of the above are available as installable packages for most Linux distributions. Note that BDMPI has not been tested with OpenMPI, though in principle it should work with it.
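
A quick way to confirm that the prerequisites are on the PATH is a small shell check along the following lines. This is only a sketch and not part of BDMPI: the helper name have_tool is made up for illustration, mpiexec is assumed to be the launcher name provided by the MPICH installation, and the check does not verify version numbers.

```shell
#!/bin/sh
# Report whether each build prerequisite can be found on the PATH.
# have_tool is a hypothetical helper, not something BDMPI provides.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Only presence is checked here; versions still need to be compared
# by hand against the minimums listed above.
for tool in gcc cmake mpiexec perl; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

Any tool reported as missing can usually be installed from the distribution's package repository.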

In terms of hardware, the systems on which BDMPI runs should have enough local storage for BDMPI's temporary files. Note that although BDMPI can use a network-attached file system for temporary storage, doing so may degrade its performance.


System configuration

In order to execute BDMPI programs on a single workstation or a cluster of workstations, the underlying system must be configured to execute MPI jobs. Before trying to run any BDMPI programs, follow the instructions in MPICH's documentation on how to set up the system for running MPI jobs. This usually involves enabling password-less ssh remote process execution and setting up a shared file system.

In addition, the following system configuration parameters need to be modified (the names of the files discussed are valid for at least the Ubuntu 12.04.4 LTS distribution):

  1. Increase the nofile, msgqueue, and memlock process limits in the /etc/security/limits.conf file. Specifically, you should add the following limits:
    *       soft    nofile          1000000
    *       hard    nofile          1000000
    *       soft    msgqueue        unlimited
    *       hard    msgqueue        unlimited
    *       soft    memlock         1048576
    *       hard    memlock         1048576
    
    Note that the limit for the memlock parameter, which is expressed in KB (1048576 KB is 1 GB), can be adjusted up or down based on the available memory in your system. However, you should probably leave at least 1 GB of non-lockable memory.
  2. Increase the default POSIX message queue limits. This is done by modifying the /etc/sysctl.conf file to add/modify the following lines:
    fs.mqueue.msg_default     = 512
    fs.mqueue.msg_max         = 1024
    fs.mqueue.msgsize_default = 256
    fs.mqueue.msgsize_max     = 512
    fs.mqueue.queues_max      = 1024
    
    Besides directly editing the /etc/sysctl.conf file, the above changes can also be applied to the running system by executing the following commands (settings made this way do not persist across a reboot):
    sudo sysctl fs.mqueue.msg_default=512
    sudo sysctl fs.mqueue.msg_max=1024
    sudo sysctl fs.mqueue.msgsize_default=256
    sudo sysctl fs.mqueue.msgsize_max=512
    sudo sysctl fs.mqueue.queues_max=1024
    
    Note that if your system is already configured with higher values for any of the above parameters, you should not change them.
  3. Increase the size of the swap file, as it will be used for storing the data of the slave processes that are blocked. The size of the swap file depends on the size of the jobs that will be run, the extent to which you allow BDMPI to use its own storage-backed memory allocation, and the extent to which your program explicitly manages the size of its memory-resident data (e.g., by relying on explicit out-of-core execution). Further details about these three cases are provided in Execution & memory model.
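
The process limits and message queue settings from items 1 and 2 can be compared against the suggested values with a short script such as the one below. It is a sketch under a few assumptions: the helper names meets_minimum and report are invented for this example, the message queue values are read from /proc/sys/fs/mqueue, and the msgqueue limit is queried with ulimit -q, which some shells do not support (the script then reports it as unknown). The ulimit calls show the soft limits of the shell that runs the script.

```shell
#!/bin/sh
# Compare current limits with the values suggested in the list above.
# meets_minimum and report are hypothetical helpers for this sketch.

# Succeeds when $1 (current value) is at least $2 (wanted value).
# The string "unlimited" is treated as larger than any number.
meets_minimum() {
    if [ "$1" = "unlimited" ]; then
        return 0
    fi
    if [ "$2" = "unlimited" ]; then
        return 1
    fi
    [ "$1" -ge "$2" ]
}

# $1 = label, $2 = current value (may be empty), $3 = wanted value
report() {
    if [ -z "$2" ]; then
        echo "unknown: $1 (could not query)"
    elif meets_minimum "$2" "$3"; then
        echo "ok: $1 = $2"
    else
        echo "too low: $1 = $2 (want $3)"
    fi
}

# Per-process limits (soft values of the current shell).
report nofile   "$(ulimit -n 2>/dev/null)" 1000000
report msgqueue "$(ulimit -q 2>/dev/null)" unlimited
report memlock  "$(ulimit -l 2>/dev/null)" 1048576

# Kernel-wide POSIX message queue settings.
for pair in msg_default:512 msg_max:1024 msgsize_default:256 \
            msgsize_max:512 queues_max:1024; do
    name=${pair%%:*}
    want=${pair##*:}
    file=/proc/sys/fs/mqueue/$name
    if [ -r "$file" ]; then
        report "fs.mqueue.$name" "$(cat "$file")" "$want"
    else
        echo "unknown: fs.mqueue.$name (cannot read $file)"
    fi
done
```

Changes to /etc/security/limits.conf normally apply only to new login sessions, so the script should be run from a fresh login after editing that file.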

Building and installing BDMPI

BDMPI is distributed as a source package, which needs to be compiled and installed on the systems on which it will run. BDMPI uses CMake to generate the various system-specific Makefiles for building it. Instructions on how to use cmake are provided in the BUILD.txt file, which is included verbatim here:

------------------------------------------------------------------------------
Building BDMPI requires CMake 2.8, found at http://www.cmake.org/, as
well as GNU make. Assuming CMake and GNU make are installed, two
commands should suffice to build BDMPI:

     $ make config
     $ make


Configuration
-------------
BDMPI is primarily configured by passing options to make config. For
example:

     $ make config cc=gcc-4.2

would configure BDMPI to be built using GCC 4.2.

Common configuration options are:
  cc=[compiler]   - The C compiler to use [default is determined by CMake]
  prefix=[PATH]   - Set the installation prefix [/usr/local/ by default]

Advanced debugging related options:
  gdb=1       - Build with support for GDB [off by default]
  debug=1     - Enable debugging support [off by default]
  assert=1    - Enable asserts [off by default]



Installation
------------
To install BDMPI, run

    $ make install

The default installation prefix is /usr/local. To pick a different installation
prefix for BDMPI, pass prefix=[path] to make config. For example,

    $ make config prefix=~/local

will cause BDMPI to be installed in ~/local when make install is run.


Other make commands
-------------------
   $ make uninstall 
          Removes all files installed by 'make install'.
   
   $ make clean 
          Removes all object files but retains the configuration options.
   
   $ make distclean 
          Performs clean and completely removes the build directory.

------------------------------------------------------------------------------
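
Putting the BUILD.txt steps together, a build into a user-owned prefix might look like the sketch below. The wrapper name build_bdmpi and its two arguments are illustrative only, and the sketch assumes that the unpacked source tree has a top-level Makefile. The make targets and the cc= and prefix= options are the ones described above.

```shell
#!/bin/sh
# Configure, build, and install BDMPI from an unpacked source tree.
# build_bdmpi is a hypothetical wrapper; the make targets and the
# cc= and prefix= options are the ones described in BUILD.txt.
build_bdmpi() {
    srcdir=$1   # directory containing the BDMPI sources
    prefix=$2   # installation prefix, e.g. a directory under $HOME
    if [ ! -f "$srcdir/Makefile" ]; then
        echo "no Makefile in $srcdir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    # Each step runs only if the previous one succeeded.
    (
        cd "$srcdir" &&
        make config cc=gcc prefix="$prefix" &&
        make &&
        make install
    )
}

# Example call (the source path is a placeholder):
#   build_bdmpi "$HOME/src/bdmpi" "$HOME/local"
```

Installing under a prefix in the home directory avoids the need for root privileges during make install.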