Use Different MPI Builds on UNIX Systems

Build MPI

On Linux® operating systems, you can use an MPI build that differs from the one provided with Parallel Computing Toolbox™. This topic outlines the steps for creating an MPI build for use with the generic scheduler interface. If you already have an alternative MPI build, proceed to Use Your MPI Build.

  1. Unpack the MPI sources into the target file system on your machine. For example, suppose you have downloaded mpich2-distro.tgz and want to unpack it into /opt for building:

    # cd /opt
    # mkdir mpich2 && cd mpich2
    # tar zxvf path/to/mpich2-distro.tgz
    # cd mpich2-1.4.1p1

  2. Build your MPI with the --enable-shared option. This is vital: you must build a shared-library MPI that is binary compatible with MPICH2-1.4.1p1 for R2013b to R2018b, or with MPICH3.2.1 for R2019a and later. For example, the following commands build an MPI with the nemesis channel device and the gforker launcher.

    # ./configure --prefix=/opt/mpich2/mpich2-1.4.1p1 \
     --enable-shared --with-device=ch3:nemesis \
     --with-pm=gforker 2>&1 | tee log
    # make 2>&1 | tee -a log
    # make install 2>&1 | tee -a log
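
After make install finishes, it can help to confirm that the install prefix contains the two files the later steps rely on, bin/mpiexec and lib/libmpich.so. The following is a minimal sketch and not part of the MPICH or Parallel Computing Toolbox tooling. The function name check_mpi_build is hypothetical, and the prefix in the usage comment is the one assumed in the build example above.

```shell
#!/bin/sh
# check_mpi_build PREFIX  (hypothetical helper, for illustration only)
# Returns 0 if PREFIX contains an executable bin/mpiexec and a
# lib/libmpich.so. Otherwise reports what is missing on stderr and returns 1.
check_mpi_build() {
    prefix=$1
    status=0
    if [ ! -x "$prefix/bin/mpiexec" ]; then
        echo "missing or not executable: $prefix/bin/mpiexec" >&2
        status=1
    fi
    if [ ! -e "$prefix/lib/libmpich.so" ]; then
        echo "missing: $prefix/lib/libmpich.so" >&2
        status=1
    fi
    return $status
}

# Example use (adjust the prefix to your own install location):
#   check_mpi_build /opt/mpich2/mpich2-1.4.1p1 && echo "build looks complete"
```

If the library is reported missing, the most likely cause is that the build was configured without --enable-shared.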

Use Your MPI Build

When your MPI build is ready, follow these steps to get your cluster working with it through a generic scheduler.

  1. Test your build by running the mpiexec executable. The build should be ready to test if its bin/mpiexec and lib/libmpich.so are available in the MPI installation location.

    Following the example in Build MPI, /opt/mpich2/mpich2-1.4.1p1/bin/mpiexec and /opt/mpich2/mpich2-1.4.1p1/lib/libmpich.so are ready to use, so you can test the build with:

    $ /opt/mpich2/mpich2-1.4.1p1/bin/mpiexec -n 4 hostname

  2. Create an mpiLibConf (Parallel Computing Toolbox) function to direct Parallel Computing Toolbox to use your new MPI. Write your mpiLibConf.m to return the appropriate information for your build. For example:

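    function [primary, extras] = mpiLibConf
    % Return the MPI library that Parallel Computing Toolbox should load.
    primary = '/opt/mpich2/mpich2-1.4.1p1/lib/libmpich.so';  % shared library from the build example
    extras  = {};  % no additional libraries listed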

    The primary path must be valid on the cluster, and your mpiLibConf.m file must be higher on the cluster workers’ path than matlabroot/toolbox/parallel/mpi. Sending mpiLibConf.m as an attached file for this purpose does not work. Instead, get the mpiLibConf.m function onto the worker path either by moving the file into a folder on the path, or by having the scheduler use cd in its command so that it starts the MATLAB® worker from within the folder that contains the function.

  3. Determine necessary daemons and command-line options.

    • Determine all necessary daemons (often something like mpdboot or smpd). The gforker build example in this section uses an MPI that needs no services or daemons running on the cluster, but it can use only the local machine.

    • Determine the correct command-line options to pass to mpiexec.

  4. To set up your cluster to use your new MPI build, modify your communicating job wrapper script to pick up the correct mpiexec. Additionally, there might be a stage in the wrapper script where the MPI process manager daemons are launched.

    The communicating job wrapper script must:

    • Determine which nodes are allocated by the scheduler.

    • Start required daemon processes. For example, for the MPD process manager this means calling "mpdboot -f <nodefile>".

    • Define which mpiexec executable to use for starting workers.

    • Stop the daemon processes. For example, for the MPD process manager this means calling "mpdallexit".

    For examples of communicating job wrapper scripts, see Sample Plugin Scripts (Parallel Computing Toolbox).
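
The four wrapper responsibilities listed above can be sketched as a small shell script. This is an illustrative outline only, assuming an MPD-based MPI build (the gforker example build needs no daemons) and a node file with one host name per line. The function names run, count_nodes, and run_job, the DRY_RUN variable, and the argument layout are all hypothetical. Real wrapper scripts take their inputs from the scheduler and from the sample plugin scripts, and your MPI may need additional mpdboot or mpiexec options.

```shell
#!/bin/sh
# Sketch of a communicating job wrapper for an MPD-based MPI build.
# All names below are hypothetical and chosen for illustration.

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# count_nodes NODEFILE: print the number of non-empty lines in NODEFILE.
count_nodes() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# run_job MPI_PREFIX NODEFILE WORKER_CMD...
run_job() {
    mpi_prefix=$1
    nodefile=$2
    shift 2

    # 1. Determine which nodes the scheduler allocated (assumed to be
    #    listed one per line in NODEFILE).
    nprocs=$(count_nodes "$nodefile")

    # 2. Start the required daemon processes.
    run "$mpi_prefix/bin/mpdboot" -f "$nodefile" || return 1

    # 3. Use the mpiexec from the chosen MPI build to start the workers.
    job_status=0
    run "$mpi_prefix/bin/mpiexec" -n "$nprocs" "$@" || job_status=$?

    # 4. Stop the daemon processes, even if the job failed.
    run "$mpi_prefix/bin/mpdallexit"

    return $job_status
}

# Example (dry run, prints the commands instead of executing them):
#   DRY_RUN=1; run_job /opt/mpich2/mpich2-1.4.1p1 /path/to/nodefile hostname
```

Taking the MPI prefix as an argument keeps the choice of mpiexec in one place, so switching to another build means changing a single path.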