Matlab - Hodgkin cluster tutorial

This is a tutorial I wrote when I started to run things on the cluster in collaboration with Dillon Hambrook; it might be useful for someone else. It is based on Configuring MATLAB to Use MClust on Hodgkin, written by Karim Ali (March 2011).

Requirements

To use MATLAB to submit jobs to Hodgkin, you must have the Parallel Computing Toolbox. If you have MATLAB R2010b or later, the configuration is greatly simplified compared to older versions. This document does not cover how to set up older versions of MATLAB.

Remote Submission functions

Hodgkin currently uses LSF for scheduling, so to submit jobs to the cluster MATLAB relies on a set of functions that communicate with LSF. They live in

MATLABROOT/toolbox/distcomp/examples/integration/lsf/remoteSubmission

So you need to either copy these files to a local directory (e.g. MATLABROOT/toolbox/local) or add this directory to the MATLAB path (e.g. via File -> Set Path in MATLAB).
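The second option can also be done from the command line. The following is a minimal sketch, assuming a default installation layout in which the directory above exists under matlabroot; savepath may need write permission to pathdef.m.

```matlab
% Build the path to the LSF remote-submission functions shipped with MATLAB.
lsfDir = fullfile(matlabroot, 'toolbox', 'distcomp', 'examples', ...
                  'integration', 'lsf', 'remoteSubmission');

if exist(lsfDir, 'dir')
    addpath(lsfDir);   % make the submit functions visible in this session
    savepath;          % keep the change for future sessions
else
    fprintf('Directory not found: %s\n', lsfDir);
end
```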

Setting up Cluster Storage

  1. Go to \\huxley.resrch.uleth.ca
  2. Open the directory with your user name
  3. Create a folder in this directory called 'jobs'
  4. Copy \\huxley.resrch.uleth.ca\workspace\Linux\Matlab\ to \\huxley.resrch.uleth.ca\username\

Jobs running on the cluster use the scripts in this Matlab directory, so any script that needs editing must be edited there.
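If you prefer to do steps 3 and 4 from within MATLAB, something like the following should work. It is only a sketch: it assumes a Windows client that can already reach the share, and 'username' is a placeholder for your own user name.

```matlab
userName  = 'username';   % placeholder: replace with your own user name
shareRoot = ['\\huxley.resrch.uleth.ca\' userName];

% Step 3: create the 'jobs' folder in your directory if it is not there yet.
jobsDir = [shareRoot '\jobs'];
if ~exist(jobsDir, 'dir')
    mkdir(jobsDir);
end

% Step 4: copy the shared Matlab directory into your own directory.
srcDir = '\\huxley.resrch.uleth.ca\workspace\Linux\Matlab';
dstDir = [shareRoot '\Matlab'];
copyfile(srcDir, dstDir);
```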

Figure 1: Configurations manager

To set up your configuration for parallel jobs, open the configurations manager (figure 1): Parallel -> Manage Configurations. Then create a new configuration (in this case, Hodgkin Cluster) with the parameters shown in the following figures:

Figure 2: Scheduler configuration

It is important to check that the fields in the red square match your settings (for example, the current clusterMatlabRoot can change: mdcs5.1, mdcs5.2, etc., depending on the current version of MATLAB). Also make sure to put your own username in the path where the job data will be stored.

Once your settings are correct, you should be able to validate your configuration successfully (figure 1). You should then be asked for your authentication details: your username and password for the cluster.
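Once the configuration validates, it can be handy to double-check it from the command line. The sketch below assumes the configuration is named 'Hodgkin Cluster' and that it uses the generic scheduler interface, whose scheduler object exposes ClusterMatlabRoot and DataLocation properties; if your setup differs, the property names may differ too.

```matlab
configName = 'Hodgkin Cluster';   % must match the name used in the manager

% Make this configuration the default, so that calls which do not name a
% configuration (such as createJob with no arguments) use it.
defaultParallelConfig(configName);

% Look up the scheduler and display two of the settings from figure 2.
sch = findResource('scheduler', 'configuration', configName);
disp(get(sch, 'ClusterMatlabRoot'));
disp(get(sch, 'DataLocation'));
```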

Submit a simple job

There are different ways to submit jobs in MATLAB; here I describe a simple one:

job = createJob()
task = createTask(job, @rand, 1, {10});
submit(job);

where job is the job object, @rand is a handle to the function the task will run, the 1 is the number of output arguments to collect from it, and {10} is a cell array holding the input arguments to rand (so the task evaluates rand(10)).

You can check the status of your job like this:
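To make the role of the third and fourth arguments of createTask concrete, here is a small sketch that asks for two outputs instead of one. It assumes your default configuration points at the cluster, and it uses waitForState to block until the job is done.

```matlab
job  = createJob();

% size(zeros(3,4)) called with two outputs returns the rows and the columns,
% so we request 2 output arguments and pass the matrix as the only input.
task = createTask(job, @size, 2, {zeros(3, 4)});

submit(job);
waitForState(job, 'finished');

dims = get(task, 'OutputArguments');   % 1-by-2 cell array, should hold {3, 4}
destroy(job);
```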

job.State

or by just typing job on the command line. You should be able to see whether your job is pending, running or finished. Once the job has finished, you can see its output with:

output = job.getAllOutputArguments

or

output = get(task, 'OutputArguments')
disp(output{1})

Finally, you should destroy the finished jobs:

destroy(job)
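Putting these steps together, a complete submit, wait, collect and clean-up cycle might look like the sketch below. It assumes the configuration is named 'Hodgkin Cluster'; the check on the task's ErrorMessage property is an addition of mine, since a task whose function errored can still end up in the 'finished' state.

```matlab
job  = createJob('configuration', 'Hodgkin Cluster');
task = createTask(job, @rand, 1, {10});
submit(job);

% Block until the job reaches the 'finished' state.
waitForState(job, 'finished');

% Check for a task error before trusting the output.
errMsg = get(task, 'ErrorMessage');
if isempty(errMsg)
    output = getAllOutputArguments(job);
    disp(size(output{1}));     % rand(10) should give a 10-by-10 matrix
else
    fprintf('Task failed: %s\n', errMsg);
end

destroy(job);   % remove the job and its data from the job storage
```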

Note: sometimes it is useful to tell the scheduler to destroy all the jobs it holds. To get hold of your scheduler and do so:

sch = findResource('scheduler', 'configuration', 'config name')
destroy(sch.Jobs)

where 'config name' is the name of your scheduler configuration (see figure 1), in this case 'Hodgkin Cluster'.
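Destroying everything also removes jobs that are still queued or running. If you only want to clean up completed jobs, a sketch along these lines may be safer; it relies on findJob returning the jobs grouped by state, and assumes the configuration is named 'Hodgkin Cluster'.

```matlab
sch = findResource('scheduler', 'configuration', 'Hodgkin Cluster');

% findJob returns the jobs grouped by state, in this order. The last group
% holds jobs that have completed, including ones that failed.
[pending, queued, running, completed] = findJob(sch);
fprintf('%d pending, %d queued, %d running, %d completed\n', ...
        numel(pending), numel(queued), numel(running), numel(completed));

% Destroy only the completed jobs, leaving the others alone.
if ~isempty(completed)
    destroy(completed);
end
```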

Submit parallel jobs to the cluster

There are two ways to run parallel computations on the cluster, and they operate differently; both, however, require the user to specify how many parallel workers the job will use.

Matlabpool Jobs

The most basic way to run a job in parallel is to use the matlabpool and parfor functionality.

To run a simple function that benefits from running in parallel but does not require communication between workers, use a matlabpool job. This is the only type of job which will benefit from the use of parfor loops.

The following function calculates the cross-correlation between two vectors 10000 times, first in parallel and then serially.

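function [times]=simpleParallelFunction()
% Time 10000 cross-correlations in a parfor loop and in a plain for loop.
% times(1) is the parallel time, times(2) the serial time, in seconds.
% Note: xcorr requires the Signal Processing Toolbox.
iterations=10000;
r1=rand(1,1000);
r2=rand(1,1000);

% parallel version: iterations are distributed over the matlabpool workers
tic;
parfor k=1:iterations
    xcorr(r1,r2);
end
times(1)=toc;

% serial version: the same work in an ordinary loop
tic;
for k=1:iterations
    xcorr(r1,r2);
end
times(2)=toc;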
	
To run this function, we create a matlabpool job with 20 workers and run it on the cluster:

mJob=createMatlabPoolJob('configuration','Hodgkin Cluster');
createTask(mJob,@simpleParallelFunction,1,{});
set(mJob,'MaximumNumberOfWorkers',20);
set(mJob,'MinimumNumberOfWorkers',20);
submit(mJob);
waitForState(mJob,'finished')
resultsM=getAllOutputArguments(mJob);
save('mJob.mat','resultsM');
	
This code submits the job to the cluster, waits until the job is finished, then gets the results and saves them to a file in the current directory. You should find that the cross-correlations ran much faster in parallel than serially.
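To see how large the difference actually is, you can load the saved file and compare the two timings. This is a small sketch that assumes mJob.mat was written by the code above and that the single task returned the two-element times vector.

```matlab
% Load the results saved by the submission code.
s = load('mJob.mat', 'resultsM');
times = s.resultsM{1};            % [parallel time, serial time] in seconds

speedup = times(2) / times(1);    % greater than 1 means parfor was faster
fprintf('parfor: %.2f s, for: %.2f s, speedup: %.1fx\n', ...
        times(1), times(2), speedup);
```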

Parallel Jobs

Parallel jobs are jobs in which the task is run once on each lab available to the job. Labs running a parallel job can communicate with each other, which gives the user more control over the execution of the code than a matlabpool job does.

The following code creates a parallel job with 10 labs and instructs each lab to generate a 3-by-1 vector of random numbers:

jm=findResource('scheduler','configuration','Hodgkin Cluster');
pJob=createParallelJob(jm);
createTask(pJob,@rand,1,{[3,1]}); % call a function with parameters
% set the number of workers for this job
set(pJob,'MaximumNumberOfWorkers',10);
set(pJob,'MinimumNumberOfWorkers',10);

submit(pJob);
waitForState(pJob,'finished')
resultsP=getAllOutputArguments(pJob);
save('parJob.mat','resultsP'); % save the results
	
This code waits until the job has finished and then saves the output to a file in the current directory. For more information on communicating between labs in a parallel job, see the section of the MATLAB help titled Program Communicating Jobs.
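As a minimal illustration of communication between labs, the sketch below uses the built-in gplus function, which adds up a value contributed by every lab and returns the total to each of them. It assumes the same 'Hodgkin Cluster' configuration as above; the variable names cJob and resultsC are just for this example.

```matlab
jm   = findResource('scheduler', 'configuration', 'Hodgkin Cluster');
cJob = createParallelJob(jm);

% Every lab calls gplus(1); gplus sums the values from all labs and returns
% the total to each of them, so each lab should report the number of labs.
createTask(cJob, @gplus, 1, {1});
set(cJob, 'MaximumNumberOfWorkers', 10);
set(cJob, 'MinimumNumberOfWorkers', 10);

submit(cJob);
waitForState(cJob, 'finished');
resultsC = getAllOutputArguments(cJob);   % one row per lab
disp(resultsC);
destroy(cJob);
```

With 10 labs, each entry of resultsC should be 10, which is a quick way to confirm that the labs really did talk to each other.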