How to use MATLAB Parallel Toolbox¶
One of the features users may want to use on Jubail is MATLAB's Parallel Computing Toolbox.
The Parallel Computing Toolbox allows you to run parallel computations on multicore systems, either through parallel for-loops or GPU computing. For more information, see the MATLAB documentation (Get Started with Parallel Computing Toolbox).
For computer cluster and grid support you would need the MATLAB Distributed Computing Server, which is not currently supported at NYUAD.
In any case, for parallel execution to work you will need to define a set of resources (CPUs, or workers) on which the tasks will be executed. The easiest way is the parpool command, which replaced the older matlabpool command in recent MATLAB versions; again, refer to the MATLAB documentation for details.
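As a quick illustration (the worker count and loop below are arbitrary, not part of any specific job), the sketch opens a pool with parpool, runs a trivial parfor loop and releases the workers:
% Minimal sketch: open a pool of 4 workers, run a parfor loop, then close the pool
pool = parpool(4);        % parpool replaces the older matlabpool command
parfor i = 1:8
    fprintf('Iteration %d ran on a worker.\n', i);
end
delete(pool);             % release the workers when done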
Running a Serial MATLAB Job¶
A serial MATLAB job is one that requires only a single CPU-core. Here is an example of a trivial,
one-line serial MATLAB script (hello_world.m
):
fprintf('Hello world.\n')
The Slurm script (job.slurm) below can be used for serial jobs:
#!/bin/bash
#SBATCH --job-name=matlab # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:01:00 # total run time limit (HH:MM:SS)
#SBATCH --mail-type=all # send email on job start, end and fault
#SBATCH --mail-user=<YourNetID>@nyu.edu
#Load matlab module
module purge
module load matlab
#Run Matlab via command line
matlab -nojvm -singleCompThread -batch hello_world
By invoking MATLAB with -nojvm -singleCompThread -batch, the GUI and the background Java processes are suppressed, as is the creation of multiple computational threads. (The -batch option already implies -nodisplay and -nosplash, so those flags are not needed here.)
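If you want to confirm the single-threaded behaviour from inside MATLAB, a quick check (illustrative only) is to print the number of computational threads, which should report 1 when -singleCompThread is used:
% Illustrative check: with -singleCompThread this prints 1
fprintf('Computational threads: %d\n', maxNumCompThreads);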
To run the MATLAB script, simply submit the job to the scheduler with the following command:
sbatch job.slurm
Note
The MATLAB GUI can be launched from the Interactive Apps
section of our HPC Web Interface,
https://ood.hpc.abudhabi.nyu.edu (VPN required). More info on the HPC Web Interface can be found here.
The GUI is recommended for testing and debugging. For actual production runs, command-line/batch mode is
recommended because it avoids the overhead of the GUI and the Java processes running in the background.
Running a Multi-threaded MATLAB Job with the Parallel Computing Toolbox¶
Most of the time, running MATLAB in single-threaded mode (as described above) will meet your needs.
However, if your code makes use of the Parallel Computing Toolbox (e.g., parfor
) or you have intense
computations that can benefit from the built-in multi-threading provided by MATLAB’s BLAS implementation,
then you can run in multi-threaded mode.
One can use up to all the CPU-cores on a single node in this mode.
Multi-node jobs are not possible with the version of MATLAB that we have, so your Slurm script should always use #SBATCH --nodes=1. Here is an example from MathWorks of using multiple cores (for_loop.m):
poolobj = parpool;
fprintf('Number of workers: %g\n', poolobj.NumWorkers);
tic
n = 200;
A = 500;
a = zeros(n);
parfor i = 1:n
a(i) = max(abs(eig(rand(A))));
end
toc
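The MathWorks example above leaves the pool open when the script ends. If your script continues with serial work afterwards, you may want to release the workers explicitly; a minimal sketch:
% Optional clean-up: close the current pool, if one exists
delete(gcp('nocreate'));   % gcp('nocreate') returns the current pool, or [] if there is none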
The Slurm script (job.slurm
) below can be used for this case:
#!/bin/bash
#SBATCH --job-name=parfor # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=4 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:00:30 # total run time limit (HH:MM:SS)
#SBATCH --mail-type=all # send email on job start, end and fault
#SBATCH --mail-user=<YourNetID>@nyu.edu
#Load Matlab
module purge
module load matlab
#Run the matlab script
matlab -batch for_loop
Note that -singleCompThread and -nojvm do not appear in the Slurm script, in contrast to the serial case.
One must tune the value of --cpus-per-task for optimum performance: use the smallest value of --cpus-per-task that gives you a significant performance boost, because the more resources you request, the longer your queue time will be.
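One simple way to find a reasonable value is a small scaling test: time the same workload with increasing pool sizes and stop where the gain levels off. A sketch of such a test (the workload here is only a stand-in for your own code):
% Scaling sketch: time the same parfor workload with different pool sizes
for w = [1 2 4 8]
    pool = parpool(w);
    tic
    parfor i = 1:100
        x = max(abs(eig(rand(300))));   % stand-in workload
    end
    fprintf('%d workers: %.2f s\n', w, toc);
    delete(pool);
end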
Note
The number of MATLAB workers will equal the value of --cpus-per-task, up to a maximum of 12 workers.
Overriding the 12 core limit¶
By default, MATLAB restricts the parallel pool to 12 workers. You can override this when creating the pool; for example, to request 24 workers:
poolobj = parpool('local', 24);
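Rather than hard-coding the worker count, you can size the pool from the Slurm allocation itself. A sketch, assuming the job was submitted with --cpus-per-task set (Slurm exports this as SLURM_CPUS_PER_TASK inside the job):
% Sketch: match the pool size to the Slurm allocation
ncpus = str2double(getenv('SLURM_CPUS_PER_TASK'));
poolobj = parpool('local', ncpus);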
If you use more than one worker, make sure that your code can actually take advantage of all the requested CPU-cores. The amount of time a job waits in the queue grows with the requested resources, and your fairshare value is decreased in proportion to the resources you request.
Tip
The more MATLAB workers you use, the greater the parallel overhead and the smaller the speedup per worker. If your MATLAB code consists of independent computations, Job arrays and Parallel Job Arrays are among the easiest and most efficient ways to parallelize your work. Follow the corresponding highlighted links for more detailed examples, and contact us if you need any further help with this.
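As a rough sketch of the array-job pattern, each array task runs an ordinary serial MATLAB script and picks its own piece of work from the Slurm array index (the computation below is only a placeholder for your own):
% Array-job sketch: each task handles one independent computation
idx = str2double(getenv('SLURM_ARRAY_TASK_ID'));   % set by Slurm for array jobs
rng(idx);                                          % make each task's computation independent
result = max(abs(eig(rand(500))));                 % placeholder for your own computation
save(sprintf('result_%03d.mat', idx), 'result');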
How Do I Know If My MATLAB Code is Parallelized?¶
A parfor
statement is a clear indication of a parallelized MATLAB code. However,
there are cases when the parallelization is not obvious. One example would be a code that uses
linear algebra operations such as matrix multiplication. In this case MATLAB will use the BLAS library
which offers multithreaded routines.
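For example, the following code contains no parfor, yet the matrix multiplication is multithreaded by the underlying BLAS library (illustrative only; the matrix sizes are arbitrary):
% No parfor here, but the multiply below uses multiple threads via BLAS
A = rand(4000);
B = rand(4000);
tic
C = A * B;
fprintf('Matrix multiply took %.2f s\n', toc);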
There are two common ways to determine whether or not a MATLAB code can take advantage of parallelism without knowing anything about the code.
The first is to run the code using 1 CPU-core and then do a second run using, say, 4 CPU-cores, and see whether there is a significant difference in the execution time of the two runs.
The second method is to launch the job using, say, 4 CPU-cores, then ssh to the compute node where the job is running and use htop -u $USER to inspect the CPU usage. To get the name of the compute node where your job is running, use the following command:
squeue
The rightmost column labeled NODELIST(REASON)
gives the name of the node where your job is running.
SSH to this node, for example:
ssh dn034
Once on the compute node, run the following command:
htop -u $USER
If your job is running in parallel you should see a process using much more than 100%
in the %CPU
column. For 4 CPU-cores this number would ideally be 400%.
Running MATLAB on GPUs¶
Many routines in MATLAB have been written to run on a GPU. Below is a MATLAB script (svd_matlab.m) that performs a matrix decomposition using a GPU:
gpu = gpuDevice();
fprintf('Using a %s GPU.\n', gpu.Name);
disp(gpuDevice);
X = gpuArray([1 0 2; -1 5 0; 0 3 -9]);
whos X;
[U,S,V] = svd(X)
fprintf('trace(S): %f\n', trace(S))
quit;
The Slurm script (job.slurm
) below can be used for this case:
#!/bin/bash
#SBATCH --job-name=matlab-svd # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:01:00 # total run time limit (HH:MM:SS)
#SBATCH -p nvidia # Request nvidia partition for GPU nodes
#SBATCH --gres=gpu:1 # number of gpus per node
#SBATCH --mail-type=begin # send email when job begins
#SBATCH --mail-type=end # send email when job ends
#SBATCH --mail-user=<NetID>@nyu.edu
#Load Matlab Module
module purge
module load matlab
#Run your matlab script
matlab -nojvm -singleCompThread -batch svd_matlab
In the above Slurm script, notice the new lines: #SBATCH -p nvidia and #SBATCH --gres=gpu:1.
The job can be submitted to the scheduler with:
sbatch job.slurm
Be sure that your MATLAB code is able to use a GPU before submitting your job. See this Getting started guide on MATLAB and GPUs.
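A quick way to check from within MATLAB is to query the GPU before doing any real work; a minimal sketch:
% Minimal GPU check sketch
if gpuDeviceCount > 0
    disp(gpuDevice);                       % print the name and properties of the GPU
else
    error('No GPU detected on this node.');
end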