
Running MATLAB on HPC Clusters

A tutorial on running serial and parallel MATLAB jobs on CPUs and GPUs

MATLAB is a high-level programming language and numerical computing environment developed by MathWorks. It is used for numerical computation, visualization, and programming.

MATLAB is available to all CCAST users who are affiliated with North Dakota State University. There are multiple MATLAB versions installed on CCAST systems. On Thunder, check all available software modules by typing:
$ module avail

NOTE: Terminal commands are denoted by inline code prefixed with $, such as $ module avail in the above example. Variable inputs are denoted by capital letters in brackets, e.g., [JOB ID].
  1. Running serial jobs on CPUs
  2. Running parallel jobs on CPUs
    1. Implicit parallelism
    2. Explicit parallelism
  3. Running MATLAB on GPUs
  4. Miscellaneous: Q&A
There are two ways to run MATLAB on Thunder: in batch mode or via Open OnDemand. The latter allows you to run MATLAB interactively and graphically; see Open OnDemand - How to Access for instructions. This document covers only running serial and parallel MATLAB jobs on CPUs and GPUs in batch mode.

A MATLAB job needs the MATLAB script(s) that you intend to run and a job submission script that submits them to the job scheduler (PBS Pro on Thunder). See the CCAST User Guide for more information on running jobs on CCAST systems in general.

Example files

All the source code and job submission scripts discussed in this document can be found in the following compressed file: /gpfs1/projects/ccastest/examples/MATLAB_Tutorial_examples.tar.gz (on Thunder).
 
To copy the examples to your SCRATCH directory (/gpfs1/scratch/[USERNAME]):
$ cp /gpfs1/projects/ccastest/training/examples/MATLAB_Tutorial_examples.tar.gz $SCRATCH

To extract the .tar.gz file:
$ cd $SCRATCH
$ tar -xvf MATLAB_Tutorial_examples.tar.gz

In the following, we examine a few MATLAB examples more specifically:

1. Running serial jobs on CPUs


Example 1: “Hello World !”

In this simple example, the system prints the phrase “Hello World !” in the output file.

The MATLAB script: HelloWorld.m

% display information
fprintf('Hello World !\n\n');

The job submission script: matlab_job.pbs

#!/bin/bash
#PBS -q default
#PBS -N matlab_test
##serial jobs: only 1 processor core is requested
#PBS -l select=1:mem=2gb:ncpus=1
#PBS -l walltime=00:10:00
##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"
#PBS -W group_list=x-ccast-prj

module load matlab/R2020a

cd $PBS_O_WORKDIR

##change the input filename as needed
matlab -nodesktop -nodisplay -r "run HelloWorld.m"

exit 0

On Thunder, you need to open this file using a UNIX/Linux text editor and edit the line “#PBS -W” to be sure that your project group name is correct. If you do not remember your project group name, execute the command “id” or “groups” when you are on Thunder.
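
The edit described above can also be scripted. The following is a minimal sketch, assuming GNU sed; the function name set_group_list and the group name x-ccast-prj-mylab are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the "#PBS -W group_list=..." line of a job script.
# Usage: set_group_list JOB_SCRIPT GROUP_NAME
set_group_list() {
    local script="$1" group="$2"
    # Replace whatever follows "group_list=" on the #PBS -W line (GNU sed, in place).
    sed -i "s/^#PBS -W group_list=.*/#PBS -W group_list=${group}/" "$script"
}

# Demonstration on a throwaway file, so no real job script is modified.
demo_dir=$(mktemp -d)
printf '%s\n' '#!/bin/bash' '#PBS -q default' '#PBS -W group_list=x-ccast-prj' \
    > "$demo_dir/matlab_job.pbs"
set_group_list "$demo_dir/matlab_job.pbs" "x-ccast-prj-mylab"   # made-up group name
grep '^#PBS -W' "$demo_dir/matlab_job.pbs"
```

On Thunder, the group name passed to the helper would be one of the names printed by “id” or “groups”.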

To submit the job:
$ qsub matlab_job.pbs

To check the status of the job (It may show nothing if the job has completed):
$ qstat -u $USER

To view the error file:
$ cat matlab_test.e[JOB ID]

To view the output file:
$ cat matlab_test.o[JOB ID]

The expected output is:

                            < M A T L A B (R) >
                  Copyright 1984-2020 The MathWorks, Inc.
              R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)
                                May 27, 2020
To get started, type doc.
For product information, visit www.mathworks.com.

Hello World !
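
The [JOB ID] in the file names above is the numeric part of the identifier that qsub prints at submission, which in PBS Pro has the form sequence_number.server_name. A minimal sketch of capturing it in a shell session follows; the helper name job_number and the server name pbsserver are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: strip the server suffix from a PBS job identifier,
# e.g. "123456.pbsserver" -> "123456" (the server name here is made up).
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Typical use on the cluster (commented out so the sketch runs anywhere):
#   job_id=$(qsub matlab_job.pbs)
#   cat "matlab_test.o$(job_number "$job_id")"

job_number "123456.pbsserver"
```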

Example 2: “Summing the elements of a vector”

In this example, the job creates a vector outside of a loop, then sums its elements one at a time by indexing into the vector inside the loop. The sum is printed to the output file.

The MATLAB script: sum.m

z = 0;
xAll = 0:0.1:10000;              % create a vector outside of a loop
for i= 1:length(xAll)
    x = xAll(i);
    z = z + x;
end
fprintf('%f\n', z);
% Copyright 2010 - 2014 The MathWorks, Inc.

The job submission script: matlab_job.pbs

#!/bin/bash
#PBS -q default
#PBS -N matlab_test
##serial jobs: only 1 processor core is requested
#PBS -l select=1:mem=2gb:ncpus=1
#PBS -l walltime=00:10:00
##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"
#PBS -W group_list=x-ccast-prj

module load matlab/R2020a

cd $PBS_O_WORKDIR

##change the input filename as needed
matlab -nodesktop -nodisplay -r "run sum.m"

exit 0

The above script requests one processor core on a single node, with 2 GB of RAM, for at most 10 minutes. After editing the job submission script as needed (the line “#PBS -W”), submit the job:
$ qsub matlab_job.pbs

To check the status of the job (It may show nothing if the job has completed):
$ qstat -u $USER 

To view the error file:
$ cat matlab_test.e[JOB ID]

To view the output file:
$ cat matlab_test.o[JOB ID]

The expected output is:

                            < M A T L A B (R) >
                  Copyright 1984-2020 The MathWorks, Inc.
              R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)
                                May 27, 2020
To get started, type doc.
For product information, visit www.mathworks.com.
500005000.000000

2. Running parallel jobs on CPUs

MATLAB supports two kinds of parallelism: implicit (multithreaded) parallelism and explicit parallelism.

2.1 Implicit parallelism


MATLAB contains internal mechanisms that enable some code to run much faster by automatically parallelizing arithmetic and logical operations on data. This is called implicit parallelism (or built-in multithreading) since we do not need to explicitly tell MATLAB to parallelize the operations. Implicit parallelization relies on the fact that many operations are independent of each other and can therefore be processed in parallel.

For implicit parallel computation, a vector operation is a necessary trigger, but it is not sufficient. The application or algorithm and the amount of computation also help MATLAB determine whether an operation will be run with multiple threads. See the MathWorks documentation for more information on the conditions for implicit parallelism. Such parallelism is implemented by multithreaded computations that are executed within a single node. Therefore, on Thunder, you should select only one node when relying on implicit parallelism.

Example 3: “Matrix Multiplication” 

This job multiplies one matrix by another, which triggers implicit parallelism. You can observe this by logging in to the compute node and checking the CPU usage. On Thunder, to find the name of the compute node, execute:
$ qstat -f [JOB ID]

To log in to the compute node (this is only possible while your job is still running on that node):
$ ssh node[NODE ID]

For example, if your job is running on node0001, use the command:
$ ssh node0001

Then execute the command “top”:
$ top

This will list the utilization percentage of CPU in the compute node. Then, you can tell if your job is running in parallel or not.  
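
Finding the node name in the long output of qstat -f can be automated. The sketch below assumes the exec_host line has the usual PBS Pro shape (node/slot entries joined by +) and that it is not wrapped across lines; the helper name exec_nodes and the sample text are illustrative only.

```shell
#!/bin/bash
# Hypothetical helper: read "qstat -f" text on stdin and print each node
# named in the exec_host line once.
exec_nodes() {
    awk -F' = ' '/^[ \t]*exec_host = / {
        n = split($2, parts, "+")
        for (i = 1; i <= n; i++) {
            sub(/\/.*/, "", parts[i])          # drop the "/slot*count" suffix
            if (!seen[parts[i]]++) print parts[i]
        }
    }'
}

# Illustrative sample; on the cluster use:  qstat -f [JOB ID] | exec_nodes
sample='Job Id: 123456.pbsserver
    Job_Name = matlab_test
    job_state = R
    exec_host = node0001/0*4'
printf '%s\n' "$sample" | exec_nodes
```

Once logged in to the node, a %CPU value well above 100 for the MATLAB process in top suggests that several threads are active.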

matrix_multiplication.m

n = 5000;       % set matrix size
A = rand(n);    % create random matrix
B = rand(n);    % create another random matrix
tic             % start the timer
C = A * B;      % matrix multiplication
time = toc;     % read the elapsed time (in seconds)
disp(['Processing time: ' num2str(time) 's']);   % display running time (unit: second)

matlab_job_submit.pbs (job submitting script)

#!/bin/bash
#PBS -q default
#PBS -N matlab_test
##change the ncpus number to run the job in implicit parallel
#PBS -l select=1:mem=4gb:ncpus=1
#PBS -l walltime=00:30:00
##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"
#PBS -W group_list=x-ccast-prj

module load matlab/R2020a

cd $PBS_O_WORKDIR

##change the input filename as needed
matlab -nodesktop -nodisplay -r "run matrix_multiplication.m"

exit 0

To submit this job:
$ qsub matlab_job_submit.pbs

To check the job state (It may show nothing if the job has completed):
$ qstat -u $USER 

To view the error:
$ cat matlab_test.e[JOB ID]

To view the output:
$ cat matlab_test.o[JOB ID]

The expected output is:

                            < M A T L A B (R) >
                  Copyright 1984-2020 The MathWorks, Inc.
              R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)
                                May 27, 2020
To get started, type doc.
For product information, visit www.mathworks.com.

Processing time: 6.8422s

To observe the performance of implicit parallelism with different numbers of cores, you might get results like:

Number of cores (ncpus)   1      2      3      4      5      6       7      8
Processing time           6.84s  2.56s  1.88s  1.66s  1.33s  1.174s  0.98s  0.99s

The results show the relationship between the processing time and the number of cores allocated. The running time does not decrease monotonically as the number of cores increases (here, 8 cores are no faster than 7), and the gain from each additional core shrinks. Choose the number of cores according to the needs of your project. In this example, 2 or 3 cores might be better choices.
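
One way to compare such timings is to compute the speedup T(1)/T(n) and the parallel efficiency, which is the speedup divided by n. The following is a small sketch using awk on the example timings above; the helper name speedup_table is hypothetical.

```shell
#!/bin/bash
# Timings copied from the example results above: "ncpus seconds" per line.
timings='1 6.84
2 2.56
3 1.88
4 1.66
5 1.33
6 1.174
7 0.98
8 0.99'

# Hypothetical helper: print "ncpus speedup efficiency" for each input line,
# where speedup = T(1) / T(n) and efficiency = speedup / n.
speedup_table() {
    awk 'NR == 1 { t1 = $2 }
         { s = t1 / $2; printf "%d %.2f %.2f\n", $1, s, s / $1 }'
}

printf '%s\n' "$timings" | speedup_table
```

For these particular timings, the efficiency is highest at 2 and 3 cores and falls as more cores are added.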

2.2 Explicit parallelism


Explicit parallelism is characterized by the presence of explicit constructs in the programming language, aimed at describing the way in which the parallel computation will take place. In explicit parallelism, several instances of MATLAB simultaneously execute a single MATLAB command or function.

Parallel Computing Toolbox provides explicit parallelism and lets you use parallel-enabled functions in MATLAB and other toolboxes. For more details on running your code with explicit parallelism, see Choose a Parallel Computing Solution.

A common way to initiate a parallel computation in MATLAB is to use a parfor loop, and our examples are based on it. parfor executes for-loop iterations in parallel on workers in a parallel pool. To learn how parfor can help increase your throughput, see Decide When to Use parfor.

Example 4: “Getting eigenvalues of square matrices” 

This job performs N trials of computing the largest eigenvalue of an M-by-M random matrix using parfor and outputs its processing time.

parfor.m

% Performs N trials of computing the largest eigenvalue for an M-by-M random matrix
gcp;                 % open parallel pool if none is open
M = 500;             % number of rows and columns of each matrix
N = 12000;           % number of trials
tic;                 % start the timer
a = zeros(N,1);      % initialize output vector
parfor i = 1:N       % run the iterations in parallel on the pool workers
  a(i) = max(eig(rand(M)));      % vector of largest eigenvalues
end
time = toc;          % read the elapsed time (in seconds)
disp(['Parallel processing time: ' num2str(time) 's']);   % display running time
poolobj = gcp('nocreate');       % get current parallel pool
delete(poolobj);                 % shut down parallel pool

matlab_job_submit.pbs (job submitting script)

#!/bin/bash
#PBS -q default
#PBS -N matlab_parallel_test
#PBS -l select=1:mem=8gb:ncpus=5
#PBS -l walltime=00:30:00
##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"
#PBS -W group_list=x-ccast-prj

module load matlab/R2020a

cd $PBS_O_WORKDIR

##change the input filename as needed
matlab -nodesktop -nosplash -nodisplay -r "run parfor.m"

exit 0

To submit this job:
$ qsub matlab_job_submit.pbs

To check the job state (It may show nothing if the job has completed):
$ qstat -u $USER 

To view the error:
$ cat matlab_parallel_test.e[JOB ID]

To view the output:
$ cat matlab_parallel_test.o[JOB ID]

The expected output is:

                            < M A T L A B (R) >
                  Copyright 1984-2020 The MathWorks, Inc.
              R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)
                                May 27, 2020
To get started, type doc.
For product information, visit www.mathworks.com.

Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 5).
Parallel processing time: 33.9835s
Parallel pool using the 'local' profile is shutting down.

To observe the performance of explicit parallelism with different numbers of cores, you might get results like:

Number of workers (ncpus)   1       2      3      4      5
Processing time             192.5s  72.8s  65.5s  49.4s  34.0s
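
A series of runs like this can be prepared by generating one copy of the job script per ncpus value. The following is a minimal sketch, assuming the select line has the form used in this document; the helper name with_ncpus and the file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print a copy of a job script with a different ncpus value.
# Usage: with_ncpus TEMPLATE N > NEW_SCRIPT
with_ncpus() {
    sed -E "s/^(#PBS -l select=.*ncpus=)[0-9]+/\1$2/" "$1"
}

# Demonstration with a throwaway template instead of a real job script.
demo_dir=$(mktemp -d)
printf '%s\n' '#!/bin/bash' '#PBS -l select=1:mem=8gb:ncpus=5' > "$demo_dir/template.pbs"
for n in 1 2 3 4 5; do
    with_ncpus "$demo_dir/template.pbs" "$n" > "$demo_dir/job_ncpus$n.pbs"
    # On the cluster, each copy could then be submitted:
    #   qsub "$demo_dir/job_ncpus$n.pbs"
done
grep -h 'ncpus=' "$demo_dir"/job_ncpus*.pbs
```

All copies share the same job name, so their output files are told apart by the job ID in the file name.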

Example 5: “for-loop and parfor-loop” 

This example job also performs N trials of computing the largest eigenvalue of an M-by-M random matrix, but it illustrates two further points. First, it compares the processing time of a for-loop with that of a parfor-loop. Second, it shows that when your script calls functions defined in other files, you need to use addpath to add the working directory to your MATLAB search path.

compare.m

% set pathToData with your current work directory
% pathToData = [CURRENT_WORKDIR];
addpath(pathToData);
tic; a1 = ex_for(500,1200); t1 = toc;      % calculate a1 using for-loop
gcp;                                       % open parallel pool if none is open
tic; a2 = ex_parfor(500,1200); t2 = toc;   % calculate a2 using parfor-loop
% Compare processing times
disp(['For-loop processing time:   ' num2str(t1) 's'])
disp(['Parfor-loop processing time: ' num2str(t2) 's'])
% Shut down the parallel pool
poolobj = gcp('nocreate');                 % get current parallel pool
delete(poolobj);                           % shut down parallel pool
% Copyright 2010 - 2014 The MathWorks, Inc.

ex_parfor.m

function a = ex_parfor(M, N)
a = zeros(N,1);
parfor i = 1:N
    a(i) = max(eig(rand(M)));
end

ex_for.m

function a = ex_for(M, N)
a = zeros(N,1);
for i = 1:N
    a(i) = max(eig(rand(M)));
end

matlab_job_submit.pbs (job submitting script)

#!/bin/bash
#PBS -q default
#PBS -N matlab_parallel_test
#PBS -l select=1:mem=8gb:ncpus=2
#PBS -l walltime=00:30:00
##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"
#PBS -W group_list=x-ccast-prj

module load matlab/R2020a

cd $PBS_O_WORKDIR

##pass $PBS_O_WORKDIR to MATLAB to add directories into MATLAB search path
sed -i "1c pathToData = '$PBS_O_WORKDIR';" compare.m

##change the input filename as needed
matlab -nodesktop -nosplash -nodisplay -r "run compare.m"

exit 0
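
The sed line in the script above uses the “c” (change) command to overwrite line 1 of compare.m with an assignment to pathToData. The sketch below shows the effect on a throwaway stand-in file; it assumes GNU sed, and the helper name set_path_line and the scratch path are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper mirroring the sed line in the job script:
# replace line 1 of FILE with an assignment of DIR to pathToData.
# Usage: set_path_line FILE DIR   (relies on GNU sed for -i and one-line "c")
set_path_line() {
    sed -i "1c pathToData = '$2';" "$1"
}

# Demonstration on a throwaway stand-in for compare.m.
demo_dir=$(mktemp -d)
printf '%s\n' '% set pathToData with your current work directory' \
              '% pathToData = [CURRENT_WORKDIR];' \
              'addpath(pathToData);' > "$demo_dir/compare.m"

# The path below is made up; in the job script it would be $PBS_O_WORKDIR.
set_path_line "$demo_dir/compare.m" "/gpfs1/scratch/someuser/matlab"
cat "$demo_dir/compare.m"
```

Because the command always overwrites line 1, resubmitting the job is harmless, but the first line of compare.m must stay reserved for this assignment.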

To submit this job:
$ qsub matlab_job_submit.pbs

To check the job state (It may show nothing if the job has completed):
$ qstat -u $USER 

To view the error:
$ cat matlab_parallel_test.e[JOB ID]

To view the output:
$ cat matlab_parallel_test.o[JOB ID]

The expected output is:

                            < M A T L A B (R) >
                  Copyright 1984-2020 The MathWorks, Inc.
              R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)
                                May 27, 2020
To get started, type doc.
For product information, visit www.mathworks.com.

Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 2).
For-loop processing time:   141.6204s 
Parfor-loop processing time: 72.8218s
Parallel pool using the 'local' profile is shutting down.

This job compares the processing time of a for-loop and a parfor-loop under the same workload. The results show that, with 2 workers, the parfor-loop takes roughly half the time of the for-loop.

3. Running MATLAB on GPUs

A GPU is designed to render high-resolution images and video quickly. Because GPUs can perform parallel operations on multiple sets of data, they are also commonly used for non-graphical tasks such as machine learning and scientific computation. To learn when to run MATLAB on a GPU, see Accelerate Your Code by Running It on a GPU. To get started with MATLAB GPU computing, see Run MATLAB Functions on a GPU.

If all the functions that you want to use are supported on the GPU, you can simply use gpuArray to transfer input data to the GPU, and call gather to retrieve the output data from the GPU. In other words, hundreds of functions in MATLAB® and other toolboxes run automatically on a GPU if you supply a gpuArray argument. GPU-enabled functions include the discrete Fourier transform (fft), matrix multiplication (mtimes), left matrix division (mldivide), and hundreds of others. For more information, see Check GPU-Supported Functions and Run MATLAB Functions on a GPU.

Example 6: “Filter a matrix” 

This job filters a matrix on a GPU using the filter function and reports the processing time on the GPU. Filters are data processing techniques that can smooth out high-frequency fluctuations in data or remove periodic trends of a specific frequency from data. In this MATLAB job, the filter function filters a matrix and returns the filtered data for each column.

Note: For GPU, whenever you call functions with at least one gpuArray as a data input argument, the function executes on the GPU. To learn more about when a function runs on GPU or CPU, see Special Conditions for gpuArray Inputs.

matrix_filter.m

A = magic(5000);                  % magic(n) returns an n-by-n matrix
f = ones(1,20)/20;                % moving-average filter coefficients (window of 20)
% Filter a matrix on GPU
AonGPU = gpuArray(A);             % create an array stored on the GPU
tic;                              % start the timer
BonGPU = filter(f, 1, AonGPU);    % do filter operation on GPU
C = gather(BonGPU);               % convert back to a numeric array on the CPU
wait(gpuDevice)                   % wait for GPU calculation to complete
tCompGpu = toc;                   % read the elapsed time (in seconds)
disp([' Processing time on GPU: ' num2str(tCompGpu) 's'])   % display processing time
% Copyright 2014 The MathWorks, Inc.

matlab_job_submit.pbs (job submitting script)

#!/bin/bash
#PBS -q gpus
#PBS -N matlab_gpu_test
#PBS -l select=1:ncpus=1:mem=4gb:ngpus=1
#PBS -l walltime=00:20:00
##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"
#PBS -W group_list=x-ccast-prj

module load matlab/R2020a

cd $PBS_O_WORKDIR

##change the input filename as needed
matlab -nodesktop -nodisplay -r "run matrix_filter.m"

exit 0

To submit this job:
$ qsub matlab_job_submit.pbs

To check the job state (It may show nothing if the job has completed):
$ qstat -u $USER 

To view the error:
$ cat matlab_gpu_test.e[JOB ID]

To view the output:
$ cat matlab_gpu_test.o[JOB ID]

The expected output is:

                            < M A T L A B (R) >
                  Copyright 1984-2020 The MathWorks, Inc.
              R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)
                                May 27, 2020

To get started, type doc.
For product information, visit www.mathworks.com.
Processing time on GPU: 3.9936s

From the output, you can see the computation time of filter function on GPU. To learn more about when to run MATLAB on GPU, you can browse Accelerate Your Code by Running It on a GPU.

For more information on parallel computing, see the MATLAB website and Tutorials from MATLAB.

4. Miscellaneous: Q&A


How to solve errors about “Failed to open/locate matlabpool”?

Some MATLAB configuration files are stored in a hidden directory named .matlab in your home directory (/gpfs1/home/[USERNAME]). You can inspect it with the following commands:
$ cd $HOME
$ ls -al
$ cd .matlab

If you get errors about “Failed to open/locate matlabpool”, renaming the .matlab directory may solve the problem; the directory should be re-created automatically for you.
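
The rename can be done so that the old settings are kept as a backup. The following is a minimal sketch; the helper name reset_matlab_prefs and the backup naming scheme are assumptions, not part of MATLAB or CCAST tooling.

```shell
#!/bin/bash
# Hypothetical helper: move a MATLAB preferences directory aside, keeping a backup.
# Usage: reset_matlab_prefs [DIR]   (DIR defaults to $HOME/.matlab)
reset_matlab_prefs() {
    local dir="${1:-$HOME/.matlab}"
    if [ -d "$dir" ]; then
        local backup
        backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
        mv "$dir" "$backup" && printf 'moved to %s\n' "$backup"
    else
        printf 'nothing to do: %s does not exist\n' "$dir"
    fi
}

# Demonstration on a throwaway directory instead of the real $HOME/.matlab.
demo_home=$(mktemp -d)
mkdir "$demo_home/.matlab"
reset_matlab_prefs "$demo_home/.matlab"
```

If the error disappears afterwards, the backup directory can be deleted.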

When should you use the addpath function in your MATLAB script?

When your MATLAB script calls functions or reads input files that are stored in separate files, you should add their location to the MATLAB search path, as discussed in Example 5: “for-loop and parfor-loop”.

Keywords: ccast hpc thunder matlab parallelism "implicit parallelism"
Doc ID: 107852
Owner: Khang H.
Group: IT Knowledge Base
Created: 2020-12-15 12:12 CDT
Updated: 2021-07-06 10:14 CDT