# Running MATLAB on HPC Clusters

A tutorial on running serial and parallel MATLAB jobs on CPUs and GPUs

MATLAB is a high-level programming language and numerical computing environment developed by MathWorks. It is used for numerical computation, visualization, and programming.

MATLAB is available to all CCAST users who are affiliated with North Dakota State University. There are multiple MATLAB versions installed on CCAST systems. On Thunder, check all available software modules by typing:

*$ module avail*

NOTE: Terminal commands are denoted by inline code prefixed with *$*, such as *$ module avail* in the above example. Variable inputs are denoted by capital letters in brackets, e.g., *[JOB ID]*.

There are two ways to run MATLAB on Thunder: in batch mode, or through Open OnDemand. The latter allows you to run MATLAB interactively and graphically; see Open OnDemand - How to Access for instructions. This document covers only batch mode, for serial and parallel MATLAB jobs on CPUs and GPUs.

A MATLAB job needs the MATLAB scripts that you intend to run and a job submission script that submits them to the job scheduler (PBS Pro on Thunder). See the CCAST User Guide for more information on running jobs on CCAST systems in general.

**Example files**

All the source code and job submission scripts discussed in this document can be found in the following compressed file on Thunder:

*/gpfs1/projects/ccastest/training/examples/MATLAB_Tutorial_examples.tar.gz*

To copy the examples to your SCRATCH directory (*/gpfs1/scratch/[USERNAME]*):

*$ cp /gpfs1/projects/ccastest/training/examples/MATLAB_Tutorial_examples.tar.gz $SCRATCH*

To uncompress the *tar.gz* file:

*$ cd $SCRATCH*

*$ tar -xvf MATLAB_Tutorial_examples.tar.gz*

In the following, we examine a few MATLAB examples more specifically:

### 1. Running serial jobs on CPUs

__Example 1__: “Hello World !”

In this simple example, the system prints the phrase “Hello World !” in the output file.

The MATLAB script:

*HelloWorld.m*

*% display information*

*fprintf('Hello World !\n\n');*

The job submission script:

*matlab_job.pbs*

*#!/bin/bash*

*#PBS -q default*

*#PBS -N matlab_test*

*##serial jobs: only 1 processor core is requested*

*#PBS -l select=1:mem=2gb:ncpus=1*

*#PBS -l walltime=00:10:00*

*##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"*

*#PBS -W group_list=x-ccast-prj*

*module load matlab/R2020a*

*cd $PBS_O_WORKDIR*

*##change the input filename as needed*

*matlab -nodesktop -nodisplay -r "run HelloWorld.m"*

*exit 0*

On Thunder, open this file in a UNIX/Linux text editor and edit the line “*#PBS -W*” so that it contains your project group name. If you do not remember your project group name, execute the command “*id*” or “*groups*” when you are on Thunder.

To submit the job:

*$ qsub matlab_job.pbs*

To check the status of the job (It may show nothing if the job has completed):

*$ qstat -u $USER*

To view the error file:

*$ cat matlab_test.e[JOB ID]*

To view the output file:

*$ cat matlab_test.o[JOB ID]*

The expected output is:

*< M A T L A B (R) >*

*Copyright 1984-2020 The MathWorks, Inc.*

*R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)*

*May 27, 2020*

*To get started, type doc.*

*For product information, visit www.mathworks.com.*

*Hello World !*
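The submit, check, and view steps above can be strung together in a small shell sketch. It assumes that *qsub* prints a job identifier of the form *[sequence number].[server name]* and that the output and error files are named *[job name].o[sequence number]* and *[job name].e[sequence number]*, as in the *cat* commands above. The identifier *12345.pbsserver* is a made-up placeholder, and *job_files* is a helper name introduced here for illustration.

```shell
#!/bin/sh
# job_files JOB_NAME JOB_ID: print the output and error file names for a job.
# JOB_ID is what qsub prints, e.g. "12345.pbsserver" (hypothetical value).
job_files() {
    seq="${2%%.*}"              # keep only the part before the first dot
    echo "$1.o$seq $1.e$seq"
}

# Dry run with a made-up identifier, so this also works off the cluster.
job_files matlab_test 12345.pbsserver

# On Thunder (assumption: run from the directory holding matlab_job.pbs):
#   jobid=$(qsub matlab_job.pbs)
#   qstat -u "$USER"
#   cat $(job_files matlab_test "$jobid")
```

Capturing the identifier at submission time avoids having to copy it by hand from the *qstat* listing.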

__Example 2__: “Summing the elements of a vector”

In this example, the job creates a vector before the loop, then indexes into it inside the loop to accumulate the sum of its elements. The sum is printed to the output file.

The MATLAB script:

*sum.m*

*z = 0;*

*xAll = 0:0.1:10000; % create a vector outside of a loop*

*for i= 1:length(xAll)*

*x = xAll(i);*

*z = z + x;*

*end*

*fprintf('%f\n', z);*

*% Copyright 2010 - 2014 The MathWorks, Inc.*

The job submission script:

*matlab_job.pbs*

*#!/bin/bash*

*#PBS -q default*

*#PBS -N matlab_test*

*##serial jobs: only 1 processor core is requested*

*#PBS -l select=1:mem=2gb:ncpus=1*

*#PBS -l walltime=00:10:00*

*##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"*

*#PBS -W group_list=x-ccast-prj*

*module load matlab/R2020a*

*cd $PBS_O_WORKDIR*

*##change the input filename as needed*

*matlab -nodesktop -nodisplay -r "run sum.m"*

*exit 0*

The above script requests one processor core on a single node, with 2 GB of RAM, for at most 10 minutes. After editing the job submission script as needed (the line “*#PBS -W*”), submit the job:

*$ qsub matlab_job.pbs*

To check the status of the job (It may show nothing if the job has completed):

*$ qstat -u $USER*

To view the error file:

*$ cat matlab_test.e[JOB ID]*

To view the output file:

*$ cat matlab_test.o[JOB ID]*

The expected output is:

*< M A T L A B (R) >*

*Copyright 1984-2020 The MathWorks, Inc.*

*R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)*

*May 27, 2020*

*To get started, type doc.*

*For product information, visit www.mathworks.com.*

*500005000.000000*
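As a sanity check on the expected value, note that the vector *0:0.1:10000* holds the numbers k/10 for k = 0, 1, ..., 100000, so by the arithmetic-series formula its sum is 100000 × 100001 / 2, divided by 10. The sketch below evaluates that closed form with awk; the function name *sum_of_tenths* is introduced here purely for illustration and is not part of the tutorial files.

```shell
#!/bin/sh
# sum_of_tenths K: sum of k/10 for k = 0..K, i.e. K*(K+1)/2 divided by 10.
sum_of_tenths() {
    awk -v k="$1" 'BEGIN { printf "%.6f\n", k * (k + 1) / 20 }'
}

# K = 100000 corresponds to the vector 0:0.1:10000 used in sum.m.
sum_of_tenths 100000
```

The closed form gives 500005000.000000, which agrees with the value printed by the job.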

### 2. Running parallel jobs on CPUs

MATLAB supports two kinds of parallelism: implicit multithreaded parallelism and explicit parallelism.

#### 2.1 Implicit parallelism

MATLAB contains internal mechanisms that enable some code to run much faster by automatically parallelizing arithmetic and logical operations on data. This is called implicit parallelism (or built-in multithreading) since we do not need to explicitly tell MATLAB to parallelize the operations. Implicit parallelization relies on the fact that many operations are independent of each other and can therefore be processed in parallel.

For implicit parallel computation, a vector operation is a necessary trigger, but it is not sufficient. The application or algorithm and the amount of computation also help MATLAB determine whether an operation will run with multiple threads. The MathWorks documentation provides more information on the conditions for implicit parallelism. Such parallelism is implemented by multithreaded computations that are executed within a single node. Therefore, on Thunder, you should select only one node to enable implicit parallelism.

__Example 3__: “Matrix Multiplication”

This job multiplies one matrix by another. In this example, implicit parallelism is triggered. You can observe this by logging in to the compute node and checking the CPU usage. On Thunder, to find the name of the compute node, execute:

*$ qstat -f [JOB_ID]*

To log into the compute node (You can only log into the compute node when your job is still running on that node. If your job is finished, you cannot log into it.):

*$ ssh node[NODE_ID]*

For example, if your job is running on *node0001*, use the command:

*$ ssh node0001*

Then execute the command “*top*”:

*$ top*

This lists the CPU utilization on the compute node, from which you can tell whether your job is running in parallel.
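Finding the node name in the *qstat -f* listing can be scripted. The sketch below assumes the listing contains a line of the form *exec_host = [node name]/...*, and extracts the first node name from it. The sample line is a hypothetical stand-in for live output, and *first_exec_host* is a helper name introduced here.

```shell
#!/bin/sh
# first_exec_host: read `qstat -f` output on stdin and print the first host
# named on the "exec_host" line (the text before the first "/").
first_exec_host() {
    awk -F' = ' '/exec_host/ { split($2, parts, "/"); print parts[1]; exit }'
}

# Hypothetical sample of the relevant line, used instead of a live qstat call.
sample='    exec_host = node0001/0*4'
printf '%s\n' "$sample" | first_exec_host

# On Thunder (assumption: the job is still running):
#   node=$(qstat -f [JOB_ID] | first_exec_host)
#   ssh "$node"
```

For the sample line, the function prints *node0001*, which is the name to pass to *ssh*.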

*matrix_multiplication.m*

*n = 5000; % set matrix size*

*A = rand(n); % create random matrix*

*B = rand(n); % create another random matrix*

*tic % calculate the elapsed time using tic and toc*

*C = A * B; % matrix multiplication*

*time=toc; % calculate the elapsed time using tic and toc*

*disp(['Processing time: ' num2str(time) 's']); % display running time (unit: second)*

*matlab_job_submit.pbs* (job submission script)

*#!/bin/bash*

*#PBS -q default*

*#PBS -N matlab_test*

*##change the ncpus number to run the job in implicit parallel*

*#PBS -l select=1:mem=4gb:ncpus=1*

*#PBS -l walltime=00:30:00*

*##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"*

*#PBS -W group_list=x-ccast-prj*

*module load matlab/R2020a*

*cd $PBS_O_WORKDIR*

*##change the input filename as needed*

*matlab -nodesktop -nodisplay -r "run matrix_multiplication.m"*

*exit 0*

To submit this job:

*$ qsub matlab_job_submit.pbs*

To check the job state (It may show nothing if the job has completed):

*$ qstat -u $USER*

To view the error:

*$ cat matlab_test.e[JOB ID]*

To view the output:

*$ cat matlab_test.o[JOB ID]*

The expected output is:

*< M A T L A B (R) >*

*Copyright 1984-2020 The MathWorks, Inc.*

*R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)*

*May 27, 2020*

*To get started, type doc.*

*For product information, visit www.mathworks.com.*

*Processing time: 6.8422s*

To observe how implicit parallelism performs, run the job with different numbers of cores. You might get results like:

| Number of cores (*ncpus*) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Processing time | 6.84s | 2.56s | 1.88s | 1.66s | 1.33s | 1.174s | 0.98s | 0.99s |

The results show the relationship between the processing time and the number of cores allocated. The running time does not decrease monotonically as cores are added: in this run, 8 cores (0.99s) were no faster than 7 cores (0.98s). Choose the number of cores according to the needs of your project. In this example, 2 cores or 3 cores might be better choices, since most of the time reduction is already obtained with them.
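One way to weigh such timings is to compute the speedup (single-core time divided by N-core time) and the parallel efficiency (speedup divided by N). The sketch below does this with awk for the timings listed above; the function names *speedup* and *efficiency* are introduced here for illustration, and the numbers are simply copied from this example run.

```shell
#!/bin/sh
# speedup T1 TN: single-core time divided by N-core time, two decimals.
speedup() {
    awk -v t1="$1" -v tn="$2" 'BEGIN { printf "%.2f\n", t1 / tn }'
}

# efficiency T1 TN N: speedup divided by the number of cores, two decimals.
efficiency() {
    awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN { printf "%.2f\n", t1 / tn / n }'
}

# Timings in seconds for ncpus = 1..8, copied from the results above.
t1=6.84
n=1
for tn in 6.84 2.56 1.88 1.66 1.33 1.174 0.98 0.99; do
    echo "ncpus=$n speedup=$(speedup "$t1" "$tn") efficiency=$(efficiency "$t1" "$tn" "$n")"
    n=$((n + 1))
done
```

With these timings, going from 7 to 8 cores yields no further gain, which is consistent with the observation that the running time does not keep falling as cores are added.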

#### 2.2 Explicit parallelism

Explicit parallelism is characterized by the presence of explicit constructs in the programming language, aimed at describing the way in which the parallel computation will take place. In explicit parallelism, several instances of MATLAB simultaneously execute a single MATLAB command or function.

Parallel Computing Toolbox provides explicit parallelism and lets you use parallel-enabled functions in MATLAB and other toolboxes. For more details on running your code with explicit parallelism, see Choose a Parallel Computing Solution.

A common way to initiate a parallel computation in MATLAB is to use a *parfor* loop, and our examples are based on it. *parfor* executes for-loop iterations in parallel on workers in a parallel pool. Regarding how *parfor* can help increase your throughput, see Decide When to Use *parfor*.

__Example 4__: “Getting eigenvalues of square matrices”

This job performs N trials of computing the largest eigenvalue for an M-by-M random matrix using *parfor* and outputs its processing time.

*parfor.m*

*% Performs N trials of computing the largest eigenvalue for an M-by-M random matrix*

*gcp; % open parallel pool if none is open*

*M=500; % number of rows and columns of each matrix*

*N=12000; % number of trials*

*tic; % calculate the elapsed time using tic and toc*

*a = zeros(N,1); % initialize output vector*

*parfor i = 1:N % use parallel processing by running parfor in a parallel pool*

*a(i) = max(eig(rand(M))); % vector of largest eigenvalues*

*end*

*time = toc; % calculate the elapsed time using tic and toc*

*disp(['Parallel processing time: ' num2str(time) 's']); % display running time*

*poolobj = gcp('nocreate'); % Get current parallel pool*

*delete(poolobj); % Shutting down parallel pool*

*matlab_job_submit.pbs* (job submission script)

*#!/bin/bash*

*#PBS -q default*

*#PBS -N matlab_parallel_test*

*#PBS -l select=1:mem=8gb:ncpus=5*

*#PBS -l walltime=00:30:00*

*##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"*

*#PBS -W group_list=x-ccast-prj*

*module load matlab/R2020a*

*cd $PBS_O_WORKDIR*

*##change the input filename as needed*

*matlab -nodesktop -nosplash -nodisplay -r "run parfor.m"*

*exit 0*

To submit this job:

*$ qsub matlab_job_submit.pbs*

To check the job state (It may show nothing if the job has completed):

*$ qstat -u $USER*

To view the error:

*$ cat matlab_parallel_test.e[JOB ID]*

To view the output:

*$ cat matlab_parallel_test.o[JOB ID]*

The expected output is:

*< M A T L A B (R) >*

*Copyright 1984-2020 The MathWorks, Inc.*

*R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)*

*May 27, 2020*

*To get started, type doc.*

*For product information, visit www.mathworks.com.*

*Starting parallel pool (parpool) using the 'local' profile ...*

*Connected to the parallel pool (number of workers: 5).*

*Parallel processing time: 33.9835s*

*Parallel pool using the 'local' profile is shutting down.*

To observe how explicit parallelism performs, run the job with different numbers of cores. You might get results like:

| Number of workers (*ncpus*) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Processing time | 192.5s | 72.8s | 65.5s | 49.4s | 34.0s |

__Example 5__: “for-loop and parfor-loop”

This example job also performs N trials of computing the largest eigenvalue for an M-by-M random matrix, but it addresses two further points. First, it compares the processing time of a for-loop with that of a parfor-loop. Second, when your code calls functions defined in other files, you need to use *addpath* to add the working directory to your MATLAB search path.

*compare.m*

*% set pathToData with your current work directory*

*% pathToData = [CURRENT_WORKDIR];*

*addpath(pathToData);*

*tic; a1 = ex_for(500,1200); t1 = toc; % calculate a1 using for-loop*

*gcp; % open parallel pool if none is open*

*tic; a2 = ex_parfor (500,1200); t2 = toc; % calculate a2 using parfor-loop*

*% Compare processing times*

*disp(['For-loop processing time: ' num2str(t1) 's'])*

*disp(['Parfor-loop processing time: ' num2str(t2) 's'])*

*% Shutting down parallel pool*

*poolobj = gcp('nocreate'); % get current parallel pool*

*delete(poolobj); % shutting down parallel pool*

*% Copyright 2010 - 2014 The MathWorks, Inc.*

*ex_parfor.m*

*function a = ex_parfor(M, N)*

*a = zeros(N,1);*

*parfor i = 1:N*

*a(i) = max(eig(rand(M)));*

*end*

*ex_for.m*

*function a = ex_for(M, N)*

*a = zeros(N,1);*

*for i = 1:N*

*a(i) = max(eig(rand(M)));*

*end*

*matlab_job_submit.pbs* (job submission script)

*#!/bin/bash*

*#PBS -q default*

*#PBS -N matlab_parallel_test*

*#PBS -l select=1:mem=8gb:ncpus=2*

*#PBS -l walltime=00:30:00*

*##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"*

*#PBS -W group_list=x-ccast-prj*

*module load matlab/R2020a*

*cd $PBS_O_WORKDIR*

*##pass $PBS_O_WORKDIR to MATLAB to add directories into MATLAB search path*

*sed -i "1c pathToData = '$PBS_O_WORKDIR';" compare.m*

*##change the input filename as needed*

*matlab -nodesktop -nosplash -nodisplay -r "run compare.m"*

*exit 0*
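The *sed* line in the script above rewrites the first line of *compare.m* so that *pathToData* is defined before *addpath* uses it. The sketch below shows the effect on a throwaway file. It assumes GNU sed (for *-i* and the one-line *1c* form); */tmp/my_matlab_job* is a hypothetical stand-in for *$PBS_O_WORKDIR*, and *set_path_line* is a helper name introduced here.

```shell
#!/bin/sh
# set_path_line FILE DIR: replace line 1 of FILE with a pathToData assignment.
# "1c TEXT" swaps line 1 for TEXT; -i edits the file in place (GNU sed syntax).
set_path_line() {
    sed -i "1c pathToData = '$2';" "$1"
}

# Demo on a throwaway copy; /tmp/my_matlab_job stands in for $PBS_O_WORKDIR,
# which PBS defines only inside a running job (hypothetical path).
demo=$(mktemp)
printf '%s\n' '% set pathToData with your current work directory' \
              '% pathToData = [CURRENT_WORKDIR];' \
              'addpath(pathToData);' > "$demo"
set_path_line "$demo" /tmp/my_matlab_job
head -n 3 "$demo"
rm -f "$demo"
```

Because the command always rewrites line 1, resubmitting the job from a different directory simply overwrites the previous assignment.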

To submit this job:

*$ qsub matlab_job_submit.pbs*

To check the job state (It may show nothing if the job has completed):

*$ qstat -u $USER*

To view the error:

*$ cat matlab_parallel_test.e[JOB ID]*

To view the output:

*$ cat matlab_parallel_test.o[JOB ID]*

The expected output is:

*< M A T L A B (R) >*

*Copyright 1984-2020 The MathWorks, Inc.*

*R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)*

*May 27, 2020*

*To get started, type doc.*

*For product information, visit www.mathworks.com.*

*Starting parallel pool (parpool) using the 'local' profile ...*

*Connected to the parallel pool (number of workers: 2).*

*For-loop processing time: 141.6204s*

*Parfor-loop processing time: 72.8218s*

*Parallel pool using the 'local' profile is shutting down.*

This job compares the processing time of a *for*-loop and a *parfor*-loop under the same workload. With 2 workers, the *parfor*-loop takes about half as long as the *for*-loop (72.8s versus 141.6s).

### 3. Running MATLAB on GPUs

A GPU is designed to quickly render high-resolution images and video concurrently. Because GPUs can perform parallel operations on multiple sets of data, they are also commonly used for non-graphical tasks such as machine learning and scientific computation. To learn when to run MATLAB on a GPU, see Accelerate Your Code by Running It on a GPU. To get started with MATLAB GPU computing, see Run MATLAB Functions on a GPU.

If all the functions that you want to use are supported on the GPU, you can simply use *gpuArray* to transfer input data to the GPU, and call *gather* to retrieve the output data from the GPU. In other words, hundreds of functions in MATLAB® and other toolboxes run automatically on a GPU if you supply a *gpuArray* argument. GPU-enabled functions include the discrete Fourier transform (*fft*), matrix multiplication (*mtimes*), left matrix division (*mldivide*), and hundreds of others. For more information, see Check GPU-Supported Functions and Run MATLAB Functions on a GPU.

__Example 6__: “Filter a matrix”

This job filters a matrix on a GPU using the *filter* function and returns the processing time on the GPU. Filters are data processing techniques that can smooth out high-frequency fluctuations in data or remove periodic trends of a specific frequency from data. In this MATLAB job, the *filter* function filters a matrix and returns the filtered data for each column.

Note: On a GPU, whenever you call a function with at least one *gpuArray* as a data input argument, the function executes on the GPU. To learn more about when a function runs on GPU or CPU, see Special Conditions for gpuArray Inputs.

*gpu.m*

*A = magic(5000); % magic(n) returns an n-by-n matrix*

*f = ones(1,20)/20; % create array of all ones*

*% Filter a matrix on GPU*

*AonGPU = gpuArray(A); % create an array stored on the GPU*

*tic; % calculate the elapsed time using tic and toc*

*BonGPU = filter(f, 1, AonGPU); % do filter operation on GPU*

*C=gather(BonGPU); % convert back to a numeric array on the CPU*

*wait(gpuDevice) % wait for GPU calculation to complete*

*tCompGpu = toc; % calculate the elapsed time using tic and toc*

*disp([' Processing time on GPU: ' num2str(tCompGpu) 's']) % display processing time*

*% Copyright 2014 The MathWorks, Inc.*

*matlab_job_submit.pbs* (job submission script)

*#!/bin/bash*

*#PBS -q gpus*

*#PBS -N matlab_gpu_test*

*#PBS -l select=1:ncpus=1:mem=4gb:ngpus=1*

*#PBS -l walltime=00:20:00*

*##replace "x-ccast-prj" below with "x-ccast-prj-[your project group name]"*

*#PBS -W group_list=x-ccast-prj*

*##load CUDAToolkit for GPU usage*

*module load CUDAToolkit/10.0*

*module load matlab/R2020a*

*cd $PBS_O_WORKDIR*

*##change the input filename as needed*

*matlab -nodesktop -nodisplay -r "run gpu.m"*

*exit 0*

To submit this job:

*$ qsub matlab_job_submit.pbs*

To check the job state (It may show nothing if the job has completed):

*$ qstat -u $USER*

To view the error:

*$ cat matlab_gpu_test.e[JOB ID]*

To view the output:

*$ cat matlab_gpu_test.o[JOB ID]*

The expected output is:

*< M A T L A B (R) >*

*Copyright 1984-2020 The MathWorks, Inc.*

*R2020a Update 3 (9.8.0.1396136) 64-bit (glnxa64)*

*May 27, 2020*

*To get started, type doc.*

*For product information, visit www.mathworks.com.*

*Processing time on GPU: 3.9936s*

From the output, you can see the computation time of the *filter* function on the GPU. To learn more about when to run MATLAB on a GPU, see Accelerate Your Code by Running It on a GPU.

For more information on parallel computing, see the MATLAB website and Tutorials from MATLAB.

### 4. Miscellaneous: Q&A

**How to solve errors about “Failed to open/locate matlabpool”?**

Some MATLAB configuration files are stored in a hidden directory named *.matlab* in your home directory (*/gpfs1/home/[USERNAME]*). You can check it by using the following commands:

*$ cd $HOME*

*$ ls -al*

*$ cd .matlab*

When you get errors about “*Failed to open/locate matlabpool*”, you might solve the problem by renaming the *.matlab* directory; it should be re-created automatically for you.
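Renaming rather than deleting keeps the old settings available in case you need them back. A minimal sketch of such a rename follows; the *.bak.[timestamp]* suffix and the helper name *backup_dir* are choices made here for illustration, not a CCAST convention.

```shell
#!/bin/sh
# backup_dir DIR: rename DIR to DIR.bak.<timestamp> and print the new name.
# Returns 1 without changing anything if DIR does not exist.
backup_dir() {
    if [ -d "$1" ]; then
        backup="$1.bak.$(date +%Y%m%d%H%M%S)"
        mv "$1" "$backup" && echo "$backup"
    else
        echo "nothing to rename: $1" >&2
        return 1
    fi
}

# Demo on a throwaway directory rather than a real home directory.
scratch=$(mktemp -d)
mkdir "$scratch/.matlab"
backup_dir "$scratch/.matlab"
rm -rf "$scratch"

# On Thunder, the equivalent call would be:
#   backup_dir "$HOME/.matlab"
```

If the renamed copy turns out not to be needed, it can be removed later.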

**When to use the addpath function in your MATLAB script?**

When your MATLAB script calls functions or reads input files that live in separate files, you should add their location to the MATLAB search path, as discussed in Example 5: “for-loop and parfor-loop”.