Essential information for HPC users who run bioinformatics software.
As noted in the CCAST User Guide, “To be able to run your jobs and run them efficiently, you need to have some basic knowledge of the application you are using. This includes whether the application is serial (i.e., runs on only one core) or parallel (i.e., can run on multiple cores). If it is parallel, what is the underlying parallel programming model: shared-memory (e.g., using OpenMP, Pthreads, etc.), distributed-memory (e.g., using MPI), or hybrid? You need such information to determine how you would like to request resources for your jobs.”
Please carefully read the documentation for the application you want to run, as well as the installation, running, and testing notes provided in the articles in this series (click on the name of the software to read its article).
Example jobs for bioinformatics applications are available in /gpfs1/projects/ccastest/training/examples on Thunder. The best way to get started is to copy an example job from that directory to your SCRATCH directory, modify the PBS job script as needed, and test run it a few times to become familiar with running jobs on Thunder before running your own jobs. Note that the requested resources (ncpus, mem, etc.) in the example PBS scripts are not optimized for your jobs. Also, as a reminder, do NOT run jobs on the login node and do NOT run jobs from your HOME directory.
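The copy step described above can be sketched as a small shell function. This is only an illustration: the function name copy_example is hypothetical, and it assumes you pass the examples directory, the name of one example, and the path of your own scratch directory as arguments.

```shell
# Copy one example job directory into a working directory.
# Arguments (all supplied by the caller):
#   $1  directory that holds the example jobs
#   $2  name of the example to copy (a subdirectory of $1)
#   $3  destination directory, e.g. your scratch directory
copy_example() {
    ce_src="$1/$2"
    ce_dest="$3"

    # Refuse to continue if the requested example does not exist.
    if [ ! -d "$ce_src" ]; then
        echo "No example named '$2' in $1" >&2
        return 1
    fi

    # Do not overwrite an earlier copy that may contain your edits.
    if [ -e "$ce_dest/$2" ]; then
        echo "$ce_dest/$2 already exists; not overwriting" >&2
        return 1
    fi

    mkdir -p "$ce_dest" && cp -r "$ce_src" "$ce_dest/"
}

# Hypothetical usage on Thunder (replace the placeholders with real values):
# copy_example /gpfs1/projects/ccastest/training/examples <example name> <your scratch directory>
```

After copying, change into the new directory, adjust the PBS job script, and submit the job from there rather than from your HOME directory.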
The software tools listed below have been installed on Thunder for all users. This means you do NOT need to install them yourself. If you want to install and test your own version of a software tool in your HOME or PROJECTS directory, you can do so by following the instructions in the corresponding article. We encourage you to become familiar with such tasks, since it is very likely that your research will require software that is not currently available on the Thunder cluster, or a newer version of a tool that is already installed.
The articles also provide you with essential information about the applications. Look for keywords such as “MPI”, “threads”, etc. as they indicate whether the software is parallel and thus can run on multiple cores. Such information will help you decide how to request resources for your jobs.
 To use a certain application on Thunder, load appropriate environment variables by executing the command “module load <module name>” (usually within a PBS job script); e.g., “module load ABySS/2.0.2-gcc”. Be mindful that Linux commands are case-sensitive.
 Example jobs for many applications are available on Thunder in the following directory: /gpfs1/projects/ccastest/training/examples
 The running, installation, and testing notes are available in each program-specific Knowledge Base article.
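To show where the “module load” command fits, the following sketch writes a minimal PBS job script to a file. Only the module name ABySS/2.0.2-gcc comes from this article. The job name, resource values, walltime, and input file names are placeholders, the abyss-pe parameters are assumed from the ABySS documentation, and any queue directive your account needs is omitted. Treat the example jobs on Thunder as the authoritative templates.

```shell
# Write a small PBS job script into the current directory.
# The quoted 'EOF' keeps variables such as $PBS_O_WORKDIR unexpanded
# until the job actually runs.
cat > abyss_test.pbs <<'EOF'
#!/bin/bash
#PBS -N abyss_test
#PBS -l select=1:ncpus=4:mem=8gb
#PBS -l walltime=01:00:00

# Start in the directory the job was submitted from
# (this should be under your scratch directory, not HOME).
cd "$PBS_O_WORKDIR"

# Load the application environment; module names are case-sensitive.
module load ABySS/2.0.2-gcc

# Placeholder run: j (threads) is set to match ncpus requested above,
# and the two FASTQ file names are hypothetical.
abyss-pe name=test k=25 j=4 in='reads1.fastq reads2.fastq'
EOF

# Submit from a login shell on the cluster (not executed here):
# qsub abyss_test.pbs
```

The thread count given to the application should match the ncpus value in the resource request, so the job neither oversubscribes its cores nor leaves requested cores idle.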