
Bioinformatics - SOAPdenovo2

Instructions on how to run (and, if needed, install a customized version of) SOAPdenovo2

SOAPdenovo2 is a short-read assembly method that can build a de novo draft assembly for human-sized genomes. The program is specifically designed to assemble Illumina GA short reads.

  1. Running SOAPdenovo2 on Thunder
  2. Install customized SOAPdenovo2 on Thunder
Please refer to the CCAST User Guide and the article Running Bioinformatics Software on HPC Clusters for general information about using CCAST resources and running bioinformatics software on CCAST's HPC clusters.

1. Running SOAPdenovo2 on Thunder

Example: Assemble short reads into genomes

Location: /gpfs1/projects/ccastest/training/examples/SOAPdenovo2_example

File list

· soapdenovo2_job.pbs: job submission script  

· config.txt: configuration file

· frag_1.cor.fastq: paired-end reads in fastq format

· frag_2.cor.fastq: paired-end reads in fastq format
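
Before assembling, it can be worth confirming that the two paired-end files hold the same number of reads. The following is a minimal sketch, assuming the common FASTQ layout of exactly four lines per read; the function names count_reads and check_pairs are hypothetical helpers, not part of SOAPdenovo2 or of the example directory.

```shell
#!/bin/bash
# Count reads in a FASTQ file, assuming 4 lines per record (an assumption;
# multi-line FASTQ records would need a real parser).
count_reads() {
    local lines
    lines=$(wc -l < "$1") || return 1
    echo $(( lines / 4 ))
}

# Return 0 if both mate files hold the same number of reads, 1 otherwise.
check_pairs() {
    local n1 n2
    n1=$(count_reads "$1") || return 1
    n2=$(count_reads "$2") || return 1
    echo "$1: $n1 reads; $2: $n2 reads"
    [ "$n1" -eq "$n2" ]
}

# Usage (run inside the example directory):
#   check_pairs frag_1.cor.fastq frag_2.cor.fastq
```

A non-zero exit status from check_pairs suggests that one file is truncated or that the two files do not belong to the same library.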


· Copy example directory to your SCRATCH directory

o    cp -r /gpfs1/projects/ccastest/training/examples/SOAPdenovo2_example $SCRATCH

· Go to the copied directory

o    cd $SCRATCH/SOAPdenovo2_example

· Edit the job submission script as needed, then submit the job

o    qsub soapdenovo2_job.pbs

2. Install Customized SOAPdenovo2 on Thunder

Warning: This part is intended ONLY for those who want to install and test their own version in their HOME directory.


(a)    "-p" option: number of CPUs (threads) to use; 8 by default.

(b)    No software dependencies


In the following steps, we assume that you want to install the software in a directory named “SOFTWARE” inside your HOME directory on CCAST’s Thunder cluster. “USERNAME” is your username on Thunder.

(a) Install

·       Go to the SOFTWARE directory: 

o    "cd /gpfs1/home/USERNAME/SOFTWARE"

·       Git clone the SOAPdenovo2: 

o    "git clone

·       Build:

o    "cd SOAPdenovo2"

o    "make"
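
After "make" finishes, a quick check that the executables were actually produced can save a failed job later. The sketch below is hedged: SOAPdenovo-63mer is the binary used by the job script in this article, while SOAPdenovo-127mer is an assumption based on what the upstream Makefile normally builds. The function name check_build is a hypothetical helper.

```shell
#!/bin/bash
# Report which SOAPdenovo2 executables exist in a build directory.
# Returns 0 only if every expected binary is present and executable.
# The list of expected names is an assumption; edit it if your build differs.
check_build() {
    local dir=$1 status=0 bin
    for bin in SOAPdenovo-63mer SOAPdenovo-127mer; do
        if [ -x "$dir/$bin" ]; then
            echo "found:   $dir/$bin"
        else
            echo "missing: $dir/$bin"
            status=1
        fi
    done
    return $status
}

# Usage, with the install location assumed in this article:
#   check_build /gpfs1/home/USERNAME/SOFTWARE/SOAPdenovo2
```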

(b) Test

·       Make a test directory and go into it: 

o    "cd /gpfs1/scratch/USERNAME"

o    "mkdir SOAPdenovo2_example"

o    "cd SOAPdenovo2_example"

·       Download the test data to current location:  

o    "wget{1,2}.cor.fastq

·       Write the config file: 

------------------------------------------- file config.txt -------------------------------------------
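
As a rough illustration of what such a file can look like, the sketch below writes a minimal single-library config.txt using keys documented in the SOAPdenovo2 README (max_rd_len, avg_ins, reverse_seq, asm_flags, rank, q1, q2). The numeric values are placeholder assumptions, not the values shipped with the CCAST example; adjust them to the read length and insert size of your own library.

```shell
#!/bin/bash
# Write a minimal SOAPdenovo2 config for one paired-end library.
# All numeric values below are placeholders (assumptions), not measured values.
cat > config.txt <<'EOF'
# maximal read length
max_rd_len=100
[LIB]
# average insert size of the library
avg_ins=180
# 0 = forward-reverse orientation (typical for short-insert paired-end libraries)
reverse_seq=0
# 3 = use these reads for both contig and scaffold assembly
asm_flags=3
# order in which libraries are used for scaffolding
rank=1
# paired-end FASTQ files, read 1 and read 2
q1=frag_1.cor.fastq
q2=frag_2.cor.fastq
EOF
```

The q1 and q2 paths are relative, so the assembler has to be started from the directory that contains both the config file and the FASTQ files.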








·       Write the job script soapdenovo2_job.pbs (contents below) and submit it:

o    "qsub soapdenovo2_job.pbs"

------------------------------------------- file soapdenovo2_job.pbs -------------------------------------------


#PBS -q default

#PBS -N test

## SOAPdenovo2 does not run across multiple nodes, so keep select=1

##change mem, ncpus, and walltime as needed:

#PBS -l select=1:mem=10gb:ncpus=4

#PBS -l walltime=1:00:00

## Replace “x-ccast-prj” with “x-ccast-prj-[your project group name here]”

#PBS -W group_list=x-ccast-prj

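
# Run from the directory the job was submitted from (where config.txt is)

cd $PBS_O_WORKDIR

# Set path of your SOAPdenovo2 binaries (the directory where you ran "make")

MY_SOAPDENOVO2=/gpfs1/home/USERNAME/SOFTWARE/SOAPdenovo2

$MY_SOAPDENOVO2/SOAPdenovo-63mer all -K 31 -p $NCPUS -s config.txt -o output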


exit 0
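
Once the job has finished, the assembly results are written with the prefix given to "-o" ("output" in the script above). The following sketch checks for the contig and scaffold FASTA files and counts their sequences. The file suffixes .contig and .scafSeq follow the SOAPdenovo2 documentation; the function name check_assembly is a hypothetical helper.

```shell
#!/bin/bash
# Check that the main assembly files exist and are non-empty.
# The argument is the prefix passed to SOAPdenovo2 with -o (default: "output").
check_assembly() {
    local prefix=${1:-output} status=0 f
    for f in "$prefix.contig" "$prefix.scafSeq"; do
        if [ -s "$f" ]; then
            # Count FASTA records by counting header lines.
            echo "$f: $(grep -c '^>' "$f") sequences"
        else
            echo "$f: missing or empty"
            status=1
        fi
    done
    return $status
}

# Usage (run inside the job's working directory):
#   check_assembly output
```

If either file is reported as missing or empty, the PBS output and error files of the job are the first place to look.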


Keywords: ccast, hpc, thunder, bioinformatics, soapdenovo2
Doc ID: 108081
Owner: Liu Y.
Group: IT Knowledge Base
Created: 2020-12-25 11:54 CDT
Updated: 2020-12-29 02:10 CDT
Sites: IT Knowledge Base