Bioinformatics - StringTie

Instructions on how to run (and, if needed, install a customized version of) StringTie

StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. 

  1. Running StringTie on Thunder
  2. Install customized StringTie on Thunder
Please refer to the CCAST User Guide and the article Running Bioinformatics Software on HPC Clusters for general information about using CCAST resources and running bioinformatics software on CCAST's HPC clusters.

1. Running StringTie on Thunder

Example: Assemble alignment results into transcripts

Location: /gpfs1/projects/ccastest/training/examples/StringTie_example

File list

· stringtie_job.pbs: job submission script  

· mir-17-92.bam: alignment results in BAM format


· Copy example directory to your SCRATCH directory

o    cp -r /gpfs1/projects/ccastest/training/examples/StringTie_example $SCRATCH

· Go to the copied directory

o    cd  $SCRATCH/StringTie_example

· Edit the job submission script as needed, then submit the job

o    qsub stringtie_job.pbs

2. Install Customized StringTie on Thunder

Warning: This part is intended ONLY for those who want to install and test their own version in their HOME directory.


(a)    The “-p” option specifies the number of processing threads (CPUs) to use for transcript assembly.

(b)    StringTie has no software dependencies.
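As a minimal sketch of note (a), the helper below picks the value to pass to “-p”. It reads the NCPUS variable that the job script in this article uses; the function name pick_threads and the fallback to 1 when NCPUS is unset are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: choose the thread count to pass to "stringtie -p".
# NCPUS is the variable the job script in this article reads.
# Falling back to 1 when it is unset or empty is an assumption.
pick_threads() {
    printf '%s\n' "${NCPUS:-1}"
}
```

Inside a job script it could be used as: stringtie mir-17-92.bam -p "$(pick_threads)" -o out.gtf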


In the following steps, we assume that you want to install the software in a directory named “SOFTWARE” inside your HOME directory on CCAST’s Thunder cluster. “USERNAME” is your username on Thunder.

(a) Install

· Go to your software directory: 

o    cd /gpfs1/home/USERNAME/SOFTWARE

· Download and unzip StringTie: 

o    wget

o    tar -xzvf stringtie-1.3.6.tar.gz

· Go to the StringTie source code directory: 

o    cd stringtie-1.3.6

· Build: (the resulting executable ‘stringtie’ is standalone and does not depend on other files) 

o    make release

· Copy the executable to its own directory for clean access: (this creates /gpfs1/home/USERNAME/SOFTWARE/stringtie_is_here)

o    cd ..

o    mkdir stringtie_is_here

o    cd stringtie_is_here

o    cp ../stringtie-1.3.6/stringtie .
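Before moving on to the test, it may help to confirm that the copy succeeded. The sketch below is one way to do that; the function name has_stringtie is hypothetical, and it only checks that an executable file named stringtie is present in the given directory.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the directory given as the first
# argument contains a regular, executable file named "stringtie".
has_stringtie() {
    local dir="$1"
    [ -f "$dir/stringtie" ] && [ -x "$dir/stringtie" ]
}
```

For example: has_stringtie /gpfs1/home/USERNAME/SOFTWARE/stringtie_is_here && echo "stringtie is in place". Running the binary with its --version option should then print the version that was built.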

(b) Test

· Go to the scratch directory: 

o    cd /gpfs1/scratch/USERNAME

· Make and go into test directory: 

o    mkdir StringTie_example

o    cd StringTie_example

· Download data file: (the input BAM file must be sorted by genomic location)

o    wget

· Write and submit the job: (by default, StringTie writes the GTF to standard output; the script below uses “-o” to write it to out.gtf)

o    qsub stringtie_job.pbs

---------------------------------------------- stringtie_job.pbs------------------------------------------------------


#PBS -q default

#PBS -N StringTie_test

##change mem, ncpus, and walltime as needed:

#PBS -l select=1:mem=10gb:ncpus=4

#PBS -l walltime=1:00:00

## Replace “x-ccast-prj” with “x-ccast-prj-[your project group name here]”

#PBS -W group_list=x-ccast-prj


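# Run from the directory the job was submitted from (where mir-17-92.bam is located)

cd "$PBS_O_WORKDIR"

# Add your StringTie binaries to $PATH

export PATH=$PATH:/gpfs1/home/USERNAME/SOFTWARE/stringtie_is_here

stringtie mir-17-92.bam -p $NCPUS -o out.gtf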

exit 0
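The two requirements mentioned in the test steps (a coordinate-sorted BAM as input, a GTF as output) can be checked with a couple of small helpers. This is a sketch only: the function names are hypothetical, and the sort check assumes that samtools is available to print the BAM header, which this article does not cover.

```shell
#!/bin/bash
# Hypothetical helpers for checking the input and output of the job.

# Reads a SAM/BAM header on standard input and succeeds if the @HD line
# declares coordinate sort order (SO:coordinate).
is_coordinate_sorted() {
    grep -q '^@HD.*SO:coordinate'
}

# Prints the number of records in a GTF file whose third column
# (the feature type) is "transcript".
count_transcripts() {
    awk -F '\t' '$3 == "transcript" { n++ } END { print n + 0 }' "$1"
}
```

For example, before submitting: samtools view -H mir-17-92.bam | is_coordinate_sorted && echo "sorted by coordinate". After the job finishes: count_transcripts out.gtf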


See Also: