Bioinformatics - Velvet

Instructions on how to run (and, if needed, install a customized version of) Velvet

Velvet is a sequence assembler for very short reads.

  1. Running Velvet on Thunder
  2. Install Customized Velvet on Thunder

Please refer to the CCAST User Guide and the article Running Bioinformatics Software on HPC Clusters for general information about using CCAST resources and running bioinformatics software on CCAST's HPC clusters.

1. Running Velvet on Thunder


Example: Run Velvet on a small set of short reads in FASTA format


Location: /gpfs1/projects/ccastest/training/examples/Velvet_example


File list

· velvet_job.pbs: job submission script   

· test_reads.fa: sequences in fasta format

 

Steps

· Copy example directory to your SCRATCH directory

o    cp -r /gpfs1/projects/ccastest/training/examples/Velvet_example $SCRATCH

· Go to the copied directory

o    cd  $SCRATCH/Velvet_example

· Edit the job submission script as needed, then submit the job

o    qsub velvet_job.pbs
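The three steps above can be wrapped in one small function, which may be handy when the example is staged repeatedly. This is a minimal sketch, not a CCAST-provided tool: the function name stage_and_submit and the DRY_RUN switch are hypothetical and exist only for illustration, and the sketch assumes SCRATCH is already defined in your shell, as the steps above imply.

```shell
#!/bin/bash
# Sketch of the copy / cd / qsub steps as a single function.
# DRY_RUN is a hypothetical switch added for illustration; it is not a PBS feature.
EXAMPLE_SRC=/gpfs1/projects/ccastest/training/examples/Velvet_example

stage_and_submit() {
    # Stop early if the scratch location is not defined in this shell.
    if [ -z "${SCRATCH:-}" ]; then
        echo "SCRATCH is not set" >&2
        return 1
    fi
    local dest="$SCRATCH/$(basename "$EXAMPLE_SRC")"
    # Avoid silently mixing a fresh copy into an older one.
    if [ -e "$dest" ]; then
        echo "$dest already exists; remove or rename it first" >&2
        return 1
    fi
    cp -r "$EXAMPLE_SRC" "$SCRATCH" || return 1
    cd "$dest" || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the submit command instead of running it.
        echo "would run: qsub velvet_job.pbs"
    else
        qsub velvet_job.pbs
    fi
}
```

Calling "DRY_RUN=1 stage_and_submit" copies the directory and prints the submit command without contacting the scheduler; calling "stage_and_submit" alone submits the job.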


2. Install Customized Velvet on Thunder

Warning: This part is intended ONLY for those who want to install and test their own version in their HOME directory.

Summary

(a) Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454.

(b) Only parts of the Velvet algorithm use multithreading (OpenMP), so do not expect run time to decrease linearly as CPUs are added. Set the number of threads with "export OMP_NUM_THREADS=$NCPUS".
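The thread setting in (b) can be made slightly more defensive. The sketch below assumes, as the text above does, that the scheduler exports NCPUS inside a job; the fallback to a single thread when NCPUS is missing is an assumption made for this example, not a CCAST recommendation.

```shell
# Use the CPU count granted to the job when available.
# Falling back to 1 thread outside a job is an assumption of this sketch.
export OMP_NUM_THREADS="${NCPUS:-1}"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Placing these lines in the job script before the Velvet commands keeps the thread count in step with the "ncpus" value requested from PBS.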


Details


(a) Install

·      Go to the SOFTWARE directory: 

o    "cd /gpfs1/home/USERNAME/SOFTWARE" 

·      Clone the Velvet repository with Git: 

o    "git clone https://github.com/dzerbino/velvet.git" 

·      Build with the OPENMP option:

o    "cd velvet"

o    "make 'OPENMP=1'"
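After the build finishes, it can be worth confirming that both Velvet executables, velveth and velvetg, are present in the build directory before writing a job script around them. The helper below is a minimal sketch; the function name check_velvet_build is hypothetical, and the sketch assumes the executables are placed in the top-level velvet directory, which matches the path used in the job script later in this article.

```shell
# Sketch: confirm that the build produced both Velvet executables.
# check_velvet_build is a hypothetical helper name used for illustration.
check_velvet_build() {
    local dir="$1" missing=0 exe
    for exe in velveth velvetg; do
        if [ -x "$dir/$exe" ]; then
            echo "found $dir/$exe"
        else
            echo "missing $dir/$exe" >&2
            missing=1
        fi
    done
    # Non-zero return status signals that at least one executable is absent.
    return "$missing"
}
```

For example, "check_velvet_build /gpfs1/home/USERNAME/SOFTWARE/velvet" reports each executable and returns a non-zero status if either one is missing.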

(b) Test

·      Make a test directory and go into it: 

o    "cd /gpfs1/scratch/USERNAME"

o    "mkdir Velvet_example"

o    "cd Velvet_example"

·      Download the test data to current location:  

o    "wget http://rcs.bu.edu/examples/bioinformatics/velvet/test_reads.fa"

·      Write the job script velvet_job.pbs (listed below), then submit the job: 

o    "qsub velvet_job.pbs"
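Before submitting, a quick sanity check on the downloaded reads can catch a truncated or empty download. The sketch below counts FASTA records by counting header lines; the function name count_fasta_records is hypothetical, and no particular record count is assumed for test_reads.fa.

```shell
# Sketch: count FASTA records in a file.
# Each FASTA record starts with a header line beginning with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}
```

Running "count_fasta_records test_reads.fa" in the test directory prints the number of sequences; a result of 0 suggests the download did not succeed.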

------------------------------------------- file velvet_job.pbs -------------------------------------------

#!/bin/bash

#PBS -q default

#PBS -N test

##Velvet does not run across multiple nodes; always set "select=1"

##change mem, ncpus, and walltime as needed:

#PBS -l select=1:mem=10gb:ncpus=4

#PBS -l walltime=02:00:00

##change "x-ccast-prj" to "x-ccast-prj-[your project group name]"

#PBS -W group_list=x-ccast-prj

 

#set path to your Velvet binaries as $MY_VELVET

export MY_VELVET=/gpfs1/home/USERNAME/SOFTWARE/velvet

 

cd $PBS_O_WORKDIR

 

$MY_VELVET/velveth out_dir 21 -fasta -short test_reads.fa

 

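exit 0

------------------------------------------- end of file velvet_job.pbs -------------------------------------------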
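The job script above runs only velveth, which hashes the reads. A complete assembly also needs velvetg, which builds the graph from the velveth output directory and writes the contigs there. The function below is a hedged sketch of how the two steps, together with the OpenMP thread setting from the Summary, could be combined; the function name run_velvet and its argument order are hypothetical, and the single-thread fallback is an assumption of this example.

```shell
# Sketch of commands that could follow "cd $PBS_O_WORKDIR" in velvet_job.pbs.
# Assumptions: NCPUS is exported by the scheduler, and velvetg was built
# in the same directory as velveth.
run_velvet() {
    local velvet_dir="$1" out_dir="$2" kmer="$3" reads="$4"
    # Match the thread count to the CPUs requested from PBS (1 if unknown).
    export OMP_NUM_THREADS="${NCPUS:-1}"
    # Step 1: hash the reads (same call as in the job script above).
    "$velvet_dir/velveth" "$out_dir" "$kmer" -fasta -short "$reads" || return 1
    # Step 2: build the graph and write contigs into the same directory.
    "$velvet_dir/velvetg" "$out_dir" || return 1
}
```

Inside the job script this would be called as: run_velvet "$MY_VELVET" out_dir 21 test_reads.fa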

 
