Bioinformatics - Ragout

Instructions on how to run (and, if needed, install a customized version of) Ragout

Ragout is a tool for chromosome-level scaffolding using multiple references. Given initial assembly fragments (contigs/scaffolds) and one or more related references (complete or draft), it produces a chromosome-scale assembly as a set of scaffolds.

  1. Running Ragout on Thunder
  2. Install customized Ragout on Thunder
Please refer to the CCAST User Guide and the article Running Bioinformatics Software on HPC Clusters for general information about using CCAST resources and running bioinformatics software on CCAST's HPC clusters.

1. Running Ragout on Thunder


Example: Assemble contigs into chromosome-level assembly


Location: /gpfs1/projects/ccastest/training/examples/Ragout_example


File list

· ragout_job.pbs: job submission script  

· ecoli.rcp: configuration (recipe) file

· mg1655_contigs.fasta: contigs to be assembled

· references/DH1.fasta: reference genome


Steps

· Copy example directory to your SCRATCH directory

o    cp -r /gpfs1/projects/ccastest/training/examples/Ragout_example $SCRATCH

· Go to the copied directory

o    cd  $SCRATCH/Ragout_example

· Edit the job submission script as needed, then submit the job

o    qsub ragout_job.pbs
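
Once the job has finished, one quick sanity check is to compare the number of sequences in the input contigs with the number in the resulting scaffolds. The following is a minimal sketch, not part of the example directory: the helper count_fasta_records is hypothetical, and both the output directory "out" and the scaffold file name are assumptions, so list the output directory first to see what Ragout actually wrote.

```shell
#!/bin/bash
# Count FASTA records (header lines starting with '>') in a file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Input contigs come from the example file list; the scaffold path below is
# an assumed name and should be verified with: ls out
contigs="mg1655_contigs.fasta"
scaffolds="out/mg1655_scaffolds.fasta"

for f in "$contigs" "$scaffolds"; do
    if [ -f "$f" ]; then
        echo "$f: $(count_fasta_records "$f") sequences"
    else
        echo "$f: not found"
    fi
done
```

A successful scaffolding run would normally report far fewer sequences in the scaffolds than in the contigs.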


2. Install Customized Ragout on Thunder

Warning: This part is intended ONLY for those who want to install and test their own version in their HOME directory.

Summary

(a)    C++ compiler with C++0x support (GCC 4.6+ / Clang 3.2+ / Apple Clang 4.2+). The system GCC is 4.8.5, so no module load is necessary.

(b)    pip (needs to be installed).

(c)    Python 2.7 (the CCAST default Python is 2.7.5).

(d)    python-networkx 2.2+ (a Python package; needs to be installed).

(e)    Sibelia (needs to be installed; HAL Tools can be used instead).

(f)    When running Ragout, the option "-t" sets the number of threads for the synteny backend (Sibelia or HAL).
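
Before installing, it can help to see which of these tools are already available in your session. The sketch below is only an illustration: check_tool is a hypothetical helper, and the tool list is an assumption based on the requirements above plus git, which is used later to clone Ragout.

```shell
#!/bin/bash
# Report whether a command is on PATH, and where.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: missing"
    fi
}

for tool in gcc python pip git; do
    check_tool "$tool"
done

# Print versions only for tools that are present.
if command -v gcc >/dev/null 2>&1; then
    gcc --version | head -n 1
fi
if command -v python >/dev/null 2>&1; then
    python --version 2>&1   # Python 2 prints its version to stderr
fi
```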

Details

In the following steps, we assume that you want to install the software in a directory named “SOFTWARE” inside your HOME directory on CCAST’s Thunder cluster. “USERNAME” is your username on Thunder.


(a) Install pip

· Go to the SOFTWARE directory: 

o    "cd /gpfs1/home/USERNAME/SOFTWARE" 

· Set the environment variable PYTHONUSERBASE, which is the location that the '--user' option installs packages into:

o    "echo 'export PYTHONUSERBASE=/gpfs1/home/USERNAME/SOFTWARE/my_python_pkg' >> /gpfs1/home/USERNAME/.bashrc"  

· Reload the settings: 

o    "source /gpfs1/home/USERNAME/.bashrc"

· Install pip (this assumes the get-pip.py script has already been downloaded into the current directory):

o    "python get-pip.py --user" 
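
With PYTHONUSERBASE set, the '--user' option places executables such as pip under $PYTHONUSERBASE/bin. The later steps call pip by its full path, but as an optional convenience that directory can be added to PATH. The sketch below assumes the directory layout used above; prepend_path is a hypothetical helper, and the fallback location is only a placeholder for the case where PYTHONUSERBASE is not yet set.

```shell
#!/bin/bash
# Fall back to a placeholder location if PYTHONUSERBASE is not set yet.
: "${PYTHONUSERBASE:=$HOME/SOFTWARE/my_python_pkg}"
export PYTHONUSERBASE

# Prepend a directory to PATH only if it is not already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path "$PYTHONUSERBASE/bin"
export PATH

# If pip was installed with --user, this should print its location.
command -v pip || echo "pip not found on PATH"
```

Because the helper checks for an existing entry first, sourcing it repeatedly does not keep growing PATH.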


(b) Install Ragout

· Git clone Ragout:

o    "git clone https://github.com/fenderglass/Ragout.git" 

· Go to the Ragout directory:  

o    "cd Ragout"

· Build Ragout:

o    "python setup.py build"


(c) Install python-networkx using pip

· "$PYTHONUSERBASE/bin/pip install -r requirements.txt --user"


(d) Install Sibelia using the provided script

· "python scripts/install-sibelia.py"


(e) Test

There are four built-in examples in "/gpfs1/home/USERNAME/SOFTWARE/Ragout/examples". Here we use "E.Coli" as the test example.

· Copy the files from the built-in test directory to scratch:

o    "cd /gpfs1/scratch/USERNAME"

o    "mkdir Ragout_test"

o    "cd Ragout_test"

o    "cp -r /gpfs1/home/USERNAME/SOFTWARE/Ragout/examples/E.Coli ."

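· Write the job script ragout_job.pbs (shown below) in the directory that contains ecoli.rcp, then submit it from that directory: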

o    "qsub ragout_job.pbs"

------------------------------------------- file ragout_job.pbs -------------------------------------------

#!/bin/bash

#PBS -q default

#PBS -N Ragout_Test

#PBS -l select=1:mem=10gb:ncpus=4

#PBS -l walltime=02:00:00

## Replace “x-ccast-prj” with “x-ccast-prj-[your project group name here]”

#PBS -W group_list=x-ccast-prj

cd $PBS_O_WORKDIR

# Set path of your Ragout Binaries

export MY_RAGOUT=/gpfs1/home/USERNAME/SOFTWARE/Ragout/bin

$MY_RAGOUT/ragout ecoli.rcp --outdir ./out --refine -t $NCPUS

exit 0
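
The Ragout call in the script above can be made a little more defensive. The following is a minimal sketch under stated assumptions, not a replacement for the job script: pick_threads is a hypothetical helper that falls back to one thread when the scheduler has not set NCPUS, and the default value of MY_RAGOUT is a placeholder for your own installation path.

```shell
#!/bin/bash
# Use the CPU count provided by the scheduler, or 1 if NCPUS is unset/empty.
pick_threads() {
    echo "${NCPUS:-1}"
}

# MY_RAGOUT mirrors the variable in the job script; the default is a placeholder.
MY_RAGOUT="${MY_RAGOUT:-$HOME/SOFTWARE/Ragout/bin}"
RECIPE="ecoli.rcp"
THREADS="$(pick_threads)"

# Check that the binary and the recipe exist before starting the run.
ready=yes
if [ ! -x "$MY_RAGOUT/ragout" ]; then
    echo "ragout is not executable in $MY_RAGOUT" >&2
    ready=no
fi
if [ ! -f "$RECIPE" ]; then
    echo "recipe $RECIPE not found in $(pwd)" >&2
    ready=no
fi

if [ "$ready" = yes ]; then
    "$MY_RAGOUT/ragout" "$RECIPE" --outdir ./out --refine -t "$THREADS"
else
    echo "Skipping Ragout run; fix the issues above first." >&2
fi
```

Checking the inputs up front makes a misplaced recipe or a wrong installation path show up as a clear message in the job's error file rather than as a failure inside Ragout.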

 

The file "ecoli.rcp" is the recipe file provided by the user. For the recipe format and the command-line options, please refer to the Ragout usage documentation: https://github.com/fenderglass/Ragout/blob/master/docs/USAGE.md
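
To give a rough idea of what a recipe looks like, the sketch below writes an illustrative one to a separate file named sketch.rcp so that the real ecoli.rcp is not overwritten. It follows the key/value layout described in the Ragout usage documentation; the genome labels "dh1" and "mg1655" are assumptions chosen to match the example file names, and the ecoli.rcp shipped with the example may differ.

```shell
#!/bin/bash
# Write an illustrative recipe; labels "dh1" and "mg1655" are assumed names.
cat > sketch.rcp <<'EOF'
# labels of the reference genomes, comma-separated
.references = dh1
# label of the assembly to be scaffolded
.target = mg1655

# paths to the FASTA file of each genome
dh1.fasta = references/DH1.fasta
mg1655.fasta = mg1655_contigs.fasta
EOF

# Show the result for inspection.
cat sketch.rcp
```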

 
