Bioinformatics - Ragout
Instructions on how to run (and, if needed, install a customized version of) Ragout
Ragout is a tool for chromosome-level scaffolding using multiple references. Given initial assembly fragments (contigs/scaffolds) and one or multiple related references (complete or draft), it produces a chromosome-scale assembly (as a set of scaffolds).
Please refer to the CCAST User Guide and the article Running Bioinformatics Software on HPC Clusters for general information about using CCAST resources and running bioinformatics software on CCAST's HPC clusters.
1. Running Ragout on Thunder
Example: Assemble contigs into chromosome-level assembly
Location: /mmfs1/thunder/projects/ccastest/training/examples/Ragout_example
File list
· ragout_job.pbs: job submission script
· ecoli.rcp: configuration (recipe) file
· mg1655_contigs.fasta: contigs to be assembled
· references/DH1.fasta: reference genome
Steps
· Copy example directory to your SCRATCH directory
o "cp -r /mmfs1/thunder/projects/ccastest/training/examples/Ragout_example $SCRATCH"
· Go to the copied directory
o "cd $SCRATCH/Ragout_example"
· Edit the job submission script as needed, then submit the job
o "qsub ragout_job.pbs"
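For orientation, the relationship between the recipe file and the FASTA files in the list above can be sketched as follows. This is a minimal, hypothetical recipe: the genome labels dh1 and mg1655, the file name sketch.rcp, and the helper check_recipe_inputs are assumptions made for illustration, and the ecoli.rcp shipped with the example may differ. The sketch also assumes that relative paths in a recipe are resolved against the recipe's own directory and that paths contain no spaces.

```shell
#!/bin/sh
# Write a hypothetical recipe into a scratch directory so that nothing
# in the real example directory is overwritten.
workdir=$(mktemp -d)
cat > "$workdir/sketch.rcp" <<'EOF'
.references = dh1
.target = mg1655

dh1.fasta = references/DH1.fasta
mg1655.fasta = mg1655_contigs.fasta
EOF

# Report every FASTA path named in a recipe that does not exist.
# Returns 0 when all inputs are present, 1 otherwise.
check_recipe_inputs() {
    recipe=$1
    recipe_dir=$(dirname "$recipe")
    status=0
    # Take the right-hand side of every "<name>.fasta = <path>" line.
    for path in $(sed -n 's/^[^#]*\.fasta[[:space:]]*=[[:space:]]*//p' "$recipe"); do
        case "$path" in
            /*) full=$path ;;               # absolute path: use as given
            *)  full=$recipe_dir/$path ;;   # relative path: resolve against the recipe
        esac
        if [ ! -f "$full" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# The scratch directory holds only the recipe, so both inputs are reported missing.
if check_recipe_inputs "$workdir/sketch.rcp"; then
    echo "all FASTA inputs found"
else
    echo "some FASTA inputs are missing"
fi
```

In the copied example directory, the same check could be run as check_recipe_inputs ecoli.rcp before submitting the job.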
2. Installing a Customized Version of Ragout on Thunder
Warning: This part is intended ONLY for those who want to install and test their own version in their HOME directory.
Summary of requirements
(a) C++ compiler with C++0x support (GCC 4.6+ / Clang 3.2+ / Apple Clang 4.2+); (the system GCC is 4.8.5, so no module load is necessary)
(b) pip; (needs to be installed)
(c) Python 2.7; (CCAST default Python is 2.7.5)
(d) python-networkx 2.2+; (a Python package, needs to be installed)
(e) Sibelia; (needs to be installed; HAL Tools can be used instead)
(f) Runtime option "-t": sets the number of threads for the synteny backend (Sibelia or HAL).
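Before installing anything, it can help to check which of these requirements the current environment already meets. The following is a minimal sketch, not a CCAST-provided tool: the helper version_ge is a hypothetical name, and it assumes GNU coreutils (for "sort -V") and that the commands gcc and python on your PATH are the ones you intend to use.

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compiler: the requirement above is GCC 4.6 or newer.
gcc_version=$(gcc -dumpversion 2>/dev/null || echo 0)
if version_ge "$gcc_version" 4.6; then
    echo "gcc $gcc_version: OK"
else
    echo "gcc 4.6+ not found"
fi

# Python: the requirement above is the 2.7 series.
py_version=$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null || echo none)
if [ "$py_version" = "2.7" ]; then
    echo "python $py_version: OK"
else
    echo "python 2.7 not found (got: $py_version)"
fi

# networkx: the requirement above is version 2.2 or newer.
nx_version=$(python -c 'import networkx; print(networkx.__version__)' 2>/dev/null || echo 0)
if version_ge "$nx_version" 2.2; then
    echo "networkx $nx_version: OK"
else
    echo "networkx 2.2+ not installed yet"
fi
```

Items reported as missing correspond to the installation steps described under Details.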
Details
In the following steps, we assume that you want to install the software in a directory named "SOFTWARE" inside your HOME directory on CCAST's Thunder cluster. "USERNAME" is your username on Thunder.
(a) Install pip
· Go to the SOFTWARE directory:
o "cd /mmfs1/home/USERNAME/SOFTWARE"
· Set the environment variable PYTHONUSERBASE, which the '--user' option uses as the installation prefix:
o "echo 'export PYTHONUSERBASE=/mmfs1/home/USERNAME/SOFTWARE/my_python_pkg' >> /mmfs1/home/USERNAME/.bashrc"
· Reload the settings:
o "source /mmfs1/home/USERNAME/.bashrc"
· Install pip (the bootstrap script get-pip.py must already be present in the current directory):
o "python get-pip.py --user"
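With the '--user' option, scripts such as pip are placed in the bin directory under PYTHONUSERBASE, which is why later steps call "$PYTHONUSERBASE/bin/pip". The sketch below shows one way to make that directory available on PATH. The helper names add_to_path and fetch_get_pip are hypothetical, and the download URL is an assumption about where the pip project publishes its Python 2.7 bootstrap script; verify it before use.

```shell
#!/bin/sh
# Hypothetical prefix: falls back to a directory under HOME when
# PYTHONUSERBASE has not been set yet.
export PYTHONUSERBASE="${PYTHONUSERBASE:-$HOME/SOFTWARE/my_python_pkg}"

# Prepend a directory to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;
        *) PATH="$1:$PATH" ;;
    esac
}

add_to_path "$PYTHONUSERBASE/bin"
export PATH

# Download the bootstrap script (defined here, not run automatically).
# Assumption: the Python 2.7 specific get-pip.py is published at this URL.
fetch_get_pip() {
    curl -fsSL -o get-pip.py https://bootstrap.pypa.io/pip/2.7/get-pip.py
}

echo "user base: $PYTHONUSERBASE"
```

Adding the same PATH line to .bashrc would make pip callable by name in later sessions; calling it by its full path, as the steps here do, works without that change.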
(b) Install Ragout
· Clone the Ragout repository:
o "git clone https://github.com/fenderglass/Ragout.git"
· Go to the Ragout directory:
o "cd Ragout"
· Build Ragout:
o "python setup.py build"
(c) Install python-networkx using pip
· "$PYTHONUSERBASE/bin/pip install -r requirements.txt --user"
(d) Install Sibelia using the provided script
· "python scripts/install-sibelia.py"
(e) Test
There are four built-in examples in "/mmfs1/home/USERNAME/SOFTWARE/Ragout/examples". Here we use the "E.Coli" example as a test.
· Copy the files from the built-in example directory to scratch:
o "cd /mmfs1/thunder/scratch/USERNAME"
o "mkdir Ragout_test"
o "cd Ragout_test"
o "cp -r /mmfs1/home/USERNAME/SOFTWARE/Ragout/examples/E.Coli ."
· Write the job script (shown below) in the directory that contains ecoli.rcp, then submit it from that directory:
o "qsub ragout_job.pbs"
------------------------------------------- file ragout_job.pbs -------------------------------------------
#!/bin/bash
#PBS -q default
#PBS -N Ragout_Test
#PBS -l select=1:mem=10gb:ncpus=4
#PBS -l walltime=02:00:00
## Replace “x-ccast-prj” with “x-ccast-prj-[your project group name here]”
#PBS -W group_list=x-ccast-prj
cd $PBS_O_WORKDIR
# Set path of your Ragout Binaries
export MY_RAGOUT=/mmfs1/home/USERNAME/SOFTWARE/Ragout/bin
$MY_RAGOUT/ragout ecoli.rcp --outdir ./out --refine -t $NCPUS
exit 0
File "ecoli.rcp" is the recipe file provided by the user. For the recipe format and the available options, refer to the Ragout usage documentation: "https://github.com/fenderglass/Ragout/blob/master/docs/USAGE.md"
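Once the job has finished, a quick sanity check is to count the sequences in the output directory given by "--outdir ./out". The following is a minimal sketch: the helper count_fasta_records is a hypothetical name, and the output file name patterns are an assumption about Ragout's naming scheme, so adjust them to whatever appears in your own output directory.

```shell
#!/bin/sh
# Count FASTA records by counting header lines, which start with ">".
count_fasta_records() {
    grep -c '^>' "$1"
}

# Assumption: Ragout writes files named <target>_scaffolds.fasta and
# <target>_unplaced.fasta into the directory given by --outdir.
for f in out/*_scaffolds.fasta out/*_unplaced.fasta; do
    [ -f "$f" ] || continue   # skip patterns that matched nothing
    echo "$f: $(count_fasta_records "$f") sequences"
done
```

A scaffold count close to the expected number of chromosomes, together with few unplaced fragments, suggests that the run completed as intended.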