
Bioinformatics - FSA

Instructions on how to run (and, if needed, install a customized version of) FSA

FSA is a probabilistic multiple sequence alignment algorithm which uses a "distance-based" approach to aligning homologous protein, RNA or DNA sequences.

  1. Running FSA on Thunder
  2. Install customized FSA on Thunder

Please refer to the CCAST User Guide and the article Running Bioinformatics Software on HPC Clusters for general information about using CCAST resources and running bioinformatics software on CCAST's HPC clusters.

1. Running FSA on Thunder

Location: /gpfs1/projects/ccastest/training/examples/FSA_example

Example: Conduct a multiple alignment on a set of sequences

File list:

· fsa_job.pbs: job submission script  

· tRNA.aln1.fasta: a set of tRNA sequences in fasta format


Steps:

· Copy the example directory to your SCRATCH directory

o   cp -r /gpfs1/projects/ccastest/training/examples/FSA_example $SCRATCH

· Go to the copied directory

o   cd  $SCRATCH/FSA_example

· Edit the job submission script as needed, then submit the job

o    qsub fsa_job.pbs
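
The copy-and-change-directory steps above can also be collected into a small shell helper. This is a minimal sketch, not part of the CCAST example: the function name stage_example is hypothetical, and it assumes you pass it the example location and your scratch directory (for instance "$SCRATCH").

```shell
# Hypothetical helper: copy an example directory into a scratch area
# and print the path of the copy. Refuses to overwrite an existing copy.
stage_example() {
    local src="$1"      # e.g. /gpfs1/projects/ccastest/training/examples/FSA_example
    local scratch="$2"  # e.g. "$SCRATCH"
    local dest

    dest="$scratch/$(basename "$src")"

    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "destination already exists: $dest" >&2
        return 1
    fi
    # Copy the whole directory, then report where it landed.
    cp -r "$src" "$scratch"/ && echo "$dest"
}
```

One possible use: dest=$(stage_example /gpfs1/projects/ccastest/training/examples/FSA_example "$SCRATCH") && cd "$dest", after which the job script can be edited and submitted with qsub as described above.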


2. Install Customized FSA on Thunder

Warning: This part is intended ONLY for those who want to install and test their own version in their HOME directory.


Summary

(a) FSA uses a "fixed-size chunking" strategy for parallelization. No extra options are needed in the commands.

(b) Passing the "--with-mummer" or "--with-exonerate" option to ./configure before compilation enables the alignment of long sequences. This requires MUMmer or exonerate to be installed first.
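
Because option (b) depends on an external program, it can help to confirm that the program is reachable before running ./configure. The following is a minimal sketch under stated assumptions: the function name require_tools is hypothetical, and the executable names "mummer" and "exonerate" are assumed to match how those packages are installed on your system.

```shell
# Report whether each named program can be found on PATH.
# Returns non-zero if any of them is missing.
require_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, running require_tools mummer before configuring with "--with-mummer" shows whether the dependency is visible in the current environment.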


Details

In the following steps, we assume that you want to install the software in a directory named “SOFTWARE” inside your HOME directory on CCAST’s Thunder cluster. “USERNAME” is your username on Thunder.


(a) Install

· Go to the SOFTWARE directory: 

o    "cd /gpfs1/home/USERNAME/SOFTWARE" 

· Download FSA (see http://fsa.sourceforge.net), extract the archive, and go to the extracted directory:

o    "wget https://sourceforge.net/projects/fsa/files/fsa-1.15.9.tar.gz" 

o    "tar -xzvf fsa-1.15.9.tar.gz"  

o    "cd fsa-1.15.9"

· Configure and specify the install location ("--with-mummer" is optional and requires MUMmer to be installed; see the Summary above):

o    "./configure --prefix=/gpfs1/home/USERNAME/SOFTWARE/fsa_install_here --with-mummer"

· Build and install:

o    "make && make install"
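
After "make install" finishes, a quick check that the binary landed where the job script expects it can save a failed job. This is a minimal sketch: the function name fsa_installed is hypothetical, and the bin/fsa location under the prefix is taken from the job script in this article.

```shell
# Check that an install prefix contains an executable bin/fsa.
# The prefix is whatever was passed to ./configure --prefix=...
fsa_installed() {
    local prefix="$1"
    if [ -x "$prefix/bin/fsa" ] && [ ! -d "$prefix/bin/fsa" ]; then
        echo "ok: $prefix/bin/fsa"
    else
        echo "not found or not executable: $prefix/bin/fsa" >&2
        return 1
    fi
}
```

For example: fsa_installed /gpfs1/home/USERNAME/SOFTWARE/fsa_install_here (with USERNAME replaced by your username).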


(b) Test

· Make a test directory and go into it: 

o    "cd /gpfs1/scratch/USERNAME" 

o    "mkdir FSA_example"

o    "cd FSA_example"

· Copy an example test data file from the source directory to the current location:

o    "cp /gpfs1/home/USERNAME/SOFTWARE/fsa-1.15.9/examples/tRNA.aln1.fasta ."

· Write the job script fsa_job.pbs (contents shown below) and submit it:

o    "qsub fsa_job.pbs"

------------------------------------------- file fsa_job.pbs -------------------------------------------

#!/bin/bash

#PBS -q default

#PBS -N FSA_Test

#PBS -l select=1:mem=10gb:ncpus=4

#PBS -l walltime=02:00:00

## Replace “x-ccast-prj” with “x-ccast-prj-[your project group name here]”

#PBS -W group_list=x-ccast-prj

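cd "$PBS_O_WORKDIR"

## Set the path to the FSA binaries (replace USERNAME with your Thunder username)

export MY_FSA=/gpfs1/home/USERNAME/SOFTWARE/fsa_install_here/bin

## Align the sequences and write the result to tRNA.aln1.mfa

"$MY_FSA/fsa" tRNA.aln1.fasta > tRNA.aln1.mfa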

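exit 0

------------------------------------------- end of file fsa_job.pbs -------------------------------------------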
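
Once the job has finished, one simple sanity check is that the alignment file contains as many sequences as the input. The sketch below is only an illustration: the function names are hypothetical, and it assumes that both tRNA.aln1.fasta and tRNA.aln1.mfa are FASTA-style files in which every record starts with a ">" header line.

```shell
# Count FASTA records in a file by counting header lines (those starting with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Succeed only if the input is non-empty and the alignment has
# the same number of records as the input.
same_record_count() {
    local n_in n_out
    # grep -c exits non-zero when nothing matches, so fall back to 0.
    n_in=$(count_fasta_records "$1") || n_in=0
    n_out=$(count_fasta_records "$2") || n_out=0
    echo "input: $n_in  output: $n_out"
    [ "$n_in" -gt 0 ] && [ "$n_in" -eq "$n_out" ]
}
```

For example, same_record_count tRNA.aln1.fasta tRNA.aln1.mfa run in the job directory prints both counts and returns a non-zero status if they differ. A matching count does not prove that the alignment is correct, only that no sequence was dropped.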


Keywords: ccast, hpc, thunder, bioinformatics, fsa
Doc ID: 108071
Owner: Liu Y.
Group: IT Knowledge Base
Created: 2020-12-23 16:47 CDT
Updated: 2020-12-29 02:15 CDT
Sites: IT Knowledge Base
CleanURL: https://kb.ndsu.edu/fsa