
Bioinformatics - BCFtools

Instructions on how to run (and, if needed, install a customized version of) BCFtools

BCFtools is a program for variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF.

  1. Running BCFtools on Thunder
  2. Install customized BCFtools on Thunder
Please refer to the CCAST User Guide and the article Running Bioinformatics Software on HPC Clusters for general information about using CCAST resources and running bioinformatics software on CCAST's HPC clusters.

1. Running BCFtools on Thunder


Example 1: Convert a bcf file into a vcf file.


Location: /gpfs1/projects/ccastest/training/examples/BCFtools_example1


File list:

· bcftools_job.pbs: job submission script 

· sim_variants.bcf: a simulated variant file in bcf format


Steps:

· Copy the example directory to your SCRATCH directory

o    cp -r /gpfs1/projects/ccastest/training/examples/BCFtools_example1 $SCRATCH

· Go to the copied directory

o    cd  $SCRATCH/BCFtools_example1

· Edit the job submission script as needed, then submit the job

o    qsub bcftools_job.pbs
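The three steps above can be collected into one small shell function. This is only a sketch, not part of the CCAST example: the function name stage_and_submit is hypothetical, the destination directory is assumed to exist already, and qsub is only called when it is found on the PATH.

```shell
#!/bin/bash
# stage_and_submit SRC_DIR DEST_PARENT
# Copies an example directory into DEST_PARENT (which must already exist)
# and submits bcftools_job.pbs from the copy.
# The function name is hypothetical; it is not a CCAST-provided tool.
stage_and_submit() {
    local src="$1" dest_parent="$2"
    local copy="$dest_parent/$(basename "$src")"

    cp -r "$src" "$dest_parent" || return 1

    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$copy" || exit 1
        if command -v qsub >/dev/null 2>&1; then
            qsub bcftools_job.pbs
        else
            echo "qsub not found; job not submitted from $copy" >&2
        fi
    )
}

# Example call on Thunder (commented out so the sketch has no side effects):
# stage_and_submit /gpfs1/projects/ccastest/training/examples/BCFtools_example1 "$SCRATCH"
```

Remember to edit the job submission script in the copied directory before the real submission, as described in the last step.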


Example 2: Use SAMtools to call variants from a read-mapping file and a reference genome, then convert the result into a vcf file.


Location: /gpfs1/projects/ccastest/training/examples/BCFtools_example2


File list:

· bcftools_job.pbs: job submission script 

· NC_008253.fna: a sample reference genome

· sim_reads_aligned.sorted.bam: mapping result from an aligner (e.g. Bowtie2 or BWA) in bam format


Steps:

· Copy the example directory to your SCRATCH directory

o    cp -r /gpfs1/projects/ccastest/training/examples/BCFtools_example2 $SCRATCH

· Go to the copied directory

o    cd  $SCRATCH/BCFtools_example2

· Edit the job submission script as needed, then submit the job

o    qsub bcftools_job.pbs
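Once a job has finished, one quick check is to count the variant records in the resulting VCF file. The helper below is a minimal sketch: the name count_variants is hypothetical, and the output file name sim_variants.vcf in the usage comment is assumed from the job scripts listed later in this article (check your copy of bcftools_job.pbs for the actual name). It relies only on the VCF convention that header lines start with "#".

```shell
#!/bin/bash
# count_variants VCF_FILE
# Prints the number of non-header lines (lines not starting with '#')
# in an uncompressed VCF file. Hypothetical helper name.
count_variants() {
    local vcf="$1"
    [ -r "$vcf" ] || { echo "cannot read $vcf" >&2; return 1; }
    # grep -c still prints 0 when nothing matches but exits with status 1,
    # so the status is masked to keep the function usable under 'set -e'.
    grep -vc '^#' "$vcf" || true
}

# Example call after the job has finished:
# count_variants sim_variants.vcf
```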


2. Install Customized BCFtools on Thunder

Warning: This part is intended ONLY for those who want to install and test their own version in their HOME directory.

Summary


(a)    bzip2 is required for building (available via ‘module load bzip2’).

(b)    Option “--threads INT”: number of output compression threads to use in addition to the main thread. Only used when --output-type is b or z. Default: 0.
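Because --threads is only used for compressed output (--output-type b or z), passing it with uncompressed VCF output has no effect. The sketch below emits the flag only when it matters. The helper name threads_args is hypothetical; the usage comment assumes bcftools is on your PATH and uses the bcftools "view" command with -O z to write compressed VCF.

```shell
#!/bin/bash
# threads_args OUTPUT_TYPE NTHREADS
# Prints "--threads NTHREADS" when OUTPUT_TYPE is b (compressed BCF) or
# z (compressed VCF); prints nothing for any other output type.
# Hypothetical helper name, not part of bcftools.
threads_args() {
    local output_type="$1" nthreads="$2"
    case "$output_type" in
        b|z) printf -- '--threads %s' "$nthreads" ;;
        *)   : ;;
    esac
}

# Example (requires bcftools on PATH), writing compressed VCF with 4 extra threads:
# bcftools view -O z $(threads_args z 4) -o sim_variants.vcf.gz sim_variants.bcf
```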


Details


In the following steps, we assume that you want to install the software in a directory named “SOFTWARE” inside your HOME directory on CCAST’s Thunder cluster. “USERNAME” is your username on Thunder.


(a) Install BCFtools


· Go to where you want to install: 

o    cd /gpfs1/home/USERNAME/SOFTWARE

· Download from the author's website: 

o    wget https://github.com/samtools/bcftools/releases/download/1.9/bcftools-1.9.tar.bz2

· Extract the downloaded archive to the current location: 

o    tar -jxvf bcftools-1.9.tar.bz2

· Go to the extracted directory:  

o    cd bcftools-1.9

· Load bzip2:

o    module load bzip2

· Configure and specify where to install: 

o    ./configure --prefix=/gpfs1/home/USERNAME/SOFTWARE/bcftools_install_here

· Build: 

o    make

· Install:

o    make install
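After the install step, it can be worth confirming that the executable landed where the job scripts expect it. The following is a sketch under the same path assumptions as the steps above; the function name check_bcftools_install is hypothetical.

```shell
#!/bin/bash
# check_bcftools_install PREFIX
# Succeeds if PREFIX/bin/bcftools exists and is executable.
# Hypothetical helper name; PREFIX is the directory given to --prefix.
check_bcftools_install() {
    local prefix="$1"
    if [ -x "$prefix/bin/bcftools" ]; then
        echo "found: $prefix/bin/bcftools"
    else
        echo "missing: $prefix/bin/bcftools" >&2
        return 1
    fi
}

# Example on Thunder (replace USERNAME), followed by a version check:
# check_bcftools_install /gpfs1/home/USERNAME/SOFTWARE/bcftools_install_here \
#     && /gpfs1/home/USERNAME/SOFTWARE/bcftools_install_here/bin/bcftools --version
```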


(b) Test 1: BCFtools on its own

(only the bcftools "call" command was tested)


· Make and go to test directory: 

o    cd /gpfs1/scratch/USERNAME

o    mkdir bcftools_example1

o    cd bcftools_example1

· Download sample data (the "raw" GitHub URL returns the file itself; the "blob" URL returns a web page):  

o    wget https://github.com/ecerami/samtools_primer/raw/master/tutorial/variants/sim_variants.bcf

· Create the job script bcftools_job.pbs (contents shown below) in this directory, then submit the job 

o    qsub bcftools_job.pbs


--------------- file bcftools_job.pbs -----------------

#!/bin/bash

#PBS -q default

#PBS -N bcftools_test

#PBS -l select=1:mem=10gb:ncpus=4

#PBS -l walltime=02:00:00

## Replace “x-ccast-prj” with “x-ccast-prj-[your project group name here]”

#PBS -W group_list=x-ccast-prj

cd $PBS_O_WORKDIR

#Set path to executables & plugins

export PATH=$PATH:/gpfs1/home/USERNAME/SOFTWARE/bcftools_install_here/bin

export BCFTOOLS_PLUGINS=/gpfs1/home/USERNAME/SOFTWARE/bcftools_install_here/libexec/bcftools

#Please check http://biobits.org/samtools_primer.html for reference.

bcftools call -cv sim_variants.bcf --threads $NCPUS -o sim_variants.vcf

exit 0
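Before submitting the job above, it can help to confirm that wget saved actual data rather than a web page under the data file's name, since bcftools cannot read an HTML page. The check below is a rough sketch: the function name looks_like_html is hypothetical, and it only inspects the first 512 bytes of the file for HTML markers.

```shell
#!/bin/bash
# looks_like_html FILE
# Succeeds (exit status 0) if the first 512 bytes of FILE contain an HTML
# doctype or <html tag, which suggests a web page was saved instead of data.
# Hypothetical helper name.
looks_like_html() {
    head -c 512 "$1" | LC_ALL=C grep -aqiE '<!doctype html|<html'
}

# Example use in the test directory:
# for f in sim_variants.bcf; do
#     looks_like_html "$f" && echo "$f looks like a web page; re-download it" >&2
# done
```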


(c) Test 2: BCFtools together with SAMtools 

(only the bcftools "call" command was tested)


· Make and go to test directory: 

o    cd /gpfs1/scratch/USERNAME

o    mkdir bcftools_example2

o    cd bcftools_example2

· Download sample data (the "raw" GitHub URL returns the file itself; the "blob" URL returns a web page): 

o    wget https://github.com/ecerami/samtools_primer/raw/master/tutorial/genomes/NC_008253.fna

o    wget https://github.com/ecerami/samtools_primer/raw/master/tutorial/alignments/sim_reads_aligned.sorted.bam

· Create the job script bcftools_job.pbs (contents shown below) in this directory, then submit the job 

o    qsub bcftools_job.pbs


--------------- file bcftools_job.pbs -----------------

#!/bin/bash

#PBS -q default

#PBS -N bcftools_test

#PBS -l select=1:mem=10gb:ncpus=4

#PBS -l walltime=02:00:00

## Replace “x-ccast-prj” with “x-ccast-prj-[your project group name here]”

#PBS -W group_list=x-ccast-prj

 

cd $PBS_O_WORKDIR

#load samtools module

module load SAMtools/1.6-gcc

#Set path to executables & plugins

export PATH=$PATH:/gpfs1/home/USERNAME/SOFTWARE/bcftools_install_here/bin

export BCFTOOLS_PLUGINS=/gpfs1/home/USERNAME/SOFTWARE/bcftools_install_here/libexec/bcftools

#Please check http://biobits.org/samtools_primer.html for reference.

samtools mpileup -gf NC_008253.fna sim_reads_aligned.sorted.bam | bcftools call -cv --threads $NCPUS -o sim_variants.vcf

exit 0
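Since this test depends on both the SAMtools module and the custom BCFtools path, a job can fail early if either is missing from the PATH. The sketch below reports which tools cannot be found; the function name missing_tools is hypothetical, and the commented lines show one way it might be placed in the job script after the module load and PATH lines.

```shell
#!/bin/bash
# missing_tools TOOL...
# Prints, one per line, each TOOL that is not found on the current PATH.
# Prints nothing when all tools are available. Hypothetical helper name.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Possible use inside bcftools_job.pbs, before the samtools | bcftools pipeline:
# missing=$(missing_tools samtools bcftools)
# [ -z "$missing" ] || { echo "not on PATH: $missing" >&2; exit 1; }
```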


Keywords: ccast, hpc, thunder, bioinformatics, bcftools
Doc ID: 108008
Owner: Liu Y.
Group: IT Knowledge Base
Created: 2020-12-19 22:54 CST
Updated: 2020-12-29 01:36 CST
Sites: IT Knowledge Base
CleanURL: https://kb.ndsu.edu/bcftools