Manual2

Revision as of 03 December 2011 02:52 by admin

Prerequisites

  • Linux 64-bit environment

CentOS

  • Python 2.4.3 or higher

Python
- Under CentOS: yum install python

  • MUMmer 3.22 or higher

MUMmer

  • BLAST 2.2.25+ or higher

Blast+


Installation

Simply unpack the tar file and set its permissions.

Available commands:

   tar xvf CISA.tar
   chmod 755 -R CISA1.0

Commands and configuration

Merge.py

Merge.py converts the data format to fit CISA. This is an essential preprocessing step.

Available commands:

    ./Merge.py myconfig

The content of configuration file:

   count=3  (the number of datasets you would like to merge)
   data=assembly1.fasta,title=Contig_m1  (to do: hyperlink to the example file for direct download)
   data=assembly2.fasta,title=Contig_m2
   data=assembly3.fasta,title=Contig_m3
   Master_file=Contigs_m.fa
   min_length=100 (default:100)
   Gap=11  (note: this line is not in the config example)

min_length=100 means that only contigs longer than 100 are kept. Gap is an optional variable.
If Gap is present, it is used to split scaffolds at runs of 11 consecutive N.
If Gap is absent, the program only merges the data.
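
The effect of Gap and min_length described above can be sketched in a few lines of Python. This is an illustration only, not the code in Merge.py: the helper names split_scaffold and filter_contigs are hypothetical, the sketch is written for Python 3, and it assumes that a run of at least Gap consecutive N characters is a split point and that "longer than" means strictly greater than min_length.

```python
import re

def split_scaffold(sequence, gap=None):
    """Split a scaffold at runs of N (assumed: at least `gap` in a row).

    With gap=None the sequence is returned unchanged, mirroring the
    "Gap is absent" case in the text.
    """
    if gap is None:
        return [sequence]
    pattern = re.compile("N{%d,}" % gap, re.IGNORECASE)
    # Drop empty pieces that appear when a scaffold starts or ends with a gap.
    return [piece for piece in pattern.split(sequence) if piece]

def filter_contigs(contigs, min_length=100):
    """Keep contigs strictly longer than min_length (assumed reading)."""
    return [c for c in contigs if len(c) > min_length]

# Made-up scaffold: two gaps of 11 N and one short run of 5 N.
scaffold = "A" * 150 + "N" * 11 + "C" * 50 + "N" * 11 + "G" * 60 + "N" * 5 + "G" * 60
pieces = split_scaffold(scaffold, gap=11)
kept = filter_contigs(pieces, min_length=100)
print([len(p) for p in pieces])  # [150, 50, 125]
print([len(c) for c in kept])    # [150, 125]
```

The run of 5 N is shorter than Gap, so it stays inside the third piece, and the 50-base piece is dropped by the length filter.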

CISA.py

Available commands:

    ./CISA.py myconfig

The content of the myconfig file:

   genome=genome size
   infile=Contigs.fa
   nucmer=path/nucmer
   R2_Gap=0.9 (default:0.9)
   CISA=path/CISA1.0
   makeblastdb=path/makeblastdb
   blastn=path/blastn

genome

We suggest setting the genome variable to the largest assembly size among the participating contig sets.
CISA sets its break point to 1.1 * genome.
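
As a rough illustration of this suggestion, the sketch below sums the sequence lengths in each FASTA assembly, takes the largest total as the genome value, and derives the 1.1 * genome break point. The function names are hypothetical, the code targets Python 3, and reading "longest length" as the largest total assembly size is an assumption based on the E. coli example on this page.

```python
import io

def assembly_length(handle):
    """Total number of sequence characters in a FASTA stream (headers skipped)."""
    total = 0
    for line in handle:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total

def suggest_genome(handles):
    """Largest total length among the given assemblies."""
    return max(assembly_length(h) for h in handles)

def break_point(genome):
    """Break point as stated in the manual: 1.1 * genome."""
    return 1.1 * genome

# Two tiny in-memory "assemblies" stand in for real FASTA files.
asm1 = io.StringIO(">c1\nACGTACGT\n>c2\nACGT\n")  # 12 bases in total
asm2 = io.StringIO(">c1\nACGTACGTAC\n")           # 10 bases in total
genome = suggest_genome([asm1, asm2])
print(genome)  # 12
print(break_point(genome))
```

With real data, the in-memory streams would be replaced by open file handles for each assembler's contig file.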

infile
Name of the input file.

nucmer
Executable file for nucmer. If nucmer is already on the PATH, the nucmer variable can be omitted.

makeblastdb
Executable file for makeblastdb. If makeblastdb is already on the PATH, the makeblastdb variable can be omitted.

blastn
Executable file for blastn. If blastn is already on the PATH, the blastn variable can be omitted.
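
Before leaving nucmer, makeblastdb or blastn out of the configuration, it may help to check that each one really is on the PATH. The sketch below uses Python 3's standard shutil.which for that. resolve_tool is a hypothetical helper and does not reflect how CISA.py itself locates the programs.

```python
import shutil

def resolve_tool(name, configured_path=None):
    """Return the configured path if given, otherwise look the tool up on PATH."""
    if configured_path:
        return configured_path
    found = shutil.which(name)
    if found is None:
        raise RuntimeError("%s not found on PATH; set %s= in the config" % (name, name))
    return found

# Report where each external program would be picked up from.
for tool in ("nucmer", "makeblastdb", "blastn"):
    try:
        print(tool, "->", resolve_tool(tool))
    except RuntimeError as err:
        print(err)
```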

CISA
Home directory of CISA.

R2_Gap
Amount of gap tolerated during the CISA2 step.

Example

  • Data Set:

Ecoli

  • Merge Contigs:

The content of the configuration file:
count=5
data=Abyss_contigs.fa,title=Abyss
data=Edena_contigs.fa,title=Edena
data=SOAPdenovo_contigs.fa,title=SOAP
data=CLC_contigs.fa,title=CLC
data=Velvet_contigs.fa,title=Velvet
Master_file=Contigs.fa

Command:
./Merge.py myconfig
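
When many assemblies are merged, the Merge.py configuration can be generated instead of typed by hand, which keeps count in step with the number of data lines. The following Python 3 sketch is one way to do that. build_merge_config is a hypothetical helper, and the only keys it writes are the ones shown on this page.

```python
def build_merge_config(datasets, master_file, min_length=None, gap=None):
    """Return Merge.py config text for a list of (fasta_file, title) pairs."""
    lines = ["count=%d" % len(datasets)]
    for path, title in datasets:
        lines.append("data=%s,title=%s" % (path, title))
    lines.append("Master_file=%s" % master_file)
    # Optional keys are written only when a value is supplied.
    if min_length is not None:
        lines.append("min_length=%d" % min_length)
    if gap is not None:
        lines.append("Gap=%d" % gap)
    return "\n".join(lines) + "\n"

datasets = [
    ("Abyss_contigs.fa", "Abyss"),
    ("Edena_contigs.fa", "Edena"),
    ("SOAPdenovo_contigs.fa", "SOAP"),
    ("CLC_contigs.fa", "CLC"),
    ("Velvet_contigs.fa", "Velvet"),
]
config_text = build_merge_config(datasets, "Contigs.fa")
print(config_text)
```

The printed text can be saved as myconfig and passed to ./Merge.py.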

  • Integrate Contigs:

The content of the configuration file:
genome=4626205
infile=Contigs.fa
nucmer=path/nucmer
R2_Gap=0.95
CISA=path/CISA1.0
makeblastdb=path/makeblastdb
blastn=path/blastn
The genome variable is set to 4626205, the largest whole-genome assembly size among the results of the 5 assemblers.

Command:
./CISA.py configuration_file
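
The two commands in this example run in sequence, and the Master_file written by Merge.py (Contigs.fa) is the infile read by CISA.py. A minimal Python 3 sketch of a driver for both steps follows. run_pipeline and the config file names merge.config and cisa.config are hypothetical, and a recording stand-in replaces the real process launcher so the sketch runs without CISA installed.

```python
import subprocess

def run_pipeline(merge_config, cisa_config, runner=subprocess.check_call):
    """Run Merge.py, then CISA.py.

    With the default runner, subprocess.check_call raises if a step
    exits with a non-zero status, so CISA.py is not started after a
    failed merge.
    """
    commands = [["./Merge.py", merge_config], ["./CISA.py", cisa_config]]
    for command in commands:
        runner(command)
    return commands

# Record the commands instead of executing them.
executed = []
run_pipeline("merge.config", "cisa.config", runner=executed.append)
print(executed)
```

Dropping the runner argument makes the function launch the real scripts from the current directory.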