= Prerequisites =
* Linux 64-bit environment
[http://www.centos.org CentOS]
* Python 2.4.3 or higher
[http://www.python.org Python]<br>
- Under CentOS: yum install python
* MUMmer 3.22 or higher
[http://mummer.sourceforge.net MUMmer]
* BLAST+ 2.2.25 or higher
[http://blast.ncbi.nlm.nih.gov Blast+]
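
One way to confirm that the external tools listed above are reachable is to look them up on the PATH. The following is only a sketch. It assumes a Python 3.3+ interpreter (needed for <code>shutil.which</code>, which is newer than the minimum version CISA itself requires), and the helper name <code>find_missing_tools</code> is hypothetical, not part of CISA.

```python
import shutil

# Executables used by CISA according to this page:
# nucmer comes with MUMmer, makeblastdb and blastn come with BLAST+.
REQUIRED_TOOLS = ["nucmer", "makeblastdb", "blastn"]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in tools if which(name) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```

A tool reported as missing can still be used by giving its full path in the CISA.py configuration file.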

= Installation =
Simply unpack the tar file.<br>

Available commands:
tar xvf CISA.tar.

= Commands and configuration =

<b><font size="5">Merge.py</font></b>

Merge.py converts the input data into the format that CISA expects. '''This is an essential preprocessing step'''.

Available commands:
./Merge.py configuration_file

Contents of the configuration file:

count=3
data=assembly1.fasta,title=Contig_m1
data=assembly2.fasta,title=Contig_m2
data=assembly3.fasta,title=Contig_m3
Master_file=Contigs_m.fa
min_length=100 (default:100)
Gap=11

min_length sets the minimum contig length: only contigs longer than this value (100 by default) are kept.
Gap is an optional variable.<br>
If Gap is present, scaffolds are split at runs of that many consecutive Ns (11 in the configuration above).<br>
If Gap is absent, the program only merges the data.
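
The effect of Gap and min_length on a single sequence can be illustrated with a short sketch. This is not the code of Merge.py: the function name <code>split_and_filter</code> is hypothetical, and treating Gap as "at least that many consecutive Ns" is an assumption, since the description above does not say how longer runs are handled.

```python
import re


def split_and_filter(sequence, gap=None, min_length=100):
    """Split a scaffold at runs of N and drop short pieces.

    gap=None mimics an absent Gap setting: the sequence is not split.
    Assumption: a run of `gap` or more consecutive N/n is a split point.
    Only pieces strictly longer than `min_length` are kept.
    """
    if gap is None:
        pieces = [sequence]
    else:
        pieces = re.split("[Nn]{%d,}" % gap, sequence)
    return [piece for piece in pieces if len(piece) > min_length]


scaffold = "A" * 150 + "N" * 11 + "C" * 50 + "N" * 5 + "G" * 120
for contig in split_and_filter(scaffold, gap=11, min_length=100):
    print(len(contig))
```

In this example the run of 11 Ns splits the scaffold, while the run of 5 Ns stays inside the second piece.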

<b><font size="5">CISA.py</font></b>

Available commands:
./CISA.py configuration_file

Contents of the configuration file:
genome=genome size
infile=Contigs.fa
nucmer=path/nucmer
R2_Gap=0.9 (default:0.9)
CISA=path/CISA1.0
makeblastdb=path/makeblastdb
blastn=path/blastn

genome<br>
We suggest setting genome to the largest total length among the contig sets being integrated.<br>
The break point of CISA is set to 1.1 * genome.

infile<br>
Name of the input file.

nucmer<br>
Path to the nucmer executable. If nucmer is already on the PATH, this variable can be omitted.

makeblastdb<br>
Path to the makeblastdb executable. If makeblastdb is already on the PATH, this variable can be omitted.

blastn<br>
Path to the blastn executable. If blastn is already on the PATH, this variable can be omitted.

CISA<br>
Home directory of CISA.

R2_Gap<br>
Amount of gap tolerated during the CISA2 step.
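
The variables described above can also be written out by a small script, which makes it easy to leave out the tool paths when the tools are already on the PATH. The sketch below is illustrative only: <code>build_cisa_config</code> is a hypothetical helper, not part of CISA, and keeping the keys in the order shown on this page is an assumption (the page does not say whether the order matters).

```python
def build_cisa_config(genome, infile, cisa_home, r2_gap=0.9,
                      nucmer=None, makeblastdb=None, blastn=None):
    """Return the text of a CISA.py configuration file.

    Tool paths left as None are omitted, which this page allows
    when the corresponding executable is already on the PATH.
    """
    lines = ["genome=%d" % genome, "infile=%s" % infile]
    if nucmer:
        lines.append("nucmer=%s" % nucmer)
    lines.append("R2_Gap=%s" % r2_gap)
    lines.append("CISA=%s" % cisa_home)
    if makeblastdb:
        lines.append("makeblastdb=%s" % makeblastdb)
    if blastn:
        lines.append("blastn=%s" % blastn)
    return "\n".join(lines) + "\n"


# Placeholder paths copied from this page; replace them with real locations.
print(build_cisa_config(4626205, "Contigs.fa", "path/CISA1.0", r2_gap=0.95))
```

The returned text can be saved to a file and passed to CISA.py as the configuration_file argument.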

= Example =
*'''Data Set:'''
[[Ecoli|Ecoli]]

*'''Merge Contigs:'''<br>
Contents of the configuration file:<br>
''count=5<br>
data=Abyss_contigs.fa,title=Abyss<br>
data=Edena_contigs.fa,title=Edena<br>
data=SOAPdenovo_contigs.fa,title=SOAP<br>
data=CLC_contigs.fa,title=CLC<br>
data=Velvet_contigs.fa,title=Velvet<br>
Master_file=Contigs.fa<br>''

Command:<br>
./Merge.py configuration_file

*'''Start to integrate:'''<br>
Contents of the configuration file:<br>
''genome=4626205<br>
infile=Contigs.fa<br>
nucmer=path/nucmer<br>
R2_Gap=0.95<br>
CISA=path/CISA1.0<br>
makeblastdb=path/makeblastdb<br>
blastn=path/blastn<br>''
The genome variable is set to 4626205, the largest total assembly length among the results of the 5 assemblers.
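
A value for genome can be derived from the assemblies themselves. The sketch below assumes that "largest total assembly length" means the sum of all contig lengths in one assembler's FASTA output, taken over the assembler with the largest sum. That is a reading of the sentence above, not a documented rule, and the function names are hypothetical.

```python
def total_bases(fasta_lines):
    """Sum the sequence lengths in an iterable of FASTA lines.

    Header lines (starting with '>') and blank lines are skipped.
    """
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def suggested_genome(assemblies):
    """Return the largest total length among several assemblies.

    `assemblies` is an iterable in which each item yields the FASTA
    lines of one assembly (for example, an open file handle).
    """
    return max(total_bases(lines) for lines in assemblies)


def break_point(genome):
    """Break point described on this page: 1.1 * genome."""
    return 1.1 * genome


# Two tiny in-memory assemblies stand in for real contig files.
assembly_a = [">c1", "ACGT", "AC"]
assembly_b = [">c1", "ACGTACGT"]
genome = suggested_genome([assembly_a, assembly_b])
print(genome, break_point(genome))
```

With real data, each item passed to <code>suggested_genome</code> would be an opened contig file such as Abyss_contigs.fa.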

Command:<br>
./CISA.py configuration_file