= Prerequisites =

* Linux 64-bit environment [http://www.centos.org CentOS]
* Python 2.X [http://www.python.org Python]<br> On CentOS: yum install python
* MUMmer 3.22 or higher [http://mummer.sourceforge.net MUMmer]
* BLAST+ 2.2.25 or higher [http://blast.ncbi.nlm.nih.gov Blast+]

= Installation =

Simply unpack the tar file.<br>
Command:

 tar xvf CISA.tar

Grant execute access:

 chmod 755 -R CISA1.0

= Commands and configuration =

<b><font size="5">Merge.py</font></b>

Merge.py converts the input data into the format that CISA expects. '''This is an essential preliminary step.'''

Command:

 ./Merge.py myconfig

Output: Merge_info

The content of the configuration file:

 count=3
 data=assembly1.fasta,title=Contig_m1
 data=assembly2.fasta,title=Contig_m2
 data=assembly3.fasta,title=Contig_m3
 Master_file=Contigs_m.fa
 min_length=100 (default:100)
 Gap=11

'''count='''<br>
The number of datasets you would like to merge.

'''min_length='''<br>
Contigs longer than this value (100 in the example, which is also the default) are kept.

'''Gap='''<br>
An optional variable.<br>
If Gap is present, scaffolds are split at runs of that many consecutive N characters (11 in the example).<br>
If Gap is absent, the program only merges the data.

If your data files come from a Windows system, convert them to Linux format first:

 dos2unix Abyss_contigs.fa

<b><font size="5">CISA.py</font></b>

Command:

 ./CISA.py myconfig

The content of the configuration file:

 genome=genome size
 infile=file
 outfile=file
 nucmer=your installed path/nucmer
 R2_Gap=0.9 (default: 0.9; also change it to 0.9 in the myconfig file)
 CISA=your installed path/CISA1.0
 makeblastdb=your installed path/makeblastdb
 blastn=your installed path/blastn

'''genome='''<br>
The genome size of each input assembly is listed in the ''Merge_info'' file. We recommend using the largest of these values as the genome variable.<br>
The break point of CISA is set to 1.1 * genome.

'''infile='''<br>
Name of the input file.

'''outfile='''<br>
Name of the output file.

'''nucmer='''<br>
Path to the nucmer executable.
If nucmer is already on the PATH, the nucmer variable can be omitted.

'''makeblastdb='''<br>
Path to the makeblastdb executable. If makeblastdb is already on the PATH, the makeblastdb variable can be omitted.

'''blastn='''<br>
Path to the blastn executable. If blastn is already on the PATH, the blastn variable can be omitted.

'''CISA='''<br>
Home directory of CISA.

'''R2_Gap='''<br>
The amount of gap tolerated during the CISA2 step.

= Example =

== Data Set ==

[[Ecoli|Ecoli]]

== Merge Contigs ==

The content of the configuration file:

<table width="457" border="1">
<tr>
<td height="152" align="left" bgcolor="#66CCFF">
count=5<br>
data=Abyss_contigs.fa,title=Abyss<br>
data=Edena_contigs.fa,title=Edena<br>
data=SOAPdenovo_contigs.fa,title=SOAP<br>
data=CLC_contigs.fa,title=CLC<br>
data=Velvet_contigs.fa,title=Velvet<br>
Master_file=Contigs.fa
</td>
</tr>
</table>

[[Media:Abyss contigs.fa|Abyss_contigs.fa]], [[Media:Edena contigs.fa|Edena_contigs.fa]], [[Media:SOAPdenovo contigs.fa|SOAPdenovo_contigs.fa]], [[Media:CLC contigs.fa|CLC_contigs.fa]], [[Media:Velvetcontigs.fa|Velvet_contigs.fa]]

*Command:<br>
 ./Merge.py myconfig

== Start to integrate ==

The content of the myconfig file:

 genome=4626205
 infile=Contigs.fa
 nucmer=path/nucmer
 R2_Gap=0.95
 CISA=path/CISA1.0
 makeblastdb=path/makeblastdb
 blastn=path/blastn

*The genome variable is set to 4626205, the largest whole-genome length among the results from the 5 assemblers. You can find this value in Merge_info.

*Command:<br>
 ./CISA.py myconfig
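
When many assemblies are merged, the Merge.py configuration file can be generated by a small script instead of being typed by hand. The following is a minimal sketch and not part of CISA: the helper names render_merge_config and break_point are hypothetical, the key order simply follows the configuration shown in this guide, and it is an assumption that Merge.py accepts a file written this way. The second helper only restates the rule that the break point is 1.1 * genome.

```python
from __future__ import print_function


def render_merge_config(assemblies, master_file, min_length=None, gap=None):
    """Return configuration text in the layout shown in this guide.

    assemblies  -- list of (fasta_file, title) pairs, one per assembler
    master_file -- value for Master_file
    min_length  -- optional; omitted from the text when None
    gap         -- optional; omitted from the text when None
    """
    lines = ["count=%d" % len(assemblies)]
    for fasta_file, title in assemblies:
        lines.append("data=%s,title=%s" % (fasta_file, title))
    lines.append("Master_file=%s" % master_file)
    if min_length is not None:
        lines.append("min_length=%d" % min_length)
    if gap is not None:
        lines.append("Gap=%d" % gap)
    return "\n".join(lines) + "\n"


def break_point(genome_size):
    """Break point as described in this guide: 1.1 * genome."""
    return 1.1 * genome_size


if __name__ == "__main__":
    # File names and titles are taken from the Example section.
    example = [
        ("Abyss_contigs.fa", "Abyss"),
        ("Edena_contigs.fa", "Edena"),
        ("SOAPdenovo_contigs.fa", "SOAP"),
        ("CLC_contigs.fa", "CLC"),
        ("Velvet_contigs.fa", "Velvet"),
    ]
    print(render_merge_config(example, "Contigs.fa"), end="")
    print(break_point(4626205))
```

The printed configuration text can be saved as myconfig and passed to ./Merge.py as described above. The optional min_length and Gap lines are written only when values are supplied, which mirrors the example configuration that leaves both out.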