Version Differences for Instruction

Line 1:
    + = Prerequisites =  
    + * Linux 64-bit environment  
    + [http://www.centos.org CentOS]  
    + * Python 2.X  
    + [http://www.python.org Python]<br>  
    + - Under CentOS: yum install python  
    + * MUMmer 3.22 or higher  
    + [http://mummer.sourceforge.net MUMmer]  
    + * BLAST+ 2.2.25 or higher  
    + [http://blast.ncbi.nlm.nih.gov Blast+]  
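Before installing CISA, it can help to confirm that the external programs listed above are reachable. The following is a minimal sketch, not part of the CISA distribution. It assumes it is run with Python 3.6 or later (separately from CISA itself, which requires Python 2.X) and that the executables carry the names used in CISA.config (nucmer, makeblastdb, blastn). The helper name `find_tools` is hypothetical.

```python
import shutil
import sys

# Executable names taken from the CISA.config keys described in this guide.
REQUIRED_TOOLS = ("nucmer", "makeblastdb", "blastn")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its absolute path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

Any tool reported as not found needs its full path in CISA.config.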
       
       
    + = Installation =  
    + Please unpack the tar file.<br>  
    + [[Media:CISA_20120425.tar|CISA - 2012.04.25]]  
       
    + Available commands:  
    + >tar xvf CISA.tar  
       
    + Grant execute permission:  
    + >chmod 755 -R CISA1.0  
       
    + == Commands and configuration ==  
       
    + === Merge.py ===  
       
    + Merge.py converts your data into the format that CISA expects. '''This is an essential preparatory step'''.  
       
    + Available commands:  
    + >python Merge.py Merge.config  
       
    + The content of configuration file:  
       
    + count=3  
    + data=assembly1.fasta,title=Contig_m1  
    + data=assembly2.fasta,title=Contig_m2  
    + data=assembly3.fasta,title=Contig_m3  
    + Master_file=Contigs_m.fa  
    + min_length=100 (default:100)  
    + Gap=11  
       
    + #count<br>The number of assemblies you would like to merge  
       
    + #min_length<br>Contigs shorter than min_length (e.g. 100 bp) are discarded.  
       
    + #Gap<br>Optional.<br>When this variable is set, the assemblies are split into contigs at runs of Ns. With Gap=11, the split happens at runs of more than 10 Ns.  
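The effect of Gap and min_length described above can be sketched as follows. This is an illustration of the described behaviour, not the actual code of Merge.py. It assumes that Gap=11 means "split at runs of at least 11 Ns" and that pieces shorter than min_length are dropped after splitting. The function name `split_and_filter` is hypothetical.

```python
import re


def split_and_filter(sequence, gap=11, min_length=100):
    """Split a sequence at runs of at least `gap` Ns, then drop short pieces."""
    # N{11,} matches 11 or more consecutive Ns, i.e. more than 10 Ns.
    pattern = re.compile("N{%d,}" % gap, re.IGNORECASE)
    pieces = pattern.split(sequence)
    # Keep only pieces that reach the minimum length.
    return [piece for piece in pieces if len(piece) >= min_length]


if __name__ == "__main__":
    scaffold = "A" * 150 + "N" * 12 + "C" * 50
    # The 12-N run triggers a split; the 50 bp piece is below min_length.
    print([len(piece) for piece in split_and_filter(scaffold)])
```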
       
    + Please note that '''dos2unix''' may be necessary if your data files use DOS or Mac line endings.  
    + >dos2unix assembly1.fasta assembly2.fasta assembly3.fasta  
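To decide whether dos2unix is needed, one option is to look for carriage-return bytes in each input file. The sketch below is only a convenience check under that assumption; the function name `has_cr_line_endings` is hypothetical and the conversion itself is still done with dos2unix.

```python
def has_cr_line_endings(path):
    """Return True if the file contains any carriage-return (CR) byte."""
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large assemblies are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            if b"\r" in chunk:
                return True
    return False


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        status = "needs dos2unix" if has_cr_line_endings(name) else "ok"
        print(name, status)
```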
    + === CISA.py ===  
       
    + Available commands:  
    + >python CISA.py CISA.config  
       
    + The content of configuration file:  
    + genome=genome size  
    + infile=file  
    + outfile=file  
    + nucmer=your installed path/nucmer  
    + R2_Gap=0.95 (default:0.95)  
    + CISA=your installed path/CISA1.0  
    + makeblastdb=your installed path/makeblastdb  
    + blastn=your installed path/blastn  
       
    + #genome<br/>Enter the estimated genome size. We recommend using the largest total length among your input assemblies.<br/>The break point of CISA is set to 1.1 * genome.  
    + #infile<br/>The file containing the set of contigs you want to integrate.  
    + #nucmer<br/>The executable file for nucmer. If nucmer is already on your PATH, this variable can be omitted.  
    + #makeblastdb<br/>The executable file for makeblastdb. If makeblastdb is already on your PATH, this variable can be omitted.  
    + #blastn<br/>The executable file for blastn. If blastn is already on your PATH, this variable can be omitted.  
    + #CISA<br/>Home directory of CISA.  
    + #R2_Gap<br/>A threshold used in phase 2 of CISA.  
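The key=value layout of the configuration file and the break point rule can be illustrated with a short sketch. This is not how CISA.py reads its configuration; it only assumes one key=value pair per line. The function names `parse_config` and `break_point` are hypothetical.

```python
def parse_config(text):
    """Parse simple key=value lines into a dict, skipping blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def break_point(settings):
    """Break point as described in this guide: 1.1 * genome."""
    return 1.1 * int(settings["genome"])


if __name__ == "__main__":
    example = "genome=4626205\ninfile=Contigs.fa\noutfile=CISA.fa\nR2_Gap=0.95\n"
    config = parse_config(example)
    print(config["infile"], break_point(config))
```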
       
    + == Example ==  
       
    + <ul><li><strong>Prepare datasets:</strong> At least three assemblies are required for contig integration using CISA.</li></ul>  
       
    + <blockquote>In the case of [[Ecoli|Ecoli]], we used five assemblers (Abyss, CLC, Edena, SOAPdenovo and Velvet) to generate five assemblies.</blockquote>  
       
    + <ul><li><strong>Convert the format and ID of your datasets:</strong><br/></li></ul>  
    + <blockquote><p> The content of the configuration file (Merge.config):</p></blockquote>  
    + <table width="457" border="1">  
    + <tr>  
    + <td height="152" align="left" bgcolor="#66CCFF">  
    + count=5  
    + data=Abyss_contigs.fa,title=Abyss  
    + data=Edena_contigs.fa,title=Edena  
    + data=SOAPdenovo_contigs.fa,title=SOAP  
    + data=CLC_contigs.fa,title=CLC  
    + data=Velvet_contigs.fa,title=Velvet  
    + Master_file=Contigs.fa  
    + </td>  
    + </tr>  
    + </table>  
       
    + <ul><li><strong>Command:</strong><br /></li></ul><blockquote><p>&gt;python Merge.py Merge.config</p></blockquote>  
       
    + Input files: [[Media:Abyss contigs.fa|Abyss_contigs.fa]], [[Media:Edena contigs.fa|Edena_contigs.fa]], [[Media:SOAPdenovo contigs.fa|SOAPdenovo_contigs.fa]], [[Media:CLC contigs.fa|CLC_contigs.fa]], [[Media:Velvetcontigs.fa|Velvet_contigs.fa]]  
       
    + Statistics of the input assemblies:  
    + <blockquote>  
    + Abyss_contigs.fa  
    + <br>Number of contigs: 133  
    + <br>Length of the longest contig: 222425  
    + <br>whole:4626205  
    + <br>N50: 96511  
    + <br>Edena_contigs.fa  
    + <br>Number of contigs: 211  
    + <br>Length of the longest contig: 186686  
    + <br>whole:4569446  
    + <br>N50: 57790  
    + <br>SOAPdenovo_contigs.fa  
    + <br>Number of contigs: 553  
    + <br>Length of the longest contig: 103369  
    + <br>whole:4547211  
    + <br>N50: 17944  
    + <br>CLC_contigs.fa  
    + <br>Number of contigs: 378  
    + <br>Length of the longest contig: 107342  
    + <br>whole:4546827  
    + <br>N50: 29905  
    + <br>Velvet_contigs.fa  
    + <br>Number of contigs: 283  
    + <br>Length of the longest contig: 166094  
    + <br>whole:4550675  
    + <br>N50: 54359  
    + </blockquote>  
    + Output file: [[Media:Set1 Contigs.fa|Contigs.fa]]  
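Statistics like those listed above can be recomputed from a FASTA file with a few lines of Python. This is a sketch under assumptions: it uses the common N50 definition (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total), which may differ from the convention used to produce the figures above. The function names are hypothetical.

```python
def read_contig_lengths(path):
    """Return the length of every sequence in a FASTA file."""
    lengths = []
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths):
    """Contig count, longest contig, total length and N50 for a list of lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "contigs": len(ordered),
        "longest": ordered[0] if ordered else 0,
        "total": total,
        "n50": n50,
    }
```

For example, `assembly_stats(read_contig_lengths("Abyss_contigs.fa"))` would report the values for the Abyss assembly.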
       
    + <ul><li><strong>Start to integrate contigs:</strong><br /></li></ul><blockquote><p>The content of the configuration file (CISA.config):</p></blockquote>  
       
    + <table width="457" border="1">  
    + <tr>  
    + <td height="152" align="left" bgcolor="#66CCFF">  
    + genome=4626205  
    + infile=Contigs.fa  
    + outfile=CISA.fa  
    + nucmer=path/nucmer  
    + R2_Gap=0.95  
    + CISA=path/CISA1.0  
    + makeblastdb=path/makeblastdb  
    + blastn=path/blastn  
    + </td>  
    + </tr>  
    + </table>  
       
    + <blockquote><p>The genome variable is set to 4626205, the total length of the Abyss assembly, which is the largest of the five input assemblies.<br /></p></blockquote>  
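The choice of the genome value can be reproduced from the total lengths listed in the statistics above. The sketch below assumes the recommendation means "take the largest total assembly length"; the function name `pick_genome` is hypothetical.

```python
# Total lengths ("whole") of the five input assemblies in this example.
TOTAL_LENGTHS = {
    "Abyss": 4626205,
    "Edena": 4569446,
    "SOAP": 4547211,
    "CLC": 4546827,
    "Velvet": 4550675,
}


def pick_genome(total_lengths):
    """Return the name and total length of the largest assembly."""
    name = max(total_lengths, key=total_lengths.get)
    return name, total_lengths[name]


if __name__ == "__main__":
    name, genome = pick_genome(TOTAL_LENGTHS)
    print("genome=%d (from %s)" % (genome, name))
```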
    + <ul><li><strong>Command:</strong><br /></li></ul><blockquote><p>&gt;python CISA.py CISA.config</p></blockquote>  
       
    + Input file: [[Media:Set1 Contigs.fa|Contigs.fa]]  
       
    + Output file: [[Media:CISA_Set1_Contigs.fa|CISA.fa]]
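The two commands of this example can be chained from a small driver script. This is a sketch, not part of CISA. It assumes the driver itself runs under Python 3.5 or later, that `python` on your PATH is the Python 2.X interpreter CISA requires, and that both scripts and both configuration files are in the current directory. The function names are hypothetical.

```python
import subprocess


def build_commands(merge_config="Merge.config", cisa_config="CISA.config",
                   python="python"):
    """Return the two command lines used in this example, in order."""
    return [
        [python, "Merge.py", merge_config],
        [python, "CISA.py", cisa_config],
    ]


def run_pipeline(commands):
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(build_commands())
```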