= Prerequisites =
* Linux 64-bit environment [http://www.centos.org CentOS]
* Python 2.X [http://www.python.org Python]<br> - Under CentOS: yum install python
* MUMmer 3.22 or higher [http://mummer.sourceforge.net MUMmer]
* Blast 2.2.25+ or higher [http://blast.ncbi.nlm.nih.gov Blast+]

= Installation =
Just unpack the tar file.<br>
Available commands:
 tar xvf CISA.tar
Execute access:
 chmod 755 -R CISA1.0

= Commands and configuration =
== Merge.py ==
Merge.py converts the data format to fit CISA. '''This is an essential preliminary step'''.

Available commands:
 ./Merge.py Merge_config
The content of the configuration file:
 count=3
 data=assembly1.fasta,title=Contig_m1
 data=assembly2.fasta,title=Contig_m2
 data=assembly3.fasta,title=Contig_m3
 Master_file=Contigs_m.fa
 min_length=100 (default:100)
 Gap=11
#count<br>The number of datasets you would like to merge.
#min_length<br>Contigs longer than this value (default: 100) are kept.
#Gap<br>An optional variable.<br>If Gap is set (11 here), scaffolds are split at every run of 11 consecutive N.<br>If Gap is absent, the program only merges the data.
If your data comes from a Windows system, convert it to Linux format first:
 dos2unix assembly1.fasta assembly2.fasta assembly3.fasta

== CISA.py ==
Available commands:
 ./CISA.py CISA_config
The content of the configuration file:
 genome=genome size
 infile=file
 outfile=file
 nucmer=your installed path/nucmer
 R2_Gap=0.95 (default:0.95)
 CISA=your installed path/CISA1.0
 makeblastdb=your installed path/makeblastdb
 blastn=your installed path/blastn
#genome<br/>We suggest using the largest assembly length among the input contig sets as the genome value.<br/>The break point of CISA will be set to 1.1 * genome.
#infile<br/>Input file name.
#nucmer<br/>Executable file for nucmer. If nucmer is already in the PATH, the nucmer variable can be omitted.
#makeblastdb<br/>Executable file for makeblastdb. If makeblastdb is already in the PATH, the makeblastdb variable can be omitted.
#blastn<br/>Executable file for blastn. If blastn is already in the PATH, the blastn variable can be omitted.
#CISA<br/>Home directory of CISA.
#R2_Gap<br/>Gap tolerance during the CISA2 step.

= Example =
== Data Set ==
[[Ecoli|Ecoli]]

== Merge Contigs ==
The content of the configuration file:
<table width="457" border="1">
<tr>
<td height="152" align="left" bgcolor="#66CCFF">
 count=5
 data=Abyss_contigs.fa,title=Abyss
 data=Edena_contigs.fa,title=Edena
 data=SOAPdenovo_contigs.fa,title=SOAP
 data=CLC_contigs.fa,title=CLC
 data=Velvet_contigs.fa,title=Velvet
 Master_file=Contigs.fa
</td>
</tr>
</table>
[[Media:Abyss contigs.fa|Abyss_contigs.fa]], [[Media:Edena contigs.fa|Edena_contigs.fa]], [[Media:SOAPdenovo contigs.fa|SOAPdenovo_contigs.fa]], [[Media:CLC contigs.fa|CLC_contigs.fa]], [[Media:Velvetcontigs.fa|Velvet_contigs.fa]]
*Command:<br>
 ./Merge.py Merge_config

== Start to integrate ==
The content of CISA_config:
<table width="457" border="1">
<tr>
<td height="152" align="left" bgcolor="#66CCFF">
 genome=4626205
 infile=Contigs.fa
 nucmer=path/nucmer
 R2_Gap=0.95
 CISA=path/CISA1.0
 makeblastdb=path/makeblastdb
 blastn=path/blastn
</td>
</tr>
</table>
*The genome variable is set to 4626205, the largest whole-genome length among the results of the 5 assemblers. You can find this value in Merge_info.
*Command:<br>
 ./CISA.py CISA_config
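
The choice of the genome value can be illustrated with a small Python sketch. It is not part of CISA, and Merge_info remains the place to read the real value. The sketch assumes that the "whole genome" length of an assembly is the sum of its contig lengths. The helper names (total_length, suggest_genome, break_point) and the demo sequences are hypothetical.

```python
import io


def total_length(fasta_lines):
    """Sum the sequence lengths in an iterable of FASTA lines.

    Header lines (starting with '>') and blank lines are skipped.
    """
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def suggest_genome(assemblies):
    """Return (label, length) of the assembly with the largest total length.

    `assemblies` maps a label to an iterable of FASTA lines,
    for example an open file handle.
    """
    lengths = {}
    for label, lines in assemblies.items():
        lengths[label] = total_length(lines)
    best = max(lengths, key=lengths.get)
    return best, lengths[best]


def break_point(genome):
    """Break point as stated in this guide: 1.1 * genome."""
    return 1.1 * genome


if __name__ == "__main__":
    # Made-up miniature inputs for illustration only. In real use,
    # each value would be an opened contig file from one assembler.
    demo = {
        "A": io.StringIO(u">c1\nACGT\nAC\n>c2\nGG\n"),
        "B": io.StringIO(u">c1\nACGTACGTAC\n"),
    }
    label, genome = suggest_genome(demo)
    print("genome=%d (from %s)" % (genome, label))
    print("break point = %.1f" % break_point(genome))
```

With the value from the example above, break_point(4626205) gives 5088825.5, which is the 1.1 * genome threshold described for the genome variable.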