Version Differences for PBcR

Line 1:
    + ==Performance==  
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''8 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''8 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''8 SMRT cells : 3rd Set'''  
    + |-  
    + |# contigs||[[Media:pbcr_d5_all.fa | 1]]||[[Media:pbcr_d5_rg4_1.fa | 1]]||[[Media:pbcr_d5_rg4_2.fa | 1]]||[[Media:pbcr_d5_rg4_3.fa | 5]]||[[Media:pbcr_d5_rg6_1.fa | 2]]||[[Media:pbcr_d5_rg6_2.fa | 2]]||[[Media:pbcr_d5_rg6_3.fa | 1]]||[[Media:pbcr_d5_rg8_1.fa | 4]]||[[Media:pbcr_d5_rg8_2.fa | 1]]||[[Media:pbcr_d5_rg8_3.fa | 2]]  
    + |-  
    + |Largest contig||4 651 604||4 647 117||4 648 057||3 447 068||3 749 516||2 770 859||4 649 699||1 679 082||4 649 323||4 189 785  
    + |-  
    + |Total length||4 651 604||4 647 117||4 648 057||4 661 453||4 645 941||4 657 272||4 649 699||4 655 949||4 649 323||4 652 482  
    + |-  
    + |N50||4 651 604||4 647 117||4 648 057||3 447 068||3 749 516||2 770 859||4 649 699||1 159 845||4 649 323||4 189 785  
    + |-  
    + | style="background:#f0f0f0;"| [[Media: SCA_D5.pdf | '''Misassemblies''']]||||||||||||||||||||  
    + |-  
    + |# misassemblies||9||9||8||9||7||9||10||7||10||8  
    + |-  
    + |Misassembled contigs length ||4 651 604||4 647 117||4 648 057||3 447 068||4 645 941||4 657 272||4 649 699||2 143 406||4 649 323||4 189 785  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||||||||  
    + |-  
    + |# mismatches per 100kbp||0.34||1.03||0.69||0.78||0.69||0.56||0.58||0.75||0.84||0.75  
    + |-  
    + |# indels per 100kbp||0.6||7.44||5.78||5.67||1.88||2.72||1.66||1.6||1.9||2.65  
    + |-  
    + |# N's per 100kbp ||0||0.02||0||0||0||0||0||0||0.02||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||||||||  
    + |-  
    + |Genome fraction(%) ||100||100||100||99.949||99.956||100||100||99.959||100||99.993  
    + |-  
    + |Duplication ratio ||1.003||1.002||1.002||1.005||1.002||1.005||1.002||1.004||1.002||1.003  
    + |-  
    + |# genes ||4494+3 part||4494+3 part||4494+3 part||4490+5 part||4491+2 part||4495+2 part||4494+3 part||4489+6 part||4494+3 part||4493+4 part  
    + |-  
    + |NGA50 ||2 834 925||949 217||949 242||656 513||2 796 469||1 040 965||3 026 388||949 298||3 027 267||949 289  
    + |-  
    + | style="background:#f0f0f0;"| '''Running Time'''||||||||||||||||||||  
    + |-  
    + |pacBioToCA||48hr 16m||4hr 58m||5hr 48m||5hr 10m||11hr 09m||9hr 34m||10hr 47m||21hr 06m||22hr 05m||21hr 23m  
    + |-  
    + |runCA||15hr 48m||15hr 22m||13hr 50m||11hr 20m||12hr 38m||11hr 44m||13hr 48m||11hr 37m||14hr 36m||13hr 40m  
    + |-  
    + |Total||64hr 04m||20hr 20m||19hr 38m||16hr 30m||23hr 47m||21hr 18m||24hr 35m||32hr 43m||36hr 41m||35hr 03m  
    + |}  
       
       
       
       
    + =Dataset 6 (''E. coli'' K-12 MG1655, 8 SMRT cells)=  
    + We assembled all eight SMRT cells, as well as three randomly selected subsets each of four and of six SMRT cells, and evaluated the assemblies with QUAST against the reference genome ([ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Escherichia_coli_K_12_substr__MG1655_uid57779/ NC_000913]) and [[Media: Ec_gene_result.ncbi | Ec_gene_list]]. ([http://sb.nhri.org.tw/comps/quast/Non-hybrid/PBcR_pipeline/D6/report.html more detail])  
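
The random subsets described above could be drawn as in the following minimal sketch. The cell labels, the function name <code>draw_subsets</code> and the seeds are hypothetical and only serve as an illustration; the page does not state how the original selection was made, nor whether the three sets were required to differ from one another.

```python
import random


def draw_subsets(cells, size, repeats, seed=None):
    """Return `repeats` independent random subsets of `size` cells each."""
    rng = random.Random(seed)  # seeded generator so a draw can be repeated
    # sample() picks `size` distinct cells; separate draws may overlap.
    return [sorted(rng.sample(cells, size)) for _ in range(repeats)]


# Hypothetical labels for the eight SMRT cells of this dataset.
cells = [f"cell_{i}" for i in range(1, 9)]

for size in (4, 6):
    subsets = draw_subsets(cells, size, 3, seed=size)
    for set_no, subset in enumerate(subsets, start=1):
        print(f"{size} SMRT cells, set {set_no}: {', '.join(subset)}")
```

Fixing the seed makes a given draw reproducible, which helps when the same subsets need to be assembled again with another pipeline.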
       
    + ==Performance==  
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''  
    + |-  
    + |# contigs||[[Media:pbcr_d6_all.fa | 2]]||[[Media:pbcr_d6_rg4_1.fa | 8]]||[[Media:pbcr_d6_rg4_2.fa | 10]]||[[Media:pbcr_d6_rg4_3.fa | 14]]||[[Media:pbcr_d6_rg6_1.fa | 1]]||[[Media:pbcr_d6_rg6_2.fa | 1]]||[[Media:pbcr_d6_rg6_3.fa | 4]]  
    + |-  
    + |Largest contig||4 278 957||2 277 010||1 213 670||984 459||4 641 350||4 640 250||3 162 440  
    + |-  
    + |Total length||4 650 771||4 648 304||4 644 602||4 656 274||4 641 350||4 640 250||4 653 394  
    + |-  
    + |N50||4 278 957||622 425||800 993||565 251||4 641 350||4 640 250||3 162 440  
    + |-  
    + | style="background:#f0f0f0;"| [[Media:SCA_d6.pdf | '''Misassemblies''']]||||||||||||||  
    + |-  
    + |# misassemblies||9||9||9||8||8||8||9  
    + |-  
    + |Misassembled contigs length ||4 278 957||2 809 129||2 085 482||1 947 163||4 641 350||4 640 250||3 209 090  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||  
    + |-  
    + |# mismatches per 100kbp||0.37||2.49||1.88||5.38||0.69||0.67||0.86  
    + |-  
    + |# indels per 100kbp||3.58||53.34||45.82||73.07||10.65||11.28||10.46  
    + |-  
    + |# N's per 100kbp ||0||0.04||0.02||0.09||0||0||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||  
    + |-  
    + |Genome fraction(%) ||99.993||99.733||99.67||99.693||99.972||99.946||99.968  
    + |-  
    + |Duplication ratio ||1.002||1.005||1.006||1.007||1.001||1.001||1.003  
    + |-  
    + |# genes ||4492+5 part||4475+10 part||4467+12 part||4469+13 part||4492+4 part||4491+4 part||4492+4 part  
    + |-  
    + |NGA50 ||859 464||621 281||572 455||436 292||1 098 529||1 096 784||859 502  
    + |-  
    + | style="background:#f0f0f0;"| '''Running Time'''||||||||||||||  
    + |-  
    + |pacBioToCA||20hr 03m||5hr 52m||6hr 05m||5hr 19m||15hr 53m||14hr 47m||15hr 38m  
    + |-  
    + |runCA||15hr 41m||7hr 32m||7hr 10m||5hr 42m||15hr 44m||16hr 02m||13hr 27m  
    + |-  
    + |Total||35hr 44m||13hr 24m||13hr 15m||11hr 01m||31hr 37m||30hr 49m||29hr 05m  
    + |}  
    + [[Media: sca_d6_summary.pdf | '''Misassemblies''']] summary (PDF for Adobe Reader).  
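
In the Running Time rows of the table above, each Total is the sum of the pacBioToCA and runCA entries in the same column. The sketch below shows one way to check that arithmetic; the helper names <code>to_minutes</code>, <code>fmt</code> and <code>total</code> are hypothetical and not part of any pipeline.

```python
import re


def to_minutes(text):
    """Convert a duration such as '20hr 03m' into minutes."""
    match = re.fullmatch(r"\s*(\d+)hr\s*(\d+)m\s*", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def fmt(minutes):
    """Format minutes in the 'Hhr MMm' style used by the tables."""
    return f"{minutes // 60}hr {minutes % 60:02d}m"


def total(*durations):
    """Add several durations and return the formatted sum."""
    return fmt(sum(to_minutes(d) for d in durations))


# 'All Data' column of the table above: pacBioToCA + runCA.
print(total("20hr 03m", "15hr 41m"))
```

For the All Data column this yields 35hr 44m, matching the Total row.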
    + =Dataset 7 (''M. ruber'' DSM1279, 4 SMRT cells)=  
    + We assembled all four SMRT cells and evaluated the assembly with QUAST against the reference genome ([ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Meiothermus_ruber_DSM_1279_uid46661/ NC_013946]) and [[Media:gene_mruber.ncbi | Mr_gene_list]]. ([http://sb.nhri.org.tw/comps/quast/Non-hybrid/PBcR_pipeline/D7/report.html more detail])  
    + ==Performance==  
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
       
    + |-  
    + |# contigs||[[Media:pbcr_d7_all.fa | 2]]  
    + |-  
    + |Largest contig||2 974 307  
    + |-  
    + |Total length||3 100 289  
    + |-  
    + |N50||2 974 307  
    + |-  
    + | style="background:#f0f0f0;"| [[Media:sca_d7.pdf | '''Misassemblies''']]||  
    + |-  
    + |# misassemblies||3  
    + |-  
    + |Misassembled contigs length ||2 974 307  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||  
    + |-  
    + |# mismatches per 100kbp||0.23  
    + |-  
    + |# indels per 100kbp||5.01  
    + |-  
    + |# N's per 100kbp ||0.03  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||  
    + |-  
    + |Genome fraction(%) ||99.883  
    + |-  
    + |Duplication ratio ||1.002  
    + |-  
    + |# genes ||3093+4 part  
    + |-  
    + |NGA50 ||1 707 938  
    + |-  
    + | style="background:#f0f0f0;"| '''Running Time'''||  
    + |-  
    + |pacBioToCA||7hr 35m  
    + |-  
    + |runCA||8hr 07m  
    + |-  
    + |Total||15hr 42m  
    + |}  
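
As a worked illustration of the N50 row above, the following sketch applies the usual definition of N50: the length of the contig at which the contigs, taken from longest to shortest, first cover at least half of the total assembly length. The function name <code>n50</code> is hypothetical, and the length of the second contig is not listed in the table; it is inferred here as total length minus largest contig (3 100 289 − 2 974 307 = 125 982).

```python
def n50(lengths):
    """Length of the contig at which the running sum of contigs,
    sorted longest first, reaches at least half of the total length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


# Two contigs of the 'All Data' assembly; the second length is inferred.
contigs = [2_974_307, 3_100_289 - 2_974_307]
print(n50(contigs))
```

Because the largest contig alone covers more than half of the assembly, N50 equals the largest contig length, as reported in the table.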
       
    + =Dataset 8 (''P. heparinus'' DSM2366, 7 SMRT cells)=  
    + We assembled all seven SMRT cells, as well as three randomly selected subsets of four SMRT cells, and evaluated the assemblies with QUAST against the reference genome ([ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Pedobacter_heparinus_DSM_2366_uid59111/ NC_013061]) and [[Media:gene_phep.ncbi | Ph_gene_list]]. ([http://sb.nhri.org.tw/comps/quast/Non-hybrid/PBcR_pipeline/D8/report.html more detail])  
    + ==Performance==  
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''  
    + |-  
    + |# contigs||[[Media: pbcr_d8_all.fa | 1]]||[[Media: pbcr_d8_rg4_1.fa | 3]]||[[Media: pbcr_d8_rg4_2.fa | 3]]||[[Media: pbcr_d8_rg4_3.fa | 3]]  
    + |-  
    + |Largest contig||5 163 983||2 232 679||2 236 613||2 237 949  
    + |-  
    + |Total length||5 163 983||5 161 276||5 165 518||5 166 563  
    + |-  
    + |N50||5 163 983||2 043 590||2 044 147||2 135 225  
    + |-  
    + | style="background:#f0f0f0;"| [[Media:sca_d8.pdf | '''Misassemblies''']]||||||||  
    + |-  
    + |# misassemblies||0||0||0||0  
    + |-  
    + |Misassembled contigs length ||0||0||0||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||||||||  
    + |-  
    + |# mismatches per 100kbp||8.41||9.96||8.27||10.29  
    + |-  
    + |# indels per 100kbp||2.19||18.99||13.13||14.01  
    + |-  
    + |# N's per 100kbp ||0||0||0||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||  
    + |-  
    + |Genome fraction(%) ||99.919||99.864||99.907||99.89  
    + |-  
    + |Duplication ratio ||1||1||1.001||1.001  
    + |-  
    + |# genes ||4335+3 part||4330+5 part||4333+5 part||4333+3 part  
    + |-  
    + |NGA50 ||5 163 983 532||2 043 590||2 044 147||2 135 225  
    + |-  
    + | style="background:#f0f0f0;"| '''Running Time'''||||||||  
    + |-  
    + |pacBioToCA||18hr 55m||6hr 27m||6hr 34m||6hr 31m  
    + |-  
    + |runCA||21hr 36m||11hr 39m||12hr 26m||12hr 12m  
    + |-  
    + |Total||40hr 31m||18hr 06m||19hr 00m||18hr 43m  
    + |}  
    + [[Media: sca_d8_summary.pdf | '''Misassemblies''']] summary (PDF for Adobe Reader).  
    + =Dataset 9 (''E. coli'' K-12, P4-C2 chemistry, 20 Kbp, 1 SMRT cell)=  
    + We assembled the single SMRT cell and evaluated the assembly with QUAST against the reference genome ([ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Escherichia_coli_K_12_substr__MG1655_uid57779/ NC_000913]) and [[Media: Ec_gene_result.ncbi | Ec_gene_list]]. ([http://sb.nhri.org.tw/comps/quast/Non-hybrid/PBcR_pipeline/D9/report.html more detail])  
       
    + ==Performance==  
       
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
       
    + |-  
    + |# contigs||[[Media:pbcr_d9_all.fa |1]]  
    + |-  
    + |Largest contig||4 656 257  
    + |-  
    + |Total length||4 656 257  
    + |-  
    + |N50||4 656 257  
    + |-  
    + | style="background:#f0f0f0;"| [[Media: SCA_D9.pdf | '''Misassemblies''']]||  
    + |-  
    + |# misassemblies||8  
    + |-  
    + |Misassembled contigs length ||4 656 257  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||  
    + |-  
    + |# mismatches per 100kbp||0.22  
    + |-  
    + |# indels per 100kbp||13.15  
    + |-  
    + |# N's per 100kbp ||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||  
    + |-  
    + |Genome fraction(%) ||100  
    + |-  
    + |Duplication ratio ||1.004  
    + |-  
    + |# genes ||4494+3 part  
    + |-  
    + |NGA50 ||3 026 094  
    + |-  
    + | style="background:#f0f0f0;"| '''Running Time'''||  
    + |-  
    + |pacBioToCA||13hr 01m  
    + |-  
    + |runCA||17hr 58m  
    + |-  
    + |Total||30hr 59m  
    + |}  
       
       
       
    + We also assembled this dataset with the latest version of the PBcR pipeline ([http://sourceforge.net/projects/wgs-assembler/files/wgs-assembler/wgs-8.2beta/ 8.2beta]), using the command below. ([http://sb.nhri.org.tw/comps/quast/Non-hybrid/PBcR_pipeline/D9_PBcR8.2/report.html more detail])  
       
    + PBcR -pbCNS -length 500 -partitions 200 -l p4c2 -s pacbio.spec -fastq filtered_subreads.fastq genomeSize=4650000  
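
When the same command has to be repeated for several datasets, it can be convenient to assemble the argument list in a script. The following is a minimal sketch under that assumption: the function name <code>build_pbcr_command</code> and its parameter names are hypothetical, and only the options that appear in the command above are used. The sketch prints the command rather than running it.

```python
import shlex


def build_pbcr_command(fastq, spec, genome_size,
                       library="p4c2", length=500, partitions=200):
    """Return the PBcR invocation shown above as an argument list."""
    return [
        "PBcR", "-pbCNS",
        "-length", str(length),
        "-partitions", str(partitions),
        "-l", library,
        "-s", spec,
        "-fastq", fastq,
        f"genomeSize={genome_size}",
    ]


argv = build_pbcr_command("filtered_subreads.fastq", "pacbio.spec", 4650000)
print(shlex.join(argv))  # shell-quoted form, for logging or inspection
```

To actually launch the run, the list can be passed to <code>subprocess.run(argv, check=True)</code>, provided PBcR is on the PATH.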
       
    + ==wgs-8.2beta Performance==  
       
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
       
    + |-  
    + |# contigs||[[Media:pbcr_pipeline_d9.fa |2]]  
    + |-  
    + |Largest contig||4 644 060  
    + |-  
    + |Total length||4 652 830  
    + |-  
    + |N50||4 644 060  
    + |-  
    + | style="background:#f0f0f0;"| [[Media: pbcr_pipeline_d9.pdf | '''Misassemblies''']]||  
    + |-  
    + |# misassemblies||8  
    + |-  
    + |Misassembled contigs length ||4 644 060  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||  
    + |-  
    + |# mismatches per 100kbp||0.17  
    + |-  
    + |# indels per 100kbp||31.7  
    + |-  
    + |# N's per 100kbp ||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||  
    + |-  
    + |Genome fraction(%) ||100  
    + |-  
    + |Duplication ratio ||1.003  
    + |-  
    + |# genes ||4494+3 part  
    + |-  
    + |NGA50 ||3 025 484  
    + |-  
    + | style="background:#f0f0f0;"| '''Running Time'''||  
    + |-  
    + |Running time|| 23m  
    + |}