Version Differences for HGAP

(Discard Lower-case bases)
Line 351:
  |-    |- 
  |# mismatches per 100kbp||0.02||0.06||9.98||6.42    |# mismatches per 100kbp||0.02||0.06||9.98||6.42 
       
    + =DataSet1=  
    + We assembled the data from all SMRT cells, and also from randomly selected subsets of four and of six SMRT cells (three random draws for each subset size), and assessed the correctness of each assembly with QUAST.  
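The subsampling described above can be sketched as follows. This is only an illustration: the cell names, the total number of cells (eight here) and the fixed seed are assumptions, since this page does not state how many SMRT cells the data set contains or how the random draws were made.

```python
import random

def draw_subsets(cells, sizes=(4, 6), repeats=3, seed=0):
    """Return {size: [subset, ...]} with `repeats` random draws per size.

    Each subset is drawn without replacement from `cells`.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return {
        size: [sorted(rng.sample(cells, size)) for _ in range(repeats)]
        for size in sizes
    }

# Hypothetical cell names; the real cell count is not given on this page.
cells = [f"cell_{i:02d}" for i in range(1, 9)]
subsets = draw_subsets(cells)

for size, draws in subsets.items():
    for n, subset in enumerate(draws, start=1):
        print(f"{size} SMRT cells, set {n}: {', '.join(subset)}")
```

Each printed line corresponds to one column of the tables below (for example "4 SMRT cells : 1st Set").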
       
    + ==Performance==  
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''  
    + |-  
    + |# contigs||16||10||14||16||9||18||13  
    + |-  
    + |Largest contig||2 198 457||3 484 877||1 936 831||1 948 632||2 104 087||1 169 224||1 439 551  
    + |-  
    + |Total length||4 808 733||4 706 800||4 705 398||4 745 036||4 741 512||4 814 718||4 749 785  
    + |-  
    + |N50||1 005 770 ||3 484 877||966 809||1 434 284||1 655 500||676 526||1 268 010  
    + |-  
    + | style="background:#f0f0f0;"| '''Misassemblies'''||||||||||||||  
    + |-  
    + |# misassemblies||19||9||12||15||14||17||11  
    + |-  
    + |Misassembled contigs length ||2 939 040||3 530 352||2 949 761||3 653 461||3 820 624||2 387 129||3 986 402  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||  
    + |-  
    + |# mismatches per 100kbp||0.8||0.43||0.58||1.36||0.15||0.95||0.58  
    + |-  
    + |# indels per 100kbp||5.71||2.98||4.45||9.56||1.77||8.02||6.88  
    + |-  
    + |# N's per 100kbp ||0||0||0||0||0||0||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||  
    + |-  
    + |Genome fraction (%) ||100||100||99.815||99.87||100||99.995||99.979  
    + |-  
    + |Duplication ratio ||1.037||1.016||1.017||1.025||1.022||1.038||1.025  
    + |-  
    + |# genes ||4494+3 part||4494+3 part||4480+7 part||4485+9 part||4494+3 part||4493+4 part||4492+5 part  
    + |-  
    + |NGA50 ||615 234||1 205 052||572 342||875 953||844 482||633 220||1 267 242  
    + |-  
    + |'''Running Time'''||19hr 06m||13hr 34m||13hr 21m||12hr 38m||21hr 28m||22hr 56m||22hr 07m  
    + |-  
    + |}  
       
       
       
    + ==Discard Unconvincing Contigs==  
    + We aligned subreads to contigs, and discarded the contigs with fewer than 100 reads aligned.  
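A minimal sketch of this filter, assuming the alignments are already available as (read name, contig name) pairs. The function names and the input format are hypothetical, and counting each read at most once per contig is an assumption; the page only gives the threshold of 100 reads.

```python
MIN_READS = 100  # threshold stated in the text above

def count_reads_per_contig(alignments):
    """Count distinct aligned reads per contig.

    alignments: iterable of (read_name, contig_name) pairs.
    """
    reads_by_contig = {}
    for read, contig in alignments:
        reads_by_contig.setdefault(contig, set()).add(read)
    return {contig: len(reads) for contig, reads in reads_by_contig.items()}

def keep_supported(contigs, alignments, min_reads=MIN_READS):
    """Return the contigs that have at least `min_reads` reads aligned."""
    counts = count_reads_per_contig(alignments)
    return [c for c in contigs if counts.get(c, 0) >= min_reads]

# Toy input: ctgA has 100 reads, ctgB has 99, ctgC has none.
alignments = [(f"a{i}", "ctgA") for i in range(100)]
alignments += [(f"b{i}", "ctgB") for i in range(99)]
kept = keep_supported(["ctgA", "ctgB", "ctgC"], alignments)
```

In the toy input only ctgA reaches the threshold, so it is the only contig kept.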
    + ===Performance===  
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''  
    + |-  
    + |# contigs||7||8||10||12||4||9||12  
    + |-  
    + |Largest contig||2 198 457||3 484 877||1 936 831||1 948 632||2 104 087||1 169 224||1 439 551  
    + |-  
    + |Total length||4 706 061||4 674 582||4 659 277||4 682 754||4 680 475||4 702 993||4 739 366  
    + |-  
    + |N50||1 005 770 ||3 484 877||966 809||1 434 284||1 655 500||676 526||1 268 010  
    + |-  
    + | style="background:#f0f0f0;"| '''Misassemblies'''||||||||||||||  
    + |-  
    + |# misassemblies||10||7||8||9||9||8||10  
    + |-  
    + |Misassembled contigs length ||2 836 368||3 498 134||2 903 640||3 591 179||3 759 587||2 275 404||3 975 983  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||  
    + |-  
    + |# mismatches per 100kbp||0.8||0.43||0.45||1.27||0.15||0.75||0.58  
    + |-  
    + |# indels per 100kbp||5.71||2.98||3.56||8.72||1.77||6.06||6.88  
    + |-  
    + |# N's per 100kbp ||0||0||0||0||0||0||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||  
    + |-  
    + |Genome fraction (%) ||100||100||99.798||99.87||100||99.995||99.979  
    + |-  
    + |Duplication ratio ||1.014||1.009||1.006||1.012||1.009||1.014||1.023  
    + |-  
    + |# genes ||4494+3 part||4494+3 part||4479+8 part||4485+9 part||4494+3 part||4493+4 part||4492+5 part  
    + |-  
    + |NGA50 ||615 234||1 205 052||572 342||875 953||844 482||633 220||1 267 242  
    + |-  
    + |}  
       
       
       
       
       
    + ==Discard Lower-case bases ==  
    + After discarding unconvincing contigs, we trimmed the low-quality bases, which are written in lower case, from both ends of each contig.  
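The trimming step can be sketched as below. The function name is hypothetical, and the sketch assumes that only the lower-case runs at the two ends are removed while lower-case bases inside the contig are left untouched, which is how the sentence above reads.

```python
def trim_lowercase_ends(seq):
    """Remove the leading and trailing runs of lower-case bases from seq."""
    start = 0
    end = len(seq)
    # Skip lower-case bases at the left end.
    while start < end and seq[start].islower():
        start += 1
    # Skip lower-case bases at the right end.
    while end > start and seq[end - 1].islower():
        end -= 1
    return seq[start:end]

trimmed = trim_lowercase_ends("acgTTGcaACGtt")
```

Here `trimmed` is `TTGcaACG`: the internal `ca` is kept because it is not at either end. A contig made up entirely of lower-case bases is trimmed to an empty string.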
       
    + ===Performance===  
    + {| {{table}} border="1"  
    + | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''  
    + | align="center" style="background:#f0f0f0;"|'''All Data'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''  
    + | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''  
    + |-  
    + |# contigs||7||8||10||12||4||9||12  
    + |-  
    + |Largest contig||2 196 495||3 478 799||1 936 007||1 948 495||2 100 388||1 165 497||1 438 506  
    + |-  
    + |Total length||4 694 972||4 662 655||4 649 216||4 657 587||4 668 899||4 681 301||4 714 790  
    + |-  
    + |N50||1 005 009||3 478 799||964 998||1 433 016||1 654 501||375 502||1 266 511  
    + |-  
    + | style="background:#f0f0f0;"| '''Misassemblies'''||||||||||||||  
    + |-  
    + |# misassemblies||9||9||7||8||9||7||8  
    + |-  
    + |Misassembled contigs length ||2 210 994||3 490 490||2 901 005||3 496 520||3 754 889||2 256 498||3 197 010  
    + |-  
    + | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||  
    + |-  
    + |# mismatches per 100kbp||0.63||0.28||0.22||0.91||0.15||0.54||0.47  
    + |-  
    + |# indels per 100kbp||5.02||2.55||1.84||7.08||1.68||4.91||6.12  
    + |-  
    + |# N's per 100kbp ||0||0||0||0||0||0||0  
    + |-  
    + | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||  
    + |-  
    + |Genome fraction (%) ||100||99.842||99.776||99.889||100||99.985||99.979  
    + |-  
    + |Duplication ratio ||1.012||1.008||1.005||1.006||1.006||1.009||1.018  
    + |-  
    + |# genes ||4494+3 part||4485+6 part||4478+9 part||4482+11 part||4494+3 part||4493+4 part||4492+5 part  
    + |-  
    + |NGA50 ||614 657||1 088 544||572 342||875 453||843 983||632 720||1 265 743  
    + |-  
    + |}  
  |-    |- 
  |# indels per 100kbp||0.77||0.56||0.85||1.45    |# indels per 100kbp||0.77||0.56||0.85||1.45