Version Differences for HGAP

(Discard Lower-case bases)
Line 351:
  |-    |- 
  |# mismatches per 100kbp||0.02||0.06||9.98||6.42    |# mismatches per 100kbp||0.02||0.06||9.98||6.42 
       
- =DataSet1=      
- We assembled all SMRT cells, and also three random subsets each of four and six SMRT cells, and assessed the correctness of every assembly with QUAST.
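The subsampling described above can be sketched as follows. This is only an illustration: the total number of SMRT cells, the cell names, and the seed are assumptions not given in the text, and `draw_subsets` is a hypothetical helper.

```python
import random

def draw_subsets(cells, size, repeats=3, seed=0):
    """Draw `repeats` random subsets of `size` cells, without replacement within a subset."""
    rng = random.Random(seed)  # fixed seed (assumed) so the draws are reproducible
    return [sorted(rng.sample(cells, size)) for _ in range(repeats)]

# Hypothetical cell names; the real number of SMRT cells is not stated in the text.
cells = [f"cell_{i}" for i in range(1, 9)]

for size in (4, 6):
    for set_no, subset in enumerate(draw_subsets(cells, size), start=1):
        print(f"{size} SMRT cells, set {set_no}: {subset}")
```

Each subset would then be assembled separately and evaluated against the reference.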
       
- ==Performance==      
- {| {{table}} border="1"      
- | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''      
- | align="center" style="background:#f0f0f0;"|'''All Data'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''      
- |-      
- |# contigs||16||10||14||16||9||18||13      
- |-      
- |Largest contig||2 198 457||3 484 877||1 936 831||1 948 632||2 104 087||1 169 224||1 439 551
- |-      
- |Total length||4 808 733||4 706 800||4 705 398||4 745 036||4 741 512||4 814 718||4 749 785      
- |-      
- |N50||1 005 770 ||3 484 877||966 809||1 434 284||1 655 500||676 526||1 268 010      
- |-      
- | style="background:#f0f0f0;"| '''Misassemblies'''||||||||||||||      
- |-      
- |# misassemblies||19||9||12||15||14||17||11      
- |-      
- |Misassembled contigs length ||2 939 040||3 530 352||2 949 761||3 653 461||3 820 624||2 387 129||3 986 402      
- |-      
- | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||      
- |-      
- |# mismatches per 100kbp||0.8||0.43||0.58||1.36||0.15||0.95||0.58      
- |-      
- |# indels per 100kbp||5.71||2.98||4.45||9.56||1.77||8.02||6.88      
- |-      
- |# N's per 100kbp ||0||0||0||0||0||0||0      
- |-      
- | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||      
- |-      
- |Genome fraction(%) ||100||100||99.815||99.87||100||99.995||99.979      
- |-      
- |Duplication ratio ||1.037||1.016||1.017||1.025||1.022||1.038||1.025      
- |-      
- |# genes ||4494+3 part||4494+3 part||4480+7 part||4485+9 part||4494+3 part||4493+4 part||4492+5 part      
- |-      
- |NGA50 ||615 234||1 205 052||572 342||875 953||844 482||633 220||1 267 242      
- |-      
- |'''Running Time'''||19hr 06m||13hr 34m||13hr 21m||12hr 38m||21hr 28m||22hr 56m||22hr 07m      
- |-      
- |}      
       
       
       
- ==Discard Unconvincing Contigs==      
- We aligned subreads to contigs, and discarded the contigs with fewer than 100 reads aligned.      
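A minimal sketch of this filter, assuming the alignments are already available as (read ID, contig ID) pairs. The input format and the function names are illustrative assumptions; only the threshold of 100 reads comes from the text.

```python
MIN_READS = 100  # contigs with fewer aligned reads are discarded

def count_aligned_reads(alignments):
    """Count distinct reads aligned to each contig.

    `alignments` is an iterable of (read_id, contig_id) pairs (assumed format).
    """
    reads_per_contig = {}
    for read_id, contig_id in alignments:
        reads_per_contig.setdefault(contig_id, set()).add(read_id)
    return {contig: len(reads) for contig, reads in reads_per_contig.items()}

def keep_convincing(contig_ids, alignments, min_reads=MIN_READS):
    """Return the contigs that have at least `min_reads` aligned reads."""
    counts = count_aligned_reads(alignments)
    return [c for c in contig_ids if counts.get(c, 0) >= min_reads]
```

Counting distinct read IDs rather than alignment records is a design choice of this sketch, so that a read aligned to the same contig several times is only counted once.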
- ===Performance===      
- {| {{table}} border="1"      
- | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''      
- | align="center" style="background:#f0f0f0;"|'''All Data'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''      
- |-      
- |# contigs||7||8||10||12||4||9||12      
- |-      
- |Largest contig||2 198 457||3 484 877||1 936 831||1 948 632||2 104 087||1 169 224||1 439 551
- |-      
- |Total length||4 706 061||4 674 582||4 659 277||4 682 754||4 680 475||4 702 993||4 739 366      
- |-      
- |N50||1 005 770 ||3 484 877||966 809||1 434 284||1 655 500||676 526||1 268 010      
- |-      
- | style="background:#f0f0f0;"| '''Misassemblies'''||||||||||||||      
- |-      
- |# misassemblies||10||7||8||9||9||8||10      
- |-      
- |Misassembled contigs length ||2 836 368||3 498 134||2 903 640||3 591 179||3 759 587||2 275 404||3 975 983      
- |-      
- | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||      
- |-      
- |# mismatches per 100kbp||0.8||0.43||0.45||1.27||0.15||0.75||0.58      
- |-      
- |# indels per 100kbp||5.71||2.98||3.56||8.72||1.77||6.06||6.88      
- |-      
- |# N's per 100kbp ||0||0||0||0||0||0||0      
- |-      
- | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||      
- |-      
- |Genome fraction(%) ||100||100||99.798||99.87||100||99.995||99.979      
- |-      
- |Duplication ratio ||1.014||1.009||1.006||1.012||1.009||1.014||1.023      
- |-      
- |# genes ||4494+3 part||4494+3 part||4479+8 part||4485+9 part||4494+3 part||4493+4 part||4492+5 part      
- |-      
- |NGA50 ||615 234||1 205 052||572 342||875 953||844 482||633 220||1 267 242      
- |-      
- |}      
       
       
       
       
       
- ==Discard Lower-case bases ==      
- After discarding unconvincing contigs, we trimmed the low-quality bases, which are written in lower case, from both ends of each contig.
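The trimming step can be sketched as below. It assumes that only the lower-case runs at the two ends are removed and that lower-case bases inside a contig are kept; `trim_lowercase_ends` is a hypothetical helper, not part of HGAP.

```python
def trim_lowercase_ends(seq):
    """Strip the leading and trailing runs of lower-case bases from a contig sequence."""
    start, end = 0, len(seq)
    while start < end and seq[start].islower():
        start += 1  # skip lower-case bases at the left end
    while end > start and seq[end - 1].islower():
        end -= 1    # skip lower-case bases at the right end
    return seq[start:end]

print(trim_lowercase_ends("acgtACGTacGTtt"))  # interior lower-case bases are kept
```

A contig made up entirely of lower-case bases comes back as an empty string, so such contigs would need to be dropped afterwards.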
       
- ===Performance===      
- {| {{table}} border="1"      
- | align="center" style="background:#f0f0f0;"|'''Statistics without reference '''      
- | align="center" style="background:#f0f0f0;"|'''All Data'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set'''      
- | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set'''      
- | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set'''      
- |-      
- |# contigs||7||8||10||12||4||9||12      
- |-      
- |Largest contig||2 196 495||3 478 799||1 936 007||1 948 495||2 100 388||1 165 497||1 438 506      
- |-      
- |Total length||4 694 972||4 662 655||4 649 216||4 657 587||4 668 899||4 681 301||4 714 790      
- |-      
- |N50||1 005 009||3 478 799||964 998||1 433 016||1 654 501||375 502||1 266 511      
- |-      
- | style="background:#f0f0f0;"| '''Misassemblies'''||||||||||||||      
- |-      
- |# misassemblies||9||9||7||8||9||7||8      
- |-      
- |Misassembled contigs length ||2 210 994||3 490 490||2 901 005||3 496 520||3 754 889||2 256 498||3 197 010      
- |-      
- | style="background:#f0f0f0;"| '''Mismatches'''||||||||||||||      
- |-      
- |# mismatches per 100kbp||0.63||0.28||0.22||0.91||0.15||0.54||0.47      
- |-      
- |# indels per 100kbp||5.02||2.55||1.84||7.08||1.68||4.91||6.12      
- |-      
- |# N's per 100kbp ||0||0||0||0||0||0||0      
- |-      
- | style="background:#f0f0f0;"| '''Genome Statistics'''||||||||||||||      
- |-      
- |Genome fraction(%) ||100||99.842||99.776||99.889||100||99.985||99.979      
- |-      
- |Duplication ratio ||1.012||1.008||1.005||1.006||1.006||1.009||1.018      
- |-      
- |# genes ||4494+3 part||4485+6 part||4478+9 part||4482+11 part||4494+3 part||4493+4 part||4492+5 part      
- |-      
- |NGA50 ||614 657||1 088 544||572 342||875 453||843 983||632 720||1 265 743      
- |-      
- |}      
  |-    |- 
  |# indels per 100kbp||0.77||0.56||0.85||1.45    |# indels per 100kbp||0.77||0.56||0.85||1.45