(→Discard Lower-case bases)
|
Line 351: | |||
|- | |- | ||
|# mismatches per 100kbp||0.02||0.06||9.98||6.42 | |# mismatches per 100kbp||0.02||0.06||9.98||6.42 | ||
+ | =DataSet1= | ||
+ | We used the data from all SMRT cells, as well as three randomly selected sets of four SMRT cells and three randomly selected sets of six SMRT cells, and assessed the correctness of each result with QUAST. | ||
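The random selection described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the cell names, the total of eight cells, the seed, and the helper `draw_subsets` are all assumptions, since the page does not state how many SMRT cells were available or how the draws were made.

```python
import random


def draw_subsets(cells, sizes=(4, 6), repeats=3, seed=0):
    """Return {size: [subset, ...]} with `repeats` random subsets per size.

    Each subset is drawn without replacement from `cells`; separate draws
    are independent, so two subsets of the same size may overlap.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return {
        size: [sorted(rng.sample(cells, size)) for _ in range(repeats)]
        for size in sizes
    }


# Hypothetical cell names: the real number of SMRT cells is not given here.
cells = [f"cell_{i}" for i in range(1, 9)]
subsets = draw_subsets(cells)

for size, draws in subsets.items():
    for n, draw in enumerate(draws, start=1):
        print(f"{size} SMRT cells, set {n}: {', '.join(draw)}")
```

Fixing the seed keeps the three sets per size reproducible, which matters if the subsets need to be regenerated later.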
+ | ==Performance== | ||
+ | {| {{table}} border="1" | ||
+ | | align="center" style="background:#f0f0f0;"|'''Statistics without reference ''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''All Data''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set''' | ||
+ | |- | ||
+ | |# contigs||16||10||14||16||9||18||13 | ||
+ | |- | ||
+ | |Largest contig||2 198 457||3 484 877||1 936 831||1 948 632||2 104 087||1 169 224||1 439 551 | ||
+ | |- | ||
+ | |Total length||4 808 733||4 706 800||4 705 398||4 745 036||4 741 512||4 814 718||4 749 785 | ||
+ | |- | ||
+ | |N50||1 005 770 ||3 484 877||966 809||1 434 284||1 655 500||676 526||1 268 010 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Misassemblies'''|||||||||||||| | ||
+ | |- | ||
+ | |# misassemblies||19||9||12||15||14||17||11 | ||
+ | |- | ||
+ | |Misassembled contigs length ||2 939 040||3 530 352||2 949 761||3 653 461||3 820 624||2 387 129||3 986 402 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Mismatches'''|||||||||||||| | ||
+ | |- | ||
+ | |# mismatches per 100kbp||0.8||0.43||0.58||1.36||0.15||0.95||0.58 | ||
+ | |- | ||
+ | |# indels per 100kbp||5.71||2.98||4.45||9.56||1.77||8.02||6.88 | ||
+ | |- | ||
+ | |# N's per 100kbp ||0||0||0||0||0||0||0 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Genome Statistics'''|||||||||||||| | ||
+ | |- | ||
+ | |Genome fraction(%) ||100||100||99.815||99.87||100||99.995||99.979 | ||
+ | |- | ||
+ | |Duplication ratio ||1.037||1.016||1.017||1.025||1.022||1.038||1.025 | ||
+ | |- | ||
+ | |# genes ||4494+3 part||4494+3 part||4480+7 part||4485+9 part||4494+3 part||4493+4 part||4492+5 part | ||
+ | |- | ||
+ | |NGA50 ||615 234||1 205 052||572 342||875 953||844 482||633 220||1 267 242 | ||
+ | |- | ||
+ | |'''Running Time'''||19hr 06m||13hr 34m||13hr 21m||12hr 38m||21hr 28m||22hr 56m||22hr 07m | ||
+ | |- | ||
+ | |} | ||
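For readers unfamiliar with the N50 row in the table above, the following sketch shows the usual definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions, not values from this dataset.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Illustrative lengths only.
print(n50([2, 3, 4, 5, 6]))
```

NGA50 follows the same idea, but it is computed on aligned blocks (contigs broken at misassemblies) and uses the reference genome length instead of the assembly length.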
+ | ==Discard Unconvincing Contigs== | ||
+ | We aligned the subreads to the contigs and discarded every contig with fewer than 100 aligned reads. | ||
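The filtering step can be sketched as below, assuming the alignments have already been reduced to `(read_id, contig_id)` pairs. The function `filter_contigs`, the input layout, and the choice to count each read at most once per contig are assumptions for illustration; the page does not say which aligner was used or how multiply aligned reads were counted.

```python
from collections import Counter


def filter_contigs(contigs, alignments, min_reads=100):
    """Keep only contigs supported by at least `min_reads` distinct reads.

    contigs    -- dict mapping contig name to sequence
    alignments -- iterable of (read_id, contig_id) pairs
    """
    seen = set()
    counts = Counter()
    for read_id, contig_id in alignments:
        # Count each read once per contig, even if it aligns there twice.
        if (read_id, contig_id) not in seen:
            seen.add((read_id, contig_id))
            counts[contig_id] += 1
    return {name: seq for name, seq in contigs.items() if counts[name] >= min_reads}


# Toy data: contig "a" has 100 supporting reads, contig "b" only 99.
contigs = {"a": "ACGT", "b": "GGCC"}
alignments = [(f"r{i}", "a") for i in range(100)] + [(f"s{i}", "b") for i in range(99)]
kept = filter_contigs(contigs, alignments)
print(sorted(kept))
```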
+ | ===Performance=== | ||
+ | {| {{table}} border="1" | ||
+ | | align="center" style="background:#f0f0f0;"|'''Statistics without reference ''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''All Data''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set''' | ||
+ | |- | ||
+ | |# contigs||7||8||10||12||4||9||12 | ||
+ | |- | ||
+ | |Largest contig||2 198 457||3 484 877||1 936 831||1 948 632||2 104 087||1 169 224||1 439 551 | ||
+ | |- | ||
+ | |Total length||4 706 061||4 674 582||4 659 277||4 682 754||4 680 475||4 702 993||4 739 366 | ||
+ | |- | ||
+ | |N50||1 005 770 ||3 484 877||966 809||1 434 284||1 655 500||676 526||1 268 010 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Misassemblies'''|||||||||||||| | ||
+ | |- | ||
+ | |# misassemblies||10||7||8||9||9||8||10 | ||
+ | |- | ||
+ | |Misassembled contigs length ||2 836 368||3 498 134||2 903 640||3 591 179||3 759 587||2 275 404||3 975 983 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Mismatches'''|||||||||||||| | ||
+ | |- | ||
+ | |# mismatches per 100kbp||0.8||0.43||0.45||1.27||0.15||0.75||0.58 | ||
+ | |- | ||
+ | |# indels per 100kbp||5.71||2.98||3.56||8.72||1.77||6.06||6.88 | ||
+ | |- | ||
+ | |# N's per 100kbp ||0||0||0||0||0||0||0 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Genome Statistics'''|||||||||||||| | ||
+ | |- | ||
+ | |Genome fraction(%) ||100||100||99.798||99.87||100||99.995||99.979 | ||
+ | |- | ||
+ | |Duplication ratio ||1.014||1.009||1.006||1.012||1.009||1.014||1.023 | ||
+ | |- | ||
+ | |# genes ||4494+3 part||4494+3 part||4479+8 part||4485+9 part||4494+3 part||4493+4 part||4492+5 part | ||
+ | |- | ||
+ | |NGA50 ||615 234||1 205 052||572 342||875 953||844 482||633 220||1 267 242 | ||
+ | |- | ||
+ | |} | ||
+ | ==Discard Lower-case bases == | ||
+ | After discarding unconvincing contigs, we trimmed the low-quality bases, which appear in lower case, from both ends of each contig. | ||
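A minimal sketch of the end trimming follows, assuming that only the contiguous runs of lower-case bases at the two ends are removed and that lower-case bases inside a contig are left in place. The function name `trim_lowercase_ends` and the example sequence are hypothetical.

```python
def trim_lowercase_ends(seq):
    """Strip the runs of lower-case bases from both ends of a sequence."""
    start, end = 0, len(seq)
    # Advance past lower-case bases at the 5' end.
    while start < end and seq[start].islower():
        start += 1
    # Step back past lower-case bases at the 3' end.
    while end > start and seq[end - 1].islower():
        end -= 1
    return seq[start:end]


# Internal lower-case bases ("ga") are kept; only the ends are trimmed.
trimmed = trim_lowercase_ends("acgTTgaCCag")
print(trimmed)
```

A contig made up entirely of lower-case bases becomes an empty string under this rule, so a real pipeline would probably drop such contigs afterwards.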
+ | ===Performance=== | ||
+ | {| {{table}} border="1" | ||
+ | | align="center" style="background:#f0f0f0;"|'''Statistics without reference ''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''All Data''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 1st Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 2nd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''4 SMRT cells : 3rd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 1st Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 2nd Set''' | ||
+ | | align="center" style="background:#f0f0f0;"|'''6 SMRT cells : 3rd Set''' | ||
+ | |- | ||
+ | |# contigs||7||8||10||12||4||9||12 | ||
+ | |- | ||
+ | |Largest contig||2 196 495||3 478 799||1 936 007||1 948 495||2 100 388||1 165 497||1 438 506 | ||
+ | |- | ||
+ | |Total length||4 694 972||4 662 655||4 649 216||4 657 587||4 668 899||4 681 301||4 714 790 | ||
+ | |- | ||
+ | |N50||1 005 009||3 478 799||964 998||1 433 016||1 654 501||375 502||1 266 511 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Misassemblies'''|||||||||||||| | ||
+ | |- | ||
+ | |# misassemblies||9||9||7||8||9||7||8 | ||
+ | |- | ||
+ | |Misassembled contigs length ||2 210 994||3 490 490||2 901 005||3 496 520||3 754 889||2 256 498||3 197 010 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Mismatches'''|||||||||||||| | ||
+ | |- | ||
+ | |# mismatches per 100kbp||0.63||0.28||0.22||0.91||0.15||0.54||0.47 | ||
+ | |- | ||
+ | |# indels per 100kbp||5.02||2.55||1.84||7.08||1.68||4.91||6.12 | ||
+ | |- | ||
+ | |# N's per 100kbp ||0||0||0||0||0||0||0 | ||
+ | |- | ||
+ | | style="background:#f0f0f0;"| '''Genome Statistics'''|||||||||||||| | ||
+ | |- | ||
+ | |Genome fraction(%) ||100||99.842||99.776||99.889||100||99.985||99.979 | ||
+ | |- | ||
+ | |Duplication ratio ||1.012||1.008||1.005||1.006||1.006||1.009||1.018 | ||
+ | |- | ||
+ | |# genes ||4494+3 part||4485+6 part||4478+9 part||4482+11 part||4494+3 part||4493+4 part||4492+5 part | ||
+ | |- | ||
+ | |NGA50 ||614 657||1 088 544||572 342||875 453||843 983||632 720||1 265 743 | ||
+ | |- | ||
+ | |} | ||
|- | |- | ||
|# indels per 100kbp||0.77||0.56||0.85||1.45 | |# indels per 100kbp||0.77||0.56||0.85||1.45 |