The self-correction approach (SCA) was proposed in "Reducing assembly complexity of microbial genomes with single-molecule sequencing" (Genome Biology, 2013).
We assembled the full data set (all SMRT cells) and, for comparison, three randomly selected subsets each of four and six SMRT cells, then assessed the correctness of every assembly with QUAST.
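The random selection described above can be sketched as follows. This is a minimal illustration, not the script actually used: the function name `draw_subsets`, the cell labels, the total of eight cells, and the fixed seed are all assumptions, since the page does not state how many SMRT cells were available or how the draws were made.

```python
import random


def draw_subsets(cells, sizes=(4, 6), repeats=3, seed=0):
    """Draw `repeats` random subsets of each size from `cells`.

    Each subset is sampled without replacement; separate draws are
    independent, so two subsets of the same size may share cells.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return {
        size: [sorted(rng.sample(cells, size)) for _ in range(repeats)]
        for size in sizes
    }


# Hypothetical cell labels; the real number of SMRT cells is not given here.
cells = [f"cell_{i:02d}" for i in range(1, 9)]
subsets = draw_subsets(cells)

for size, draws in subsets.items():
    for n, draw in enumerate(draws, start=1):
        print(f"{size} SMRT cells, set {n}: {', '.join(draw)}")
```

Recording the seed (or the chosen cell names) alongside each assembly makes it possible to reproduce a given column of the table later.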
Statistics without reference | All Data | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set | 6 SMRT cells : 1st Set | 6 SMRT cells : 2nd Set | 6 SMRT cells : 3rd Set |
# contigs | 16 | 10 | 14 | 16 | 9 | 18 | 13 |
Largest contig | 2 198 457 | 3 484 877 | 1 936 831 | 1 948 632 | 2 104 087 | 1 169 224 | 1 439 551 |
Total length | 4 808 733 | 4 706 800 | 4 705 398 | 4 745 036 | 4 741 512 | 4 814 718 | 4 749 785 |
N50 | 1 005 770 | 3 484 877 | 966 809 | 1 434 284 | 1 655 500 | 676 526 | 1 268 010 |
Misassemblies | | | | | | | |
# misassemblies | 19 | 9 | 12 | 15 | 14 | 17 | 11 |
Misassembled contigs length | 2 939 040 | 3 530 352 | 2 949 761 | 3 653 461 | 3 820 624 | 2 387 129 | 3 986 402 |
Mismatches | | | | | | | |
# mismatches per 100kbp | 0.8 | 0.43 | 0.58 | 1.36 | 0.15 | 0.95 | 0.58 |
# indels per 100kbp | 5.71 | 2.98 | 4.45 | 9.56 | 1.77 | 8.02 | 6.88 |
# N's per 100kbp | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Genome Statistics | | | | | | | |
Genome fraction(%) | 100 | 100 | 99.815 | 99.87 | 100 | 99.995 | 99.979 |
Duplication ratio | 1.037 | 1.016 | 1.017 | 1.025 | 1.022 | 1.038 | 1.025 |
# genes | 4494+3 part | 4494+3 part | 4480+7 part | 4485+9 part | 4494+3 part | 4493+4 part | 4492+5 part |
NGA50 | 615 234 | 1 205 052 | 572 342 | 875 953 | 844 482 | 633 220 | 1 267 242 |
Running Time | 19hr 06m | 13hr 34m | 13hr 21m | 12hr 38m | 21hr 28m | 22hr 56m | 22hr 07m |
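
For readers unfamiliar with the N50 row, the sketch below shows how the statistic is computed from a list of contig lengths: it is the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions; QUAST computes the reported values itself, and NGA50 additionally uses the reference genome length and aligned blocks broken at misassemblies, which this sketch does not model.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least
    half of the summed length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer form of running >= total / 2
            return length


# Toy example: total is 290, and 80 + 70 = 150 is the first running
# total that reaches half of it (145), so the N50 is 70.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

This also explains the first 4-cell set: its N50 of 3 484 877 is more than half of its total length of 4 706 800, so a single contig already covers half of that assembly.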