The Hierarchical Genome Assembly Process (HGAP) was proposed in "Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data" (Nat Meth 2013).
We downloaded smrtanalysis-2.0.1 from DevNet. The RS_HGAP_Assembly.1 and RS_Modification_and_Motif_Analysis.1 protocols can be run on SMRT Portal or executed from the command line.
Prepare data for HGAP Protocol
1. Build the input XML file (see the tutorial for the detailed steps).
2. Build the HGAP parameters XML file: HGAP2.0.xml. We mostly used the default parameter settings, and set minSubReadLength = 50, readScore = 0.75, minLength = 50.
3. Execute the HGAP protocol:
smrtpipe.py --params=HGAP.xml xml:input.xml
Import reference
1. After the HGAP protocol finishes, a polished_assemble.fasta.gz file is generated in the "data" folder. This file serves as the reference: the single-pass reads, as specified by the original filter parameters, are mapped to this draft assembly so that Quiver can generate a more accurate consensus sequence.
2. Import the reference through SMRT Portal.
3. SMRT Portal will generate a reference folder under /opt/smrtanalysis/common/userdata.d/references/XXXXXX. You can copy the whole folder to your working directory, or assign its path in Quiver.xml.
Prepare for Quiver
1. Build the Quiver parameters XML file: Quiver.xml. We set minSubReadLength = 50, readScore = 0.75, minLength = 50, and used the default values for the other parameters.
2. Execute the Quiver protocol:
smrtpipe.py --params=Quiver.xml xml:input.xml
We randomly selected four, six, and eight SMRT cells, three times each, and evaluated the assemblies with QUAST against the reference genome (NC_000913) and Ec_gene_list.
Statistics without reference | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set | 6 SMRT cells : 1st Set | 6 SMRT cells : 2nd Set | 6 SMRT cells : 3rd Set | 8 SMRT cells : 1st Set | 8 SMRT cells : 2nd Set | 8 SMRT cells : 3rd Set |
# contigs | 5 | 10 | 4 | 11 | 7 | 8 | 6 | 10 | 5 |
Largest contig | 3 770 578 | 4 106 852 | 4 644 754 | 3 785 116 | 4 647 724 | 3 287 965 | 4 649 322 | 4 623 068 | 4 649 308 |
Total length | 4 684 069 | 4 723 363 | 4 671 153 | 4 736 342 | 4 711 060 | 4 708 831 | 4 706 433 | 4 731 334 | 4 691 736 |
N50 | 3 770 578 | 4 106 852 | 4 644 754 | 3 785 116 | 4 647 724 | 3 287 965 | 4 649 322 | 4 623 068 | 4 649 308 |
Misassemblies | |||||||||
# misassemblies | 10 | 13 | 13 | 15 | 12 | 11 | 11 | 16 | 12 |
Misassembled contigs length | 3 788 648 | 4 700 016 | 4 671 153 | 4 726 005 | 4 685 712 | 3 339 030 | 4 694 303 | 4 698 068 | 4 649 308 |
Mismatches | |||||||||
# mismatches per 100kbp | 0.47 | 0.56 | 0.37 | 0.19 | 0.11 | 0.15 | 0.13 | 0.43 | 0.17 |
# indels per 100kbp | 1.08 | 4.44 | 0.22 | 1.66 | 0.63 | 0.65 | 0.19 | 4.59 | 0.56 |
# N's per 100kbp | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Genome Statistics | |||||||||
Genome fraction(%) | 100 | 100 | 99.994 | 99.999 | 100 | 100 | 100 | 99.99 | 100 |
Duplication ratio | 1.01 | 1.018 | 1.007 | 1.021 | 1.031 | 1.015 | 1.012 | 1.02 | 1.011 |
# genes | 4495+2 part | 4495+2 part | 4493+3 part | 4494+3 part | 4495+2 part | 4495+2 part | 4495+2 part | 4494+3 part | 4495+2 part |
NGA50 | 1 207 217 | 2 558 505 | 1 640 882 | 2 888 022 | 2 834 458 | 1 298 912 | 1 477 605 | 1 344 200 | 2 995 586 |
Running Time | ?hr ?m | ?hr ?m | ?hr ?m | 21hr 05m | 19hr 32m | 21hr 01m | 26hr 46m | 27hr 52m | 26hr 13m |
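The random subsampling behind the table above (four, six, and eight SMRT cells, three sets each) can be sketched as follows. This is only an illustration: the cell identifiers, the total of 12 cells, the fixed seed, and the helper name draw_subsets are assumptions, not details taken from the original runs.

```python
import random


def draw_subsets(cells, sizes=(4, 6, 8), repeats=3, seed=0):
    """Return {size: [subset, ...]} with `repeats` random draws per size.

    Each draw samples without replacement, so no cell is repeated
    inside a single subset.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return {
        size: [sorted(rng.sample(cells, size)) for _ in range(repeats)]
        for size in sizes
    }


# Hypothetical SMRT cell identifiers; the real run names are not given here.
cells = [f"cell_{i:02d}" for i in range(1, 13)]
subsets = draw_subsets(cells)

for size, draws in subsets.items():
    for n, draw in enumerate(draws, start=1):
        print(f"{size} SMRT cells, set {n}: {', '.join(draw)}")
```

Each printed subset would then be listed in its own input.xml before running smrtpipe.py.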
We aligned subreads to contigs, and discarded the contigs with fewer than 100 reads aligned.
Statistics without reference | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set | 6 SMRT cells : 1st Set | 6 SMRT cells : 2nd Set | 6 SMRT cells : 3rd Set | 8 SMRT cells : 1st Set | 8 SMRT cells : 2nd Set | 8 SMRT cells : 3rd Set |
# contigs | 2 | 6 | 1 | 5 | 2 | 4 | 2 | 3 | 2 |
Largest contig | 3 770 578 | 4 106 852 | 4 644 754 | 3 785 116 | 4 647 724 | 3 287 965 | 4 649 322 | 4 623 068 | 4 649 308 |
Total length | 4 651 736 | 4 691 077 | 4 644 754 | 4 675 943 | 4 660 074 | 4 671 197 | 4 664 502 | 4 661 980 | 4 661 084 |
N50 | 3 770 578 | 4 106 852 | 4 644 754 | 3 785 116 | 4 647 724 | 3 287 965 | 4 649 322 | 4 623 068 | 4 649 308 |
Misassemblies | |||||||||
# misassemblies | 8 | 10 | 10 | 10 | 8 | 7 | 8 | 9 | 9 |
Misassembled contigs length | 3 770 578 | 4 677 561 | 4 644 754 | 4 675 943 | 4 647 724 | 3 301 396 | 4 664 502 | 4 639 404 | 4 649 308 |
Mismatches | |||||||||
# mismatches per 100kbp | 0.15 | 0.5 | 0.37 | 0.22 | 0.11 | 0.15 | 0.13 | 0.22 | 0.17 |
# indels per 100kbp | 0.47 | 3.34 | 0.22 | 1.47 | 0.63 | 0.65 | 0.19 | 1.44 | 0.56 |
# N's per 100kbp | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Genome Statistics | |||||||||
Genome fraction(%) | 100 | 100 | 99.994 | 99.999 | 100 | 100 | 100 | 99.99 | 100 |
Duplication ratio | 1.003 | 1.011 | 1.002 | 1.008 | 1.005 | 1.007 | 1.005 | 1.005 | 1.005 |
# genes | 4494+3 part | 4495+2 part | 4493+3 part | 4493+4 part | 4495+2 part | 4495+2 part | 4495+2 part | 4493+4 part | 4495+2 part |
NGA50 | 1 207 217 | 2 558 505 | 1 640 882 | 2 888 022 | 2 834 458 | 1 298 912 | 1 477 605 | 1 344 200 | 2 995 586 |
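The read-count filter used for the table above can be sketched as follows. It assumes the alignment step has already been reduced to one contig name per aligned subread; how that list is extracted from the aligner output, and the helper name contigs_to_keep, are assumptions for illustration.

```python
from collections import Counter

MIN_READS = 100  # threshold stated in the text


def contigs_to_keep(read_targets, contig_names, min_reads=MIN_READS):
    """Keep contigs that have at least `min_reads` aligned reads.

    read_targets: one contig name per aligned subread.
    contig_names: all contigs in the assembly, in their original order.
    """
    counts = Counter(read_targets)
    # Counter returns 0 for contigs that received no reads at all.
    return [name for name in contig_names if counts[name] >= min_reads]


# Toy data: contig_2 has 99 reads and is dropped, contig_3 has exactly 100.
read_targets = ["contig_1"] * 250 + ["contig_2"] * 99 + ["contig_3"] * 100
kept = contigs_to_keep(read_targets, ["contig_1", "contig_2", "contig_3"])
print(kept)
```

Contigs with exactly 100 reads are kept here, because the text only discards contigs with fewer than 100.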
After discarding unconvincing contigs, we trimmed the low-quality bases, which are shown in lower case, from both ends of each contig.
Statistics without reference | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set | 6 SMRT cells : 1st Set | 6 SMRT cells : 2nd Set | 6 SMRT cells : 3rd Set | 8 SMRT cells : 1st Set | 8 SMRT cells : 2nd Set | 8 SMRT cells : 3rd Set |
# contigs | 2 | 6 | 1 | 4 | 2 | 4 | 2 | 3 | 2 |
Largest contig | 3 768 995 | 4 105 501 | 4 644 254 | 3 784 001 | 4 646 000 | 3 287 004 | 4 646 998 | 4 622 502 | 4 647 000 |
Total length | 4 649 500 | 4 678 503 | 4 644 254 | 4 660 999 | 4 655 498 | 4 667 500 | 4 660 992 | 4 660 836 | 4 656 000 |
N50 | 3 768 995 | 4 105 501 | 4 644 254 | 3 784 001 | 4 646 000 | 3 287 004 | 4 646 998 | 4 622 502 | 4 647 000 |
Misassemblies | |||||||||
# misassemblies | 8 | 10 | 10 | 9 | 8 | 8 | 8 | 9 | 8 |
Misassembled contigs length | 3 768 995 | 4 666 999 | 4 644 254 | 4 660 999 | 4 646 000 | 3 299 005 | 4 660 992 | 4 638 338 | 4 647 000 |
Mismatches | |||||||||
# mismatches per 100kbp | 0.15 | 0.5 | 0.37 | 0.19 | 0.11 | 0.11 | 0.13 | 0.22 | 0.17 |
# indels per 100kbp | 0.32 | 2.76 | 0.22 | 1.44 | 0.5 | 0.58 | 0.19 | 1.34 | 0.47 |
# N's per 100kbp | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Genome Statistics | |||||||||
Genome fraction(%) | 100 | 100 | 99.994 | 99.999 | 100 | 100 | 100 | 99.99 | 100 |
Duplication ratio | 1.002 | 1.008 | 1.002 | 1.005 | 1.003 | 1.006 | 1.005 | 1.005 | 1.004 |
# genes | 4494+3 part | 4494+3 part | 4493+3 part | 4493+4 part | 4495+2 part | 4495+2 part | 4495+2 part | 4493+4 part | 4495+2 part |
NGA50 | 1 207 217 | 2 558 154 | 1 640 382 | 2 888 022 | 2 833 234 | 1 298 912 | 1 476 281 | 1 344 200 | 2 995 586 |
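The end trimming applied for the table above can be sketched as follows. The sketch assumes that only the lower-case runs at the two ends of a contig are removed and that lower-case bases inside the contig are kept; the function name and the toy sequences are hypothetical.

```python
def trim_lowercase_ends(seq):
    """Remove the runs of lower-case bases at both ends of a contig."""
    start = 0
    end = len(seq)
    # Skip lower-case bases at the left end.
    while start < end and seq[start].islower():
        start += 1
    # Skip lower-case bases at the right end.
    while end > start and seq[end - 1].islower():
        end -= 1
    return seq[start:end]


# Toy contigs; interior lower-case bases are left untouched.
contigs = {
    "contig_1": "acgTTGcaACGtt",
    "contig_2": "ACGT",
}
trimmed = {name: trim_lowercase_ends(seq) for name, seq in contigs.items()}
for name, seq in trimmed.items():
    print(name, seq)
```

A contig made up entirely of lower-case bases comes back as an empty string, so such contigs would have to be dropped in a separate step.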
We used all SMRT cells, and also randomly selected four and six SMRT cells three times each, and evaluated the assemblies with QUAST against the reference genome (NC_000913) and Ec_gene_list.
Statistics without reference | All Data | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set | 6 SMRT cells : 1st Set | 6 SMRT cells : 2nd Set | 6 SMRT cells : 3rd Set |
# contigs | 16 | 10 | 14 | 16 | 9 | 18 | 13 |
Largest contig | 2 198 457 | 3 484 877 | 1 936 831 | 1 948 632 | 2 104 087 | 1 169 224 | 1 439 551 |
Total length | 4 808 733 | 4 706 800 | 4 705 398 | 4 745 036 | 4 741 512 | 4 814 718 | 4 749 785 |
N50 | 1 005 770 | 3 484 877 | 966 809 | 1 434 284 | 1 655 500 | 676 526 | 1 268 010 |
Misassemblies | |||||||
# misassemblies | 19 | 9 | 12 | 15 | 14 | 17 | 11 |
Misassembled contigs length | 2 939 040 | 3 530 352 | 2 949 761 | 3 653 461 | 3 820 624 | 2 387 129 | 3 986 402 |
Mismatches | |||||||
# mismatches per 100kbp | 0.8 | 0.43 | 0.58 | 1.36 | 0.15 | 0.95 | 0.58 |
# indels per 100kbp | 5.71 | 2.98 | 4.45 | 9.56 | 1.77 | 8.02 | 6.88 |
# N's per 100kbp | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Genome Statistics | |||||||
Genome fraction(%) | 100 | 100 | 99.815 | 99.87 | 100 | 99.995 | 99.979 |
Duplication ratio | 1.037 | 1.016 | 1.017 | 1.025 | 1.022 | 1.038 | 1.025 |
# genes | 4494+3 part | 4494+3 part | 4480+7 part | 4485+9 part | 4494+3 part | 4493+4 part | 4492+5 part |
NGA50 | 615 234 | 1 205 052 | 572 342 | 875 953 | 844 482 | 633 220 | 1 267 242 |
Running Time | 19hr 06m | 13hr 34m | 13hr 21m | 12hr 38m | 21hr 28m | 22hr 56m | 22hr 07m |
We aligned subreads to contigs, and discarded the contigs with fewer than 100 reads aligned.
Statistics without reference | All Data | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set | 6 SMRT cells : 1st Set | 6 SMRT cells : 2nd Set | 6 SMRT cells : 3rd Set |
# contigs | 7 | 8 | 10 | 12 | 4 | 9 | 12 |
Largest contig | 2 198 457 | 3 484 877 | 1 936 831 | 1 948 632 | 2 104 087 | 1 169 224 | 1 439 551 |
Total length | 4 706 061 | 4 674 582 | 4 659 277 | 4 682 754 | 4 680 475 | 4 702 993 | 4 739 366 |
N50 | 1 005 770 | 3 484 877 | 966 809 | 1 434 284 | 1 655 500 | 676 526 | 1 268 010 |
Misassemblies | |||||||
# misassemblies | 10 | 7 | 8 | 9 | 9 | 8 | 10 |
Misassembled contigs length | 2 836 368 | 3 498 134 | 2 903 640 | 3 591 179 | 3 759 587 | 2 275 404 | 3 975 983 |
Mismatches | |||||||
# mismatches per 100kbp | 0.8 | 0.43 | 0.45 | 1.27 | 0.15 | 0.75 | 0.58 |
# indels per 100kbp | 5.71 | 2.98 | 3.56 | 8.72 | 1.77 | 6.06 | 6.88 |
# N's per 100kbp | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Genome Statistics | |||||||
Genome fraction(%) | 100 | 100 | 99.798 | 99.87 | 100 | 99.995 | 99.979 |
Duplication ratio | 1.014 | 1.009 | 1.006 | 1.012 | 1.009 | 1.014 | 1.023 |
# genes | 4494+3 part | 4494+3 part | 4479+8 part | 4485+9 part | 4494+3 part | 4493+4 part | 4492+5 part |
NGA50 | 615 234 | 1 205 052 | 572 342 | 875 953 | 844 482 | 633 220 | 1 267 242 |
After discarding unconvincing contigs, we trimmed the low-quality bases, which are shown in lower case, from both ends of each contig.
Statistics without reference | All Data | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set | 6 SMRT cells : 1st Set | 6 SMRT cells : 2nd Set | 6 SMRT cells : 3rd Set |
# contigs | 7 | 8 | 10 | 12 | 4 | 9 | 12 |
Largest contig | 2 196 495 | 3 478 799 | 1 936 007 | 1 948 495 | 2 100 388 | 1 165 497 | 1 438 506 |
Total length | 4 694 972 | 4 662 655 | 4 649 216 | 4 657 587 | 4 668 899 | 4 681 301 | 4 714 790 |
N50 | 1 005 009 | 3 478 799 | 964 998 | 1 433 016 | 1 654 501 | 375 502 | 1 266 511 |
Misassemblies | |||||||
# misassemblies | 9 | 9 | 8 | 9 | 10 | 9 | 10 |
Misassembled contigs length | 2 210 994 | 3 490 490 | 2 901 005 | 3 496 520 | 3 754 889 | 2 256 498 | 3 197 010 |
Mismatches | |||||||
# mismatches per 100kbp | 0.63 | 0.28 | 0.22 | 0.91 | 0.15 | 0.54 | 0.47 |
# indels per 100kbp | 5 | 2.5 | 1.84 | 6.8 | 1.64 | 4.63 | 5.99 |
# N's per 100kbp | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Genome Statistics | |||||||
Genome fraction(%) | 100 | 99.842 | 99.776 | 99.859 | 100 | 99.985 | 99.979 |
Duplication ratio | 1.012 | 1.007 | 1.005 | 1.006 | 1.006 | 1.009 | 1.016 |
# genes | 4494+3 part | 4485+6 part | 4478+9 part | 4482+11 part | 4494+3 part | 4493+4 part | 4492+5 part |
NGA50 | 614 657 | 949 284 | 432 003 | 853 140 | 747 216 | 579 994 | 672 148 |
We used all SMRT cells for the assembly and evaluated it with QUAST against the reference genome (NC_013946) and Mr_gene_list.
Statistics without reference | All Data |
# contigs | 3 |
Largest contig | 2 548 031 |
Total length | 3 121 070 |
N50 | 2 548 031 |
Misassemblies | |
# misassemblies | 1 |
Misassembled contigs length | 2 548 031 |
Mismatches | |
# mismatches per 100kbp | 0.52 |
# indels per 100kbp | 2.71 |
# N's per 100kbp | 0 |
Genome Statistics | |
Genome fraction(%) | 99.986 |
Duplication ratio | 1.017 |
# genes | 3103+2 part |
NGA50 | 1 155 126 |
Running Time | 18hr 19m |
We trimmed the low-quality bases, which are shown in lower case, from both ends of each contig.
Statistics without reference | All Data |
# contigs | 3 |
Largest contig | 2 545 501 |
Total length | 3 115 015 |
N50 | 2 545 501 |
Misassemblies | |
# misassemblies | 1 |
Misassembled contigs length | 2 545 501 |
Mismatches | |
# mismatches per 100kbp | 0.42 |
# indels per 100kbp | 2.52 |
# N's per 100kbp | 0 |
Genome Statistics | |
Genome fraction(%) | 99.986 |
Duplication ratio | 1.006 |
# genes | 3103+2 part |
NGA50 | 1 153 096 |
We used all SMRT cells, and also randomly selected four SMRT cells three times, and evaluated the assemblies with QUAST against the reference genome (NC_013061) and Ph_gene_list.
Statistics without reference | All Data | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set |
# contigs | 3 | 3 | 3 | 6 |
Largest contig | 2 934 267 | 2 927 454 | 2 929 942 | 2 226 051 |
Total length | 5 178 932 | 5 176 592 | 5 176 771 | 5 182 410 |
N50 | 2 934 267 | 2 927 454 | 2 929 942 | 2 133 457 |
Misassemblies | ||||
# misassemblies | 0 | 1 | 0 | 1 |
Misassembled contigs length | 0 | 2 240 169 | 0 | 13 124 |
Mismatches | ||||
# mismatches per 100kbp | 0 | 0.02 | 0.06 | 6.45 |
# indels per 100kbp | 1.05 | 0.54 | 0.6 | 1.88 |
# N's per 100kbp | 0 | 0 | 0 | 0 |
Genome Statistics | ||||
Genome fraction(%) | 100 | 100 | 100 | 99.936 |
Duplication ratio | 1.003 | 1.003 | 1.003 | 1.006 |
# genes | 4338+1 part | 4338+1 part | 4338+1 part | 4335+4 part |
NGA50 | 2 934 267 | 2 927 454 | 2 929 942 | 2 133 457 |
Running Time | 24hr 56m | 17hr 41m | 18hr 14m | 17hr 04m |
We aligned subreads to contigs, and discarded the contigs with fewer than 100 reads aligned.
Statistics without reference | All Data | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set |
# contigs | 3 | 2 | 2 | 5 |
Largest contig | 2 934 267 | 2 927 454 | 2 929 942 | 2 226 051 |
Total length | 5 178 932 | 5 167 623 | 5 167 190 | 5 172 946 |
N50 | 2 934 267 | 2 927 454 | 2 929 942 | 2 133 457 |
Misassemblies | ||||
# misassemblies | 0 | 1 | 0 | 1 |
Misassembled contigs length | 0 | 2 240 169 | 0 | 13 124 |
Mismatches | ||||
# mismatches per 100kbp | 0 | 0.04 | 0.08 | 6.45 |
# indels per 100kbp | 1.05 | 0.68 | 0.6 | 1.82 |
# N's per 100kbp | 0 | 0 | 0 | 0 |
Genome Statistics | ||||
Genome fraction(%) | 100 | 99.951 | 99.916 | 99.878 |
Duplication ratio | 1.003 | 1.002 | 1.002 | 1.004 |
# genes | 4338+1 part | 4336+2 part | 4335+3 part | 4333+5 part |
NGA50 | 2 934 267 | 2 927 454 | 2 929 942 | 2 133 457 |
After discarding unconvincing contigs, we trimmed the low-quality bases, which are shown in lower case, from both ends of each contig.
Statistics without reference | All Data | 4 SMRT cells : 1st Set | 4 SMRT cells : 2nd Set | 4 SMRT cells : 3rd Set |
# contigs | 3 | 2 | 2 | 5 |
Largest contig | 2 932 503 | 2 925 498 | 2 925 998 | 2 225 051 |
Total length | 5 175 001 | 5 163 999 | 5 162 498 | 5 161 405 |
N50 | 2 932 503 | 2 925 498 | 2 925 998 | 2 131 500 |
Misassemblies | ||||
# misassemblies | 0 | 1 | 0 | 0 |
Misassembled contigs length | 0 | 2 238 501 | 0 | 0 |
Mismatches | ||||
# mismatches per 100kbp | 0.02 | 0.06 | 9.98 | 6.42 |
# indels per 100kbp | 0.77 | 0.52 | 0.85 | 1.44 |
# N's per 100kbp | 0 | 0 | 0 | 0 |
Genome Statistics | ||||
Genome fraction(%) | 100 | 99.931 | 99.869 | 99.782 |
Duplication ratio | 1.001 | 1.001 | 1 | 1.001 |
# genes | 4338+1 part | 4336+2 part | 4331+4 part | 4328+7 part |
NGA50 | 2 932 503 | 2 925 498 | 2 925 998 | 2 131 500 |
We used all SMRT cells and evaluated the assemblies with QUAST against the reference genome (NC_000913) and Ec_gene_list.
We used the one SMRT cell and assessed the correctness with QUAST.
Statistics without reference | All Data |
# contigs | 2 |
Largest contig | 4 656 681 |
Total length | 4 672 546 |
N50 | 4 656 681 |
Misassemblies | |
# misassemblies | 9 |
Misassembled contigs length | 4 672 546 |
Mismatches | |
# mismatches per 100kbp | 0.15 |
# indels per 100kbp | 4.87 |
# N's per 100kbp | 0 |
Genome Statistics | |
Genome fraction(%) | 100 |
Duplication ratio | 1.007 |
# genes | 4494+3 part |
NGA50 | 2 995 500 |
Running Time | 16hr 40m |
We aligned subreads to contigs, and discarded the contigs with fewer than 100 reads aligned.
Statistics without reference | All Data |
# contigs | 1 |
Largest contig | 4 656 681 |
Total length | 4 656 681 |
N50 | 4 656 681 |
Misassemblies | |
# misassemblies | 8 |
Misassembled contigs length | 4 656 681 |
Mismatches | |
# mismatches per 100kbp | 0.15 |
# indels per 100kbp | 4.87 |
# N's per 100kbp | 0 |
Genome Statistics | |
Genome fraction(%) | 100 |
Duplication ratio | 1.004 |
# genes | 4494+3 part |
NGA50 | 2 995 500 |
After discarding unconvincing contigs, we trimmed the low-quality bases, which are shown in lower case, from both ends of each contig.
Statistics without reference | All Data |
# contigs | 1 |
Largest contig | 4 654 377 |
Total length | 4 654 377 |
N50 | 4 654 377 |
Misassemblies | |
# misassemblies | 8 |
Misassembled contigs length | 4 654 377 |
Mismatches | |
# mismatches per 100kbp | 0.15 |
# indels per 100kbp | 4.81 |
# N's per 100kbp | 0 |
Genome Statistics | |
Genome fraction(%) | 100 |
Duplication ratio | 1.003 |
# genes | 4494+3 part |
NGA50 | 3 026 319 |
We used the HGAP3.0.xml protocol and ran dataset 9 on SMRT Portal.
Statistics without reference | All Data |
# contigs | 1 |
Largest contig | 4 655 855 |
Total length | 4 655 855 |
N50 | 4 655 855 |
Misassemblies | |
# misassemblies | 8 |
Misassembled contigs length | 4 655 855 |
Mismatches | |
# mismatches per 100kbp | 0.3 |
# indels per 100kbp | 0.86 |
# N's per 100kbp | 0 |
Genome Statistics | |
Genome fraction(%) | 100 |
Duplication ratio | 1.004 |
# genes | 4494+3 part |
NGA50 | 3 026 418 |
Running Time | 1hr 22m |
We trimmed the low-quality bases, which are shown in lower case, from both ends of each contig.
Statistics without reference | All Data |
# contigs | 1 |
Largest contig | 4 652 541 |
Total length | 4 652 541 |
N50 | 4 652 541 |
Misassemblies | |
# misassemblies | 8 |
Misassembled contigs length | 4 652 541 |
Mismatches | |
# mismatches per 100kbp | 0.19 |
# indels per 100kbp | 0.56 |
# N's per 100kbp | 0 |
Genome Statistics | |
Genome fraction(%) | 100 |
Duplication ratio | 1.004 |
# genes | 4494+3 part |
NGA50 | 3 026 418 |
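In many columns of the tables above the N50 row equals the Largest contig row. This happens whenever the largest contig alone covers at least half of the total assembly length. A minimal sketch of the usual N50 definition illustrates this; the function is an illustration rather than QUAST's own code, and the two contig lengths are only inferred from the one-SMRT-cell E. coli table (2 contigs, largest 4 656 681, total 4 672 546).

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


# Second length is inferred as total minus largest: 4 672 546 - 4 656 681.
two_contigs = [4_656_681, 15_865]
print(n50(two_contigs))

# Hypothetical lengths where no single contig reaches half of the total.
print(n50([50, 40, 30, 20, 10]))
```

NGA50 is a different quantity: it is computed by QUAST from aligned blocks relative to the reference genome length, which is why it can be much smaller than N50 when misassemblies are present.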