We evaluated the assemblies with QUAST 2.2 (reference genome NC_000913 and gene list Ec_gene_list).
Reads from single SMRT cells were corrected with the raw, 100X, and 118X short reads.
Statistics without reference | 071634_raw_asm.ctg | 192221_raw_asm.ctg | 210845_raw_asm.ctg | 071634_100X_asm.ctg | 071634_118X_asm.ctg | 192221_118X_asm.ctg | 210845_118X_asm.ctg |
# contigs | 80 | 93 | 83 | 61 | 69 | 68 | 66 |
Largest contig | 745120 | 664876 | 562203 | 663399 | 434084 | 345313 | 437164 |
Total length | 4975695 | 5031560 | 5043217 | 4804004 | 4805579 | 4801310 | 4733683 |
N50 | 356974 | 221472 | 324225 | 295449 | 179662 | 207976 | 186993 |
Misassemblies | |||||||
# misassemblies | 11 | 17 | 21 | 10 | 13 | 20 | 15 |
Misassembled contigs length | 1552524 | 976207 | 2108892 | 1222277 | 782726 | 1156917 | 527873 |
Mismatches | |||||||
# mismatches per 100 kbp | 3.32 | 2.91 | 3.06 | 7.08 | 6.4 | 10.33 | 4.13 |
# indels per 100 kbp | 2.98 | 1.38 | 1.01 | 13.15 | 5.2 | 5.54 | 2.69 |
# N's per 100 kbp | 0.38 | 0.12 | 0.22 | 0.4 | 0.37 | 0.4 | 0.23 |
Genome statistics | |||||||
Genome fraction (%) | 99.97 | 100 | 100 | 99.304 | 99.424 | 99.522 | 98.712 |
Duplication ratio | 1.074 | 1.086 | 1.090 | 1.043 | 1.047 | 1.04 | 1.033 |
# genes | 4489 + 7 part | 4490 + 7 part | 4495 + 2 part | 4461 + 25 part | 4451 + 31 part | 4459 + 28 part | 4412 + 32 part |
NGA50 | 357183 | 221098 | 279423 | 226118 | 179662 | 194634 | 191457 |
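For readers unfamiliar with the N50 row, the following is a minimal sketch of how the statistic is usually computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running sum first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not taken from QUAST or from the assemblies reported here. NGA50 follows the same idea but uses aligned blocks and the reference genome length instead of contigs and the assembly length.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    if not lengths:
        return 0
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until at least half
    # of the total assembly length has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Toy example with made-up contig lengths (not from the tables above).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

In the toy example the total length is 300, and the two longest contigs (80 + 70) already reach half of it, so the N50 is the length of the second contig.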
We discarded contigs to which fewer than 100 reads aligned.
Statistics without reference | 071634_raw_asm.ctg | 192221_raw_asm.ctg | 210845_raw_asm.ctg | 071634_100X_asm.ctg | 071634_118X_asm.ctg | 192221_118X_asm.ctg | 210845_118X_asm.ctg |
# contigs | 19 | 24 | 21 | 28 | 38 | 29 | 31 |
Largest contig | 745120 | 664876 | 592203 | 663399 | 434084 | 345313 | 437164 |
Total length | 4669108 | 4675696 | 4700617 | 4636263 | 4644391 | 4603072 | 4578972 |
N50 | 356974 | 222559 | 399011 | 295449 | 180706 | 207976 | 191458 |
Misassemblies | |||||||
# misassemblies | 7 | 6 | 11 | 6 | 5 | 7 | 6 |
Misassembled contigs length | 1539749 | 936587 | 2058922 | 1200212 | 727024 | 1097466 | 478971 |
Mismatches | |||||||
# mismatches per 100 kbp | 2.75 | 2.75 | 3.04 | 7.08 | 5.85 | 8.82 | 3.69 |
# indels per 100 kbp | 2.23 | 1.1 | 1.17 | 13.46 | 5.83 | 2.49 | 2.37 |
# N's per 100 kbp | 0.19 | 0.02 | 0.04 | 0.26 | 0.15 | 0.07 | 0.02 |
Genome statistics | |||||||
Genome fraction (%) | 99.639 | 99.699 | 99.834 | 99.984 | 99.051 | 98.78 | 98.159 |
Duplication ratio | 1.011 | 1.011 | 1.017 | 1.01 | 1.015 | 1.005 | 1.006 |
# genes | 4473 + 15 part | 4465 + 18 part | 4480 + 10 part | 4435 + 29 part | 4431 + 36 part | 4413 + 36 part | 4380 + 34 part |
NGA50 | 357183 | 221098 | 279423 | 226118 | 179662 | 194634 | 191457 |
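The read-count filter behind the table above (discarding contigs with fewer than 100 aligned reads) can be sketched as follows. This is only an illustration under assumptions: the source does not say how the per-contig read counts were obtained, so the dictionaries `contig_lengths` and `reads_per_contig`, the contig names, and the function `filter_contigs` are all hypothetical.

```python
def filter_contigs(contig_lengths, reads_per_contig, min_reads=100):
    """Keep only contigs with at least `min_reads` aligned reads.

    contig_lengths   -- dict mapping contig name to length in bp
    reads_per_contig -- dict mapping contig name to number of aligned reads
                        (contigs missing from this dict count as 0 reads)
    """
    return {
        name: length
        for name, length in contig_lengths.items()
        if reads_per_contig.get(name, 0) >= min_reads
    }


# Hypothetical toy data; names and numbers are made up for illustration.
contig_lengths = {"ctg1": 500000, "ctg2": 12000, "ctg3": 3000}
reads_per_contig = {"ctg1": 4200, "ctg2": 100, "ctg3": 17}

kept = filter_contigs(contig_lengths, reads_per_contig)
print(sorted(kept))              # names of the surviving contigs
print(sum(kept.values()))        # total length after filtering
```

A contig with exactly 100 aligned reads is kept, since only contigs with fewer than 100 reads are discarded.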
Reads from two SMRT cells were corrected with the raw, 100X, and 118X short reads. The corrected reads (PBcR) were then assembled with runCA, either after filtering to 25X coverage or directly.
Statistics without reference | 2_raw_asm.ctg | 2_raw_25X_asm.ctg | 2_100X_asm.ctg | 2_100X_25X_asm.ctg | 2_118X_asm.ctg | 2_118X_25X_asm.ctg |
# contigs | 106 | 80 | 81 | 71 | 81 | 54 |
Largest contig | 762045 | 757702 | 767781 | 405645 | 520095 | 570062 |
Total length | 5168000 | 5080370 | 4918962 | 4799524 | 4832961 | 4725680 |
N50 | 419161 | 405539 | 331262 | 193986 | 186504 | 210927 |
Misassemblies | ||||||
# misassemblies | 13 | 18 | 15 | 9 | 16 | 16 |
Misassembled contigs length | 1591856 | 1751860 | 1703983 | 165469 | 616747 | 1468075 |
Mismatches | ||||||
# mismatches per 100 kbp | 2.44 | 1.83 | 5.08 | 4.34 | 6.28 | 6.08 |
# indels per 100 kbp | 0.88 | 0.82 | 6.34 | 2.14 | 5.61 | 2.93 |
# N's per 100 kbp | 0.48 | 0.04 | 0.94 | 0.02 | 0.43 | 0.08 |
Genome statistics | ||||||
Genome fraction (%) | 100 | 100 | 99.652 | 98.76 | 99.567 | 99.194 |
Duplication ratio | 1.116 | 1.098 | 1.065 | 1.048 | 1.047 | 1.028 |
# genes | 4495 + 2 part | 4495 + 2 part | 4475 + 16 part | 4432 + 31 part | 4458 + 30 part | 4434 + 43 part |
NGA50 | 418393 | 405538 | 235822 | 193833 | 194196 | 199657 |
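The 25X filtering step used for the `_25X` columns could look roughly like the sketch below. It assumes that the filter keeps the longest corrected reads until their summed length reaches the target coverage times the genome size; the source does not state the selection rule, so this longest-first strategy, the function `select_longest_to_coverage`, and the toy numbers are assumptions for illustration only.

```python
def select_longest_to_coverage(read_lengths, genome_size, target_coverage=25):
    """Keep the longest reads until their total length reaches
    genome_size * target_coverage (longest-first selection is an assumption)."""
    budget = genome_size * target_coverage
    kept = []
    total = 0
    for length in sorted(read_lengths, reverse=True):
        if total >= budget:
            break  # target coverage reached; drop the remaining shorter reads
        kept.append(length)
        total += length
    return kept


# Toy example: a made-up 100 bp "genome" filtered to 2X coverage.
toy_reads = [30, 90, 10, 60, 40, 50]
kept_reads = select_longest_to_coverage(toy_reads, genome_size=100, target_coverage=2)
print(kept_reads)
print(sum(kept_reads) / 100)  # achieved coverage on the toy genome
```

If the input holds less than the target coverage, all reads are kept unchanged (in sorted order), which would correspond to assembling the corrected reads directly.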
We discarded contigs to which fewer than 100 reads aligned.
Statistics without reference | 2_raw_asm.ctg | 2_raw_25X_asm.ctg | 2_100X_asm.ctg | 2_100X_25X_asm.ctg | 2_118X_asm.ctg | 2_118X_25X_asm.ctg |
# contigs | 16 | 17 | 22 | 33 | 35 | 32 |
Largest contig | 762045 | 757702 | 767781 | 405645 | 520095 | 570062 |
Total length | 4650035 | 4675233 | 4651814 | 4574523 | 4648591 | 4588060 |
N50 | 514903 | 405539 | 331262 | 193986 | 194625 | 223426 |
Misassemblies | ||||||
# misassemblies | 4 | 6 | 10 | 4 | 5 | 8 |
Misassembled contigs length | 1564680 | 1697372 | 1683620 | 141677 | 569893 | 1424613 |
Mismatches | ||||||
# mismatches per 100 kbp | 2.43 | 2.23 | 4.94 | 1.8 | 5.56 | 5.97 |
# indels per 100 kbp | 1.76 | 0.84 | 6.19 | 1.8 | 4.48 | 2.86 |
# N's per 100 kbp | 0.06 | 0 | 0.26 | 0 | 0.13 | 0.04 |
Genome statistics | ||||||
Genome fraction (%) | 99.371 | 99.633 | 99.536 | 98.364 | 99.163 | 98.603 |
Duplication ratio | 1.006 | 1.013 | 1.009 | 1.003 | 1.011 | 1.004 |
# genes | 4458 + 13 part | 4466 + 8 part | 4462 + 22 part | 4404 + 39 part | 4438 + 32 part | 4405 + 42 part |
NGA50 | 418393 | 405538 | 235822 | 193833 | 194196 | 199657 |
Reads from three SMRT cells were corrected with the raw, 100X, and 118X short reads. The corrected reads (PBcR) were then assembled with runCA, either after filtering to 25X coverage or directly.
Statistics without reference | 3_raw_asm.ctg | 3_raw_25X_asm.ctg | 3_100X_asm.ctg | 3_100X_25X_asm.ctg | 3_118X_asm.ctg | 3_118X_25X_asm.ctg |
# contigs | 219 | 74 | 98 | 32 | 86 | 39 |
Largest contig | 771076 | 1426293 | 981874 | 822480 | 1091515 | 520962 |
Total length | 5873961 | 5171438 | 5051244 | 4730819 | 4906749 | 4668968 |
N50 | 247798 | 317846 | 413464 | 600008 | 286035 | 218547 |
Misassemblies | ||||||
# misassemblies | 25 | 10 | 22 | 8 | 25 | 11 |
Misassembled contigs length | 1361077 | 1372143 | 2201123 | 1855654 | 1800186 | 1350243 |
Mismatches | ||||||
# mismatches per 100 kbp | 4.03 | 2.5 | 2.03 | 1.5 | 4.71 | 5.45 |
# indels per 100 kbp | 1.68 | 0.97 | 5.64 | 3.3 | 4.13 | 3.86 |
# N's per 100 kbp | 0.34 | 0.14 | 0.46 | 0.11 | 0.18 | 0.02 |
Genome statistics | ||||||
Genome fraction (%) | 100 | 100 | 99.733 | 99.197 | 99.69 | 98.93 |
Duplication ratio | 1.268 | 1.116 | 1.092 | 1.028 | 1.063 | 1.018 |
# genes | 4494 + 3 part | 4495 + 2 part | 4484 + 9 part | 4460 + 19 part | 4468 + 20 part | 4427 + 35 part |
NGA50 | 286997 | 323732 | 348693 | 599239 | 286035 | 193471 |
We discarded contigs to which fewer than 100 reads aligned.
Statistics without reference | 3_raw_asm.ctg | 3_raw_25X_asm.ctg | 3_100X_asm.ctg | 3_100X_25X_asm.ctg | 3_118X_asm.ctg | 3_118X_25X_asm.ctg |
# contigs | 29 | 15 | 20 | 18 | 27 | 29 |
Largest contig | 771076 | 1426293 | 981876 | 822480 | 1091515 | 520962 |
Total length | 4672216 | 4666874 | 4656670 | 4613610 | 4650236 | 4593856 |
N50 | 316212 | 323732 | 413464 | 600008 | 286035 | 218547 |
Misassemblies | ||||||
# misassemblies | 6 | 4 | 10 | 6 | 7 | 7 |
Misassembled contigs length | 1270144 | 1321332 | 2130162 | 1844194 | 1736227 | 1326538 |
Mismatches | ||||||
# mismatches per 100 kbp | 2.07 | 2.14 | 1.67 | 1.48 | 4.35 | 5.09 |
# indels per 100 kbp | 0.98 | 0.65 | 5.54 | 3.31 | 2.88 | 3.76 |
# N's per 100 kbp | 0.06 | 0 | 0.17 | 0.02 | 0.04 | 0 |
Genome statistics | ||||||
Genome fraction (%) | 99.037 | 99.636 | 99.615 | 99.087 | 99.478 | 98.614 |
Duplication ratio | 1.017 | 1.01 | 1.008 | 1.004 | 1.009 | 1.005 |
# genes | 4439 + 24 part | 4467 + 11 part | 4472 + 19 part | 4454 + 22 part | 4457 + 21 part | 4410 + 35 part |
NGA50 | 286997 | 323732 | 348693 | 599239 | 286035 | 193471 |
Reads from four SMRT cells were corrected with the raw, 100X, and 118X short reads. The corrected reads (PBcR) were then assembled with runCA, either after filtering to 25X coverage or directly.
Statistics without reference | 4_raw_asm.ctg | 4_raw_25X_asm.ctg | 4_100X_asm.ctg | 4_100X_25X_asm.ctg | 4_118X_asm.ctg | 4_118X_25X_asm.ctg |
# contigs | 286 | 51 | 123 | 23 | 71 | 40 |
Largest contig | 532128 | 1812746 | 688723 | 1257198 | 983533 | 621920 |
Total length | 6162978 | 5045811 | 5144868 | 4693193 | 4862387 | 4665855 |
N50 | 147254 | 834736 | 398131 | 694380 | 412226 | 285200 |
Misassemblies | ||||||
# misassemblies | 24 | 13 | 26 | 8 | 31 | 13 |
Misassembled contigs length | 800651 | 3633076 | 2341550 | 2708632 | 2412628 | 1302367 |
Mismatches | ||||||
# mismatches per 100 kbp | 3.41 | 2.24 | 5.06 | 1.93 | 4.45 | 7.3 |
# indels per 100 kbp | 1.36 | 0.97 | 5.21 | 3.82 | 4.08 | 5.71 |
# N's per 100 kbp | 1.2 | 0.16 | 0.39 | 0.04 | 0.41 | 0 |
Genome statistics | ||||||
Genome fraction (%) | 100 | 100 | 99.74 | 99.337 | 99.798 | 98.559 |
Duplication ratio | 1.331 | 1.089 | 1.112 | 1.019 | 1.052 | 1.022 |
# genes | 4495 + 2 part | 4494 + 3 part | 4482 + 10 part | 4470 + 9 part | 4481 + 12 part | 4413 + 36 part |
NGA50 | 182232 | 476726 | 317322 | 694380 | 286770 | 214828 |
We discarded contigs to which fewer than 100 reads aligned.
Statistics without reference | 4_raw_asm.ctg | 4_raw_25X_asm.ctg | 4_100X_asm.ctg | 4_100X_25X_asm.ctg | 4_118X_asm.ctg | 4_118X_25X_asm.ctg |
# contigs | 40 | 12 | 21 | 12 | 21 | 26 |
Largest contig | 532128 | 1812746 | 688723 | 1257198 | 983533 | 621920 |
Total length | 4726973 | 4659487 | 4656544 | 4602600 | 4654299 | 4508804 |
N50 | 180844 | 834736 | 398131 | 1071366 | 412226 | 285200 |
Misassemblies | ||||||
# misassemblies | 7 | 8 | 8 | 6 | 15 | 9 |
Misassembled contigs length | 736689 | 3595196 | 2252884 | 2698687 | 2362706 | 1274915 |
Mismatches | ||||||
# mismatches per 100 kbp | 2.35 | 2.28 | 2.53 | 2 | 4.41 | 6.26 |
# indels per 100 kbp | 0.87 | 0.75 | 3.94 | 3.81 | 3.31 | 4.49 |
# N's per 100 kbp | 0.17 | 0.13 | 0.02 | 0.04 | 0.04 | 0 |
Genome statistics | ||||||
Genome fraction (%) | 99.193 | 99.215 | 99.62 | 98.967 | 99.687 | 97.023 |
Duplication ratio | 1.028 | 1.012 | 1.008 | 1.003 | 1.009 | 1.003 |
# genes | 4443 + 26 part | 4456 + 12 part | 4474 + 17 part | 4452 + 11 part | 4474 + 15 part | 4344 + 37 part |
NGA50 | 182232 | 476726 | 317322 | 694380 | 286770 | 247828 |