Escherichia coli K12 MG1655. The E. coli MG1655 genome consists of a circular chromosome 4,639,675 bp in length.
Software | Version | Parameters | Download |
ABySS | 1.3.0 | k=31 | Abyss |
Velvet | 1.1.04 | k=29 ins_length=215 cov_cutoff=12 exp_cov=24 min_contig_lgth=100 scaffolding=no | Velvet |
Edena | 3 | m=30 | Edena |
SOAPdenovo | 1.05 | K=29 M=3 | SOAPdenovo |
CLC | 4.7.2 | insert_size_range=194,236 minimum_contig_length=100 | CLC |
Merged File: Set1_Contig
Abyss parameter | Download |
k=29 | Abyss_k29 |
k=31 | Abyss_k31 |
k=33 | Abyss_k33 |
Merged File: Set2_Contig
SOAPdenovo parameter | Download |
k=29 | SOAP_k29 |
k=31 | SOAP_k31 |
k=33 | SOAP_k33 |
Merged File: Set3_Contig
Download |
CISA+SSPACE |
ABySS+SSPACE |
Minimus2+SSPACE |
Input | Download |
Set1 | CISA_Set1 |
Set2 | CISA_Set2 |
Set3 | CISA_Set3 |
Set2+Set3 | CISA_Set2_3 |
Set1+2+3+2_3 | CISA_Set1+2+3+2_3 |
The input to CISA is a set of assembled contigs; for example, Set1 contains the contigs obtained from ABySS, CLC, Edena, SOAPdenovo, and Velvet.
The integrated contigs generated by CISA can be directly downloaded via the Download link.
maia('./assembly_list.txt','./NC_000913.fna')
assembly_list.txt:
Edena ./Edena_contigs.fa 2
SOAP ./SOAP_contigs.fa 2
Velvet ./Velvet.fa 2
CLC ./CLC.fa 2
Abyss ./Abyss_contigs.fa 3
A file named maia_assembly.fa was generated, containing a draft genome of 4,640,055 bp with 126,761 uncalled bases (Ns). We split the genome into contigs at runs of more than 10 Ns and obtained the contig set maia.fa.
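The splitting step can be sketched in Python as follows. This is only an illustration, not the script used to produce maia.fa: the function name `split_at_n_runs` and the toy sequence are hypothetical, and it assumes "more than 10 Ns" means a run of at least 11 consecutive N characters.

```python
import re

def split_at_n_runs(sequence, min_run=11):
    """Split a sequence at every run of at least `min_run` N characters.

    Shorter runs of N stay inside the contig that contains them.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    # Drop empty pieces that appear when the sequence starts or ends with Ns.
    return [piece for piece in pattern.split(sequence) if piece]

# Toy draft genome: one run of 10 Ns (kept) and one run of 11 Ns (split point).
draft = "ACGT" + "N" * 10 + "GGCC" + "N" * 11 + "TTAA"
contigs = split_at_n_runs(draft)
for index, contig in enumerate(contigs, start=1):
    print(f">contig_{index} length={len(contig)}")
```

In this toy input only the run of 11 Ns acts as a break, so the run of 10 Ns remains inside the first contig.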
When running MAIA on other examples, we found that MATLAB crashed once the recursion limit was reached. We therefore split the reference genome into several parts and ran MAIA on each part separately.
maia('./assembly_list.txt','./NC_000913_1.fna')
maia('./assembly_list.txt','./NC_000913_2.fna')
maia('./assembly_list.txt','./NC_000913_3.fna')
The split reference and the integrated results can be downloaded here: maia_ecoli_36.
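The page does not say how the split points in the reference were chosen. As a minimal sketch, assuming the reference is simply cut into consecutive pieces of near-equal length, the split could look like this in Python (the function name `split_reference` is hypothetical):

```python
def split_reference(sequence, n_parts):
    """Cut `sequence` into `n_parts` consecutive pieces of near-equal length."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(len(sequence), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` pieces receive one additional base each.
        end = start + base + (1 if i < extra else 0)
        parts.append(sequence[start:end])
        start = end
    return parts

# Toy reference of 10 bases cut into three parts.
parts = split_reference("ACGTACGTAC", 3)
print(parts)
```

Each piece would then be written to its own FASTA file and passed to a separate maia call. Cutting at fixed offsets can break a contig that spans a boundary, so a real split may need to pick cut points more carefully.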
All_M2_GAA
Because minimus2 and GAA merge only two assemblies at a time, we integrated the five assemblies iteratively, in random order.
minimus2: C_S_V_A_E, A_C_V_S_E, S_V_E_C_A, A_E_C_V_S, A_E_S_C_V, E_V_S_C_A, E_V_C_A_S, V_E_S_A_C, V_S_E_A_C, A_E_V_C_S
GAA: E_S_C_V_A, V_S_E_C_A, V_S_C_E_A, S_V_C_E_A, V_E_A_S_C, E_C_V_S_A, V_A_S_C_E, V_C_E_S_A, A_S_C_V_E, S_C_V_E_A
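The iterative pairwise scheme can be sketched as below. This is an illustration under stated assumptions, not the procedure actually run: the letters are presumed to stand for Abyss, CLC, Edena, SOAPdenovo, and Velvet, `merge_pair` is a hypothetical stand-in for one minimus2 or GAA run, and the seed is arbitrary, so the orders generated will not match those listed above.

```python
import math
import random

ASSEMBLIES = ["A", "C", "E", "S", "V"]  # presumed initials of the five assemblers

def random_orders(names, n_orders, seed=0):
    """Return `n_orders` distinct random orderings, formatted like 'C_S_V_A_E'."""
    if n_orders > math.factorial(len(names)):
        raise ValueError("more orders requested than permutations exist")
    rng = random.Random(seed)
    orders = []
    while len(orders) < n_orders:
        label = "_".join(rng.sample(names, len(names)))
        if label not in orders:  # keep each ordering only once
            orders.append(label)
    return orders

def merge_iteratively(order, merge_pair):
    """Fold a two-input merger over the assemblies, left to right."""
    merged = order[0]
    for next_assembly in order[1:]:
        merged = merge_pair(merged, next_assembly)
    return merged

# Placeholder merger that only records the nesting of the pairwise merges.
def show_merge(left, right):
    return f"({left}+{right})"

orders = random_orders(ASSEMBLIES, 10)
nesting = merge_iteratively("C_S_V_A_E".split("_"), show_merge)
print(nesting)
```

The placeholder makes the order dependence visible: the first two assemblies are merged first, and each later assembly is merged into the running result.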
Set1
Name | NumContigs | NumAssemblyBases | DCJ_Distance | NumMisCalled | NumUnCalled | NumGapsRef | NumGapsAssembly | TotalBasesMissed | %Missed | ExtraBases | %Extra | BrokenCDS | IntactCDS | ContigN50 | ContigN90 | MaxContigLength | Blast_IntactCDS | Units(>200) | N50 | cor.Units | cor.N50 | Errors,(Indel>=5,Inv,Rel) |
Abyss | 133 | 4626205 | 108 | 334 | 69 | 123 | 119 | 57847 | 1.2468 | 29424 | 0.636 | 57 | 4263 | 96157 | 26096 | 222425 | 4257 | 108 | 96511 | 116 | 92933 | 8,(6,0,2) |
CLC | 379 | 4546926 | 304 | 100 | 0 | 288 | 287 | 130550 | 2.8138 | 3405 | 0.0749 | 62 | 4258 | 29767 | 8447 | 107342 | 4233 | 288 | 28450 | 290 | 28036 | 2,(0,1,1) |
Edena | 211 | 4569446 | 154 | 17 | 0 | 129 | 125 | 86780 | 1.8704 | 2078 | 0.0455 | 66 | 4254 | 54405 | 13642 | 186686 | 4204 | 182 | 54405 | 186 | 52796 | 4,(2,1,1) |
SOAPdenovo | 553 | 4547211 | 475 | 36 | 0 | 461 | 412 | 124407 | 2.6814 | 6972 | 0.1533 | 100 | 4220 | 17902 | 5384 | 103369 | 4146 | 450 | 17892 | 451 | 17892 | 1,(0,0,1) |
Velvet | 283 | 4550675 | 207 | 138 | 0 | 208 | 203 | 116542 | 2.5119 | 2783 | 0.0612 | 74 | 4246 | 52474 | 12537 | 166094 | 4204 | 217 | 52474 | 224 | 49022 | 8,(5,0,3) |
CISA_Set1 | 72 | 4627549 | 70 | 241 | 50 | 91 | 92 | 49487 | 1.0666 | 32028 | 0.6921 | 44 | 4276 | 119107 | 32288 | 312018 | 4290 | 69 | 135136 | 83 | 113511 | 14,(7,1,6) |
GAA* | 311 | 4602917 | 224 | 156 | 3 | 225 | 216 | 93476 | 2.0147 | 11942 | 0.2591 | 76 | 4244 | 49990 | 12208 | 163308 | 4208 | 245 | 51075 | 238 | 47954 | 6,(3,0,2) |
MAIA | 110 | 4513348 | 96 | 82 | 54 | 100 | 95 | 129936 | 2.8005 | 1090 | 0.0242 | 48 | 4272 | 112717 | 30950 | 312145 | 4222 | 95 | 126075 | 97 | 107674 | 5,(2,0,3) |
minimus2* | 73 | 4597392 | 67 | 323 | 0 | 96 | 80 | 155862 | 3.3593 | 102792 | 2.2503 | 52 | 4268 | 121942 | 35207 | 296685 | 4199 | 72 | 127420 | 83 | 113511 | 11,(7,1,3) |
GAA (Abyss,Edena) | 133 | 4637982 | 102 | 328 | 93 | 118 | 112 | 54835 | 1.1819 | 28888 | 0.6229 | 57 | 4263 | 96157 | 26096 | 222425 | 4267 | 108 | 96511 | 115 | 92933 | 8,(6,0,2) |
GAA (A,C,E,S,V) | 138 | 4639673 | 103 | 305 | 93 | 119 | 113 | 54254 | 1.1693 | 29292 | 0.6311 | 57 | 4263 | 96157 | 26096 | 222425 | 4267 | 108 | 96511 | 115 | 92933 | 8,(6,0,2) |
minimus2(A,C,E,S,V) | 74 | 4608653 | 68 | 285 | 0 | 97 | 78 | 76881 | 1.657 | 35464 | 0.7695 | 50 | 4270 | 126075 | 34542 | 417704 | 4268 | 73 | 134584 | 83 | 113511 | 10,(7,1,2) |
minimus2(S,C,V,E,A) | 69 | 4215087 | 69 | 214 | 249 | 90 | 78 | 548181 | 11.8151 | 113137 | 2.6841 | 51 | 4269 | 119108 | 35441 | 312145 | 3869 | 69 | 115198 | 79 | 105796 | 10,(5,2,3) |
[*] Please note that the scores for minimus2 and GAA are averages over ten random combinations.
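The page does not define the two percentage columns. The tabulated values are consistent with %Missed being TotalBasesMissed divided by the reference length (4,639,675 bp) and %Extra being ExtraBases divided by NumAssemblyBases. The sketch below checks that reading against the Abyss and CLC rows; the formulas are inferred from the numbers, not taken from the evaluation tool, and the function names are hypothetical.

```python
REFERENCE_LENGTH = 4_639_675  # E. coli MG1655 chromosome length in bp

def percent_missed(total_bases_missed, reference_length=REFERENCE_LENGTH):
    """Missed bases as a percentage of the reference length (inferred formula)."""
    return round(100.0 * total_bases_missed / reference_length, 4)

def percent_extra(extra_bases, num_assembly_bases):
    """Extra bases as a percentage of the assembly size (inferred formula)."""
    return round(100.0 * extra_bases / num_assembly_bases, 4)

# (TotalBasesMissed, ExtraBases, NumAssemblyBases) copied from the Set1 table.
rows = {
    "Abyss": (57847, 29424, 4626205),
    "CLC": (130550, 3405, 4546926),
}
for name, (missed, extra, assembly_bases) in rows.items():
    print(name, percent_missed(missed), percent_extra(extra, assembly_bases))
```

If the inferred formulas hold, the printed values should agree with the %Missed and %Extra entries of the corresponding rows up to rounding.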
Set2-Set3
Name | NumContigs | NumAssemblyBases | DCJ_Distance | NumMisCalled | NumUnCalled | NumGapsRef | NumGapsAssembly | TotalBasesMissed | %Missed | ExtraBases | %Extra | BrokenCDS | IntactCDS | ContigN50 | ContigN90 | MaxContigLength | Blast_IntactCDS | Units(>200) | N50 | cor.Units | cor.N50 | Errors,(Indel>=5,Inv,Rel) |
Abyss_k29 | 130 | 4634010 | 108 | 322 | 30 | 118 | 115 | 61835 | 1.3327 | 40405 | 0.8719 | 54 | 4266 | 95691 | 26567 | 268182 | 4267 | 105 | 96157 | 111 | 89001 | 6,(4,0,2) |
Abyss_k31 | 133 | 4626205 | 107 | 334 | 69 | 123 | 119 | 57847 | 1.2468 | 29424 | 0.636 | 57 | 4263 | 96157 | 26096 | 222425 | 4257 | 108 | 96511 | 116 | 92933 | 8,(6,0,2) |
Abyss_k33 | 135 | 4644184 | 106 | 354 | 338 | 139 | 119 | 66355 | 1.4302 | 44937 | 0.9676 | 78 | 4242 | 89001 | 24907 | 268398 | 4263 | 112 | 96157 | 119 | 89001 | 8,(5,0,3) |
CISA_Set2 | 106 | 4635648 | 93 | 321 | 131 | 117 | 104 | 55343 | 1.1928 | 39776 | 0.858 | 64 | 4256 | 113377 | 27272 | 222663 | 4268 | 94 | 113512 | 102 | 105936 | 8,(5,0,3) |
SOAP_k29 | 1373 | 4582756 | 479 | 48 | 0 | 466 | 415 | 124372 | 2.6806 | 7247 | 0.1581 | 100 | 4220 | 17892 | 5276 | 103369 | 4146 | 450 | 17892 | 451 | 17892 | 1,(0,0,1) |
SOAP_k31 | 1295 | 4583165 | 519 | 56 | 0 | 510 | 466 | 121606 | 2.621 | 9201 | 0.2008 | 121 | 4199 | 17003 | 4286 | 77302 | 4094 | 502 | 16924 | 503 | 16924 | 1,(0,0,1) |
SOAP_k33 | 2170 | 4608265 | 1519 | 105 | 0 | 1470 | 1380 | 126273 | 2.7216 | 41165 | 0.8933 | 507 | 3813 | 5391 | 1449 | 22953 | 3379 | 1459 | 5368 | 1459 | 5365 | 1,(0,0,1) |
CISA_Set3 | 440 | 4532470 | 383 | 35 | 0 | 379 | 338 | 133920 | 2.8864 | 5809 | 0.1282 | 88 | 4232 | 23332 | 6264 | 103369 | 4165 | 385 | 23328 | 385 | 23328 | 2,(1,0,1) |
CISA_Set2&3 | 105 | 4636952 | 91 | 344 | 159 | 117 | 103 | 57192 | 1.2327 | 40304 | 0.8692 | 60 | 4260 | 113377 | 27272 | 222663 | 4269 | 96 | 113512 | 102 | 105936 | 6,(5,0,1) |
CISA_Set_1_2_3_2&3 | 72 | 4637760 | 72 | 554 | 53 | 112 | 100 | 43060 | 0.9281 | 37481 | 0.8082 | 44 | 4276 | 115185 | 35678 | 310691 | 4291 | 71 | 119686 | 80 | 105929 | 9,(7,0,2) |
Name | NumContigs | NumAssemblyBases | DCJ_Distance | NumMisCalled | NumUnCalled | NumGapsRef | NumGapsAssembly | TotalBasesMissed | %Missed | ExtraBases | %Extra | BrokenCDS | IntactCDS | ContigN50 | MaxContigLength | Blast_IntactCDS | Units(>200) | N50 | cor.Units | cor.N50 | Errors,(Indel>=5,Inv,Rel) |
CISA+SSPACE | 69 | 4627867 | 70 | 237 | 50 | 90 | 91 | 52150 | 1.124 | 37320 | 0.8064 | 44 | 4276 | 134584 | 418148 | 4290 | 69 | 135136 | 83 | 113511 | 14,(7,1,6) |
Abyss+SSPACE | 101 | 4627104 | 89 | 393 | 735 | 114 | 119 | 54956 | 1.1845 | 33747 | 0.7293 | 57 | 4263 | 107040 | 268750 | 4272 | 90 | 113372 | 113 | 95959 | 26,(15,1,10) |
minimus2+SSPACE | 64 | 4608774 | 64 | 337 | 54 | 93 | 76 | 75502 | 1.6273 | 36021 | 0.7816 | 49 | 4271 | 150458 | 420117 | 4268 | 65 | 150458 | 79 | 119105 | 12,(7,2,3) |