Escherichia coli K12 MG1655. The E. coli MG1655 genome consists of a circular chromosome 4,639,675 bp in length.
Read source
- The Illumina read data of E. coli (paired-end sequencing library with 200 bp inserts) were downloaded from the Sequence Read Archive (SRA). The data set comprises more than 20.8 M reads.
Sequence assembly
- Set1 (Different Assemblers)
Software | Version | Parameters | Download
ABySS | 1.3.0 | k=31 | Abyss
Velvet | 1.1.04 | k=29 ins_length=215 cov_cutoff=12 exp_cov=24 min_contig_lgth=100 scaffolding=no | Velvet
Edena | 3 | m=30 | Edena
SOAPdenovo | 1.05 | K=29 M=3 | SOAPdenovo
CLC | 4.7.2 | insert_size_range=194,236 minimum_contig_length=100 | CLC
Merged File: Set1_Contig
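The merged file combines the contigs of the individual assemblers into one FASTA file. The sketch below shows one way such a merge could be done. It is an illustration, not the script used to produce Set1_Contig: the function name `merge_contigs`, the header scheme `>Label_N`, and any file names are assumptions.

```python
def merge_contigs(inputs, out_path):
    """Concatenate contig FASTA files into one file.

    inputs:   dict mapping an assembler label (hypothetical, e.g. "Velvet")
              to the path of that assembler's contig FASTA file.
    out_path: path of the merged FASTA file to write.
    Returns the total number of contigs written.
    """
    total = 0
    with open(out_path, "w") as out:
        for label, path in inputs.items():
            index = 0  # contig counter, restarted for each assembler
            with open(path) as fasta:
                for line in fasta:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        index += 1
                        # Rename headers so IDs stay unique across assemblers.
                        out.write(f">{label}_{index}\n")
                    else:
                        out.write(line + "\n")
            total += index
    return total
```

A call such as `merge_contigs({"Velvet": "velvet.fa", "Edena": "edena.fa"}, "merged.fa")` would write every contig once, with its source assembler visible in the header.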
- Set2 (Different parameters for Abyss, the assembler that produced the fewest contigs in Set1)
Merged File: Set2_Contig
- Set3 (Different parameters for SOAPdenovo, the assembler that produced the most contigs in Set1)
Merged File: Set3_Contig
Contig integrator
Evaluation
- Escherichia coli K12 MG1655
- Evaluate by Mauve Assembly Metrics
- How to score genome assemblies using the Mauve system
- Score with Mauve metrics:
Set1
Name | NumContigs | NumAssemblyBases | NumMisCalled | NumUnCalled | NumGapsRef | NumGapsAssembly | TotalBasesMissed | PercBasesMissed | ExtraBases | PercExtraBases | BrokenCDS | IntactCDS | ContigN50 | ContigN90 | MaxContigLength
Abyss | 133 | 4626205 | 334 | 69 | 123 | 119 | 57847 | 1.2468 | 29424 | 0.636 | 57 | 4263 | 96157 | 26096 | 222425
CLC | 379 | 4546926 | 100 | 0 | 288 | 287 | 130550 | 2.8138 | 3405 | 0.0749 | 62 | 4258 | 29767 | 8447 | 107342
Edena | 211 | 4569446 | 17 | 0 | 129 | 125 | 86780 | 1.8704 | 2078 | 0.0455 | 66 | 4254 | 54405 | 13642 | 186686
SOAPdenovo | 553 | 4547211 | 36 | 0 | 461 | 412 | 124407 | 2.6814 | 6972 | 0.1533 | 100 | 4220 | 17902 | 5384 | 103369
Velvet | 283 | 4550675 | 138 | 0 | 208 | 203 | 116542 | 2.5119 | 2783 | 0.0612 | 74 | 4246 | 52474 | 12537 | 166094
CISA_Set1 | 77 | 4625581 | 288 | 73 | 93 | 96 | 52449 | 1.1304 | 32037 | 0.6926 | 44 | 4276 | 115197 | 32288 | 310695
Minimus2 | 74 | 4608653 | 285 | 0 | 97 | 78 | 76881 | 1.657 | 35464 | 0.7695 | 50 | 4270 | 126075 | 34542 | 417704
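For readers unfamiliar with the ContigN50 and ContigN90 columns, the sketch below computes these statistics in the usual way: sort contigs from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the summed contig length. The function name `nxx` and the toy lengths are illustrative assumptions, and using the assembly total (rather than the reference length) as the denominator is also an assumption about how Mauve defines the metric.

```python
def nxx(lengths, percent):
    """Return the Nxx contig length (percent=50 gives N50, percent=90 gives N90).

    Contigs are visited from longest to shortest; the result is the length
    of the contig at which the cumulative sum first reaches `percent`
    percent of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Toy example (made-up contig lengths, total = 300 bases).
toy = [80, 70, 50, 40, 30, 20, 10]
print("N50:", nxx(toy, 50))  # 80 + 70 = 150 reaches 50% of 300, so N50 = 70
print("N90:", nxx(toy, 90))  # cumulative 270 is reached at the 30-base contig
```

A higher N50 with a similar assembly size means the same bases are packed into fewer, longer contigs, which is why the integrated assemblies in the table score above their inputs on this column.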
Set2-Set3
Name | NumContigs | NumAssemblyBases | NumMisCalled | NumUnCalled | NumGapsRef | NumGapsAssembly | TotalBasesMissed | PercBasesMissed | ExtraBases | PercExtraBases | BrokenCDS | IntactCDS | ContigN50 | ContigN90 | MaxContigLength
Abyss_k29 | 130 | 4634010 | 322 | 30 | 118 | 115 | 61835 | 1.3327 | 40405 | 0.8719 | 54 | 4266 | 95691 | 26567 | 268182
Abyss_k31 | 133 | 4626205 | 334 | 69 | 123 | 119 | 57847 | 1.2468 | 29424 | 0.636 | 57 | 4263 | 96157 | 26096 | 222425
Abyss_k33 | 135 | 4644184 | 354 | 338 | 139 | 119 | 66355 | 1.4302 | 44937 | 0.9676 | 78 | 4242 | 89001 | 24907 | 268398
CISA_Set2 | 105 | 4635199 | 332 | 130 | 117 | 103 | 55567 | 1.1976 | 39517 | 0.8525 | 63 | 4257 | 113377 | 27272 | 222663
SOAP_k29 | 1373 | 4582756 | 48 | 0 | 466 | 415 | 124372 | 2.6806 | 7247 | 0.1581 | 100 | 4220 | 17892 | 5276 | 103369
SOAP_k31 | 1295 | 4583165 | 56 | 0 | 510 | 466 | 121606 | 2.621 | 9201 | 0.2008 | 121 | 4199 | 17003 | 4286 | 77302
SOAP_k33 | 2170 | 4608265 | 105 | 0 | 1470 | 1380 | 126273 | 2.7216 | 41165 | 0.8933 | 507 | 3813 | 5391 | 1449 | 22953
CISA_Set3 | 465 | 4546819 | 117 | 0 | 402 | 366 | 133247 | 2.8719 | 19266 | 0.4237 | 95 | 4225 | 21543 | 6065 | 103369
CISA_Set2&3 | 105 | 4636783 | 351 | 160 | 118 | 104 | 54999 | 1.1854 | 39905 | 0.8606 | 60 | 4260 | 113377 | 27272 | 222663
CISA_Set_1_2_3_2&3 | 72 | 4637107 | 529 | 53 | 109 | 97 | 43390 | 0.9352 | 37158 | 0.8013 | 44 | 4276 | 115185 | 35678 | 310556
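The two percentage columns appear to follow from the raw counts: PercBasesMissed matches TotalBasesMissed divided by the reference length (4,639,675 bp), and PercExtraBases matches ExtraBases divided by NumAssemblyBases. This relationship is inferred from the numbers in the table, not taken from the Mauve documentation. The sketch below checks it on two rows; the function names are illustrative.

```python
REF_LEN = 4_639_675  # E. coli K12 MG1655 chromosome length stated on this page

# (NumAssemblyBases, TotalBasesMissed, PercBasesMissed, ExtraBases, PercExtraBases)
# copied from the Set2-Set3 table.
rows = {
    "Abyss_k31": (4_626_205, 57_847, 1.2468, 29_424, 0.636),
    "SOAP_k33": (4_608_265, 126_273, 2.7216, 41_165, 0.8933),
}


def perc_missed(total_missed, ref_len=REF_LEN):
    """Missed reference bases as a percentage of the reference length."""
    return 100 * total_missed / ref_len


def perc_extra(extra_bases, assembly_bases):
    """Extra assembly bases as a percentage of the assembly length."""
    return 100 * extra_bases / assembly_bases


for name, (asm, missed, p_missed, extra, p_extra) in rows.items():
    print(name,
          f"missed: computed {perc_missed(missed):.4f} vs table {p_missed}",
          f"extra: computed {perc_extra(extra, asm):.4f} vs table {p_extra}")
```

Because the two columns use different denominators, they are not directly comparable: an assembly longer than the reference can still report a nonzero PercBasesMissed.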
- Scaffold the contigs using SSPACE
- Because paired-end reads are available for E. coli, the order, distance, and orientation of the contigs can be assessed and the contigs combined into scaffolds.
We therefore used SSPACE to scaffold the contigs and evaluated the resulting scaffolds with the Mauve assembly metrics.
Name | NumContigs | NumAssemblyBases | NumMisCalled | NumUnCalled | NumGapsRef | NumGapsAssembly | TotalBasesMissed | PercBasesMissed | ExtraBases | PercExtraBases | BrokenCDS | IntactCDS | ContigN50 | MaxContigLength
CISA+SSPACE | 69 | 4625880 | 362 | 157 | 93 | 98 | 52261 | 1.1264 | 34643 | 0.7489 | 43 | 4277 | 126254 | 316040
Abyss+SSPACE | 101 | 4627104 | 393 | 735 | 114 | 119 | 54956 | 1.1845 | 33747 | 0.7293 | 57 | 4263 | 107040 | 268750
minimus2+SSPACE | 64 | 4608774 | 337 | 54 | 93 | 76 | 75502 | 1.6273 | 36021 | 0.7816 | 49 | 4271 | 150458 | 420117
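To put the scaffolding results in context, the sketch below computes the relative change in ContigN50 before and after SSPACE. The pairing of rows is an assumption: it takes the Set1 rows CISA_Set1, Abyss (k=31), and Minimus2 as the inputs to the three scaffolded assemblies, which the page does not state explicitly.

```python
# ContigN50 before (Set1 table) and after SSPACE (scaffold table).
# The before/after pairing is assumed, not stated in the source.
n50 = {
    "CISA": (115_197, 126_254),
    "Abyss": (96_157, 107_040),
    "Minimus2": (126_075, 150_458),
}


def percent_gain(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return 100 * (after - before) / before


for name, (before, after) in n50.items():
    print(f"{name}: N50 {before} -> {after} "
          f"({percent_gain(before, after):.1f}% higher after SSPACE)")
```

Under that pairing, all three assemblies gain contiguity from scaffolding, at the cost of more uncalled bases (NumUnCalled) introduced by the gaps between joined contigs.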