E. coli

Revision as of 09 December 2011 23:23 by admin

Escherichia coli K12 MG1655

Read source

The Illumina read data for E. coli (paired-end sequencing library with 200 bp inserts) were downloaded from the Sequence Read Archive (SRA).

Sequence assembly

  • Set1 (Different Assemblers)
Software | Version | Parameters | Download
ABySS | 1.3.0 | k=31 | Abyss
Velvet | 1.1.04 | k=29 ins_length=215 cov_cutoff=12 exp_cov=24 min_contig_lgth=100 scaffolding=no | Velvet
Edena | 3 | m=30 | Edena
SOAPdenovo | 1.05 | K=29 M=3 | SOAPdenovo
CLC | 4.7.2 | insert_size_range=194,236 minimum_contig_length=100 | CLC

Merged File: Set1_Contig
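A merged contig file such as Set1_Contig can be thought of as the concatenation of each assembler's contigs, with headers made unique per assembler. The following is only a minimal sketch of that idea, not the script actually used to build the files on this page; the function name `merge_contigs`, the label prefixes, and the file names in the usage note are hypothetical.

```python
def merge_contigs(assemblies, out_path):
    """Concatenate contig FASTA files into one file.

    `assemblies` maps an assembler label (hypothetical, e.g. "Abyss") to the
    path of its contig FASTA file. Each header is rewritten as
    ">{label}_{original name}" so that contig names stay unique after merging.
    Returns the number of contigs written.
    """
    total = 0
    with open(out_path, "w") as out:
        for label, fasta_path in assemblies.items():
            with open(fasta_path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        total += 1
                        # Keep only the first word of the original header.
                        parts = line[1:].split()
                        name = parts[0] if parts else str(total)
                        out.write(f">{label}_{name}\n")
                    elif line.strip():
                        # Sequence line: make sure it ends with a newline.
                        out.write(line if line.endswith("\n") else line + "\n")
    return total
```

For example, `merge_contigs({"Abyss": "abyss.fa", "Velvet": "velvet.fa"}, "Set1_Contig.fa")` would write every contig from both (hypothetically named) files into a single FASTA file and return the contig count.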

  • Set2 (Different parameters for ABySS, the assembler that produced the fewest contigs in Set1)
Abyss parameter Download
k=29 Abyss_k29
k=31 Abyss_k31
k=33 Abyss_k33

Merged File: Set2_Contig

  • Set3 (Different parameters for SOAPdenovo, the assembler that produced the most contigs in Set1)
SOAPdenovo parameter Download
k=29 SOAP_k29
k=31 SOAP_k31
k=33 SOAP_k33

Merged File: Set3_Contig


Contig integrator

  • CISA
Input Download
Set1 CISA_Set1
Set2 CISA_Set2
Set3 CISA_Set3
Set2+Set3 CISA_Set2&3
  • minimus2
Input Download
Set1 minimus2_Set1


Evaluation

  • Benchmark genome
Escherichia coli K12 MG1655
  • Evaluated with Mauve Assembly Metrics
How to score genome assemblies using the Mauve system
  • Scored with Mauve metrics:

- Set1

Name NumContigs NumAssemblyBases NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed PercBasesMissed ExtraBases PercExtraBases BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength
Abyss 133 4626205 334 69 123 119 57847 1.2468 29424 0.636 57 4263 96157 26096 222425
CLC 379 4546926 100 0 288 287 130550 2.8138 3405 0.0749 62 4258 29767 8447 107342
Edena 211 4569446 17 0 129 125 86780 1.8704 2078 0.0455 66 4254 54405 13642 186686
SOAPdenovo 553 4547211 36 0 461 412 124407 2.6814 6972 0.1533 100 4220 17902 5384 103369
Velvet 283 4550675 138 0 208 203 116542 2.5119 2783 0.0612 74 4246 52474 12537 166094
CISA_Set1 81 4625471 229 73 92 93 54849 1.1822 32166 0.6954 46 4274 113510 29195 268608
Minimus2 74 4608653 285 0 97 78 76881 1.657 35464 0.7695 50 4270 126075 34542 417704
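The ContigN50 and ContigN90 columns summarize contig contiguity. Under the usual definition (assumed here; Mauve's exact implementation is not described on this page), N50 is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches 50% of all assembled bases, and N90 uses 90%. A minimal sketch with made-up contig lengths, using a hypothetical helper `contig_nx`:

```python
def contig_nx(lengths, percent):
    """Return the Nx contig length for an integer `percent` (e.g. 50 for N50).

    Contigs are visited from longest to shortest; the result is the length of
    the contig at which the cumulative length first reaches `percent` percent
    of the total. Integer arithmetic avoids floating-point threshold issues.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length


# Illustrative lengths only; these are not taken from any assembly above.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = contig_nx(toy_lengths, 50)
n90 = contig_nx(toy_lengths, 90)
print("N50 =", n50, "N90 =", n90)
```

With this definition a larger N50 at a similar total assembly size indicates that more of the assembly sits in long contigs, which is why the integrated assemblies (CISA_Set1, Minimus2) stand out in the table above.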

- Set2-Set3

Name NumContigs NumAssemblyBases NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed PercBasesMissed ExtraBases PercExtraBases BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength
Abyss_k29 130 4634010 322 30 118 115 61835 1.3327 40405 0.8719 54 4266 95691 26567 268182
Abyss_k31 133 4626205 334 69 123 119 57847 1.2468 29424 0.636 57 4263 96157 26096 222425
Abyss_k33 135 4644184 354 338 139 119 66355 1.4302 44937 0.9676 78 4242 89001 24907 268398
CISA_Set2 105 4635199 332 130 117 103 55567 1.1976 39517 0.8525 63 4257 113377 27272 222663
SOAP_k29 1373 4582756 48 0 466 415 124372 2.6806 7247 0.1581 100 4220 17892 5276 103369
SOAP_k31 1295 4583165 56 0 510 466 121606 2.621 9201 0.2008 121 4199 17003 4286 77302
SOAP_k33 2170 4608265 105 0 1470 1380 126273 2.7216 41165 0.8933 507 3813 5391 1449 22953
CISA_Set3 465 4546819 117 0 402 366 133247 2.8719 19266 0.4237 95 4225 21543 6065 103369
CISA_Set2&3 105 4636783 351 160 118 104 54999 1.1854 39905 0.8606 60 4260 113377 27272 222663
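The percentage columns appear to follow directly from the count columns: the values are consistent with PercBasesMissed = 100 × TotalBasesMissed / reference length and PercExtraBases = 100 × ExtraBases / NumAssemblyBases. This relationship is inferred from the numbers, not taken from the Mauve documentation, and the reference length of 4,639,675 bp used below is an assumption (it is not stated on this page). A small sketch that recomputes both percentages for three rows copied from the table:

```python
# Assumed reference genome length in bp; not stated on this page.
REFERENCE_LENGTH = 4_639_675


def perc_bases_missed(total_bases_missed, reference_length=REFERENCE_LENGTH):
    """Missed bases as a percentage of the (assumed) reference length."""
    return round(100 * total_bases_missed / reference_length, 4)


def perc_extra_bases(extra_bases, num_assembly_bases):
    """Extra bases as a percentage of the assembly size."""
    return round(100 * extra_bases / num_assembly_bases, 4)


# name: (NumAssemblyBases, TotalBasesMissed, PercBasesMissed,
#        ExtraBases, PercExtraBases), copied from the table above.
rows = {
    "Abyss_k31": (4626205, 57847, 1.2468, 29424, 0.636),
    "CISA_Set2": (4635199, 55567, 1.1976, 39517, 0.8525),
    "SOAP_k33": (4608265, 126273, 2.7216, 41165, 0.8933),
}

for name, (asm_bases, missed, pct_missed, extra, pct_extra) in rows.items():
    recomputed_missed = perc_bases_missed(missed)
    recomputed_extra = perc_extra_bases(extra, asm_bases)
    print(name, recomputed_missed, pct_missed, recomputed_extra, pct_extra)
```

Recomputing the percentages this way is a quick sanity check when rows are copied between tables or when a new assembly is added to the comparison.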