E. coli 100 bp

Revision as of 15 September 2012 00:31 by admin

Escherichia coli K12 MG1655. The E. coli MG1655 genome consists of a circular chromosome 4,639,675 bp in length.

Read source

The paired-end Illumina read data of E. coli were downloaded from Illumina and have a median insert size of 214 bp. The data set contains more than 28.4 M reads.
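As a rough illustration of what this read count implies, the sketch below estimates the average sequencing depth. The read length of 100 bp is an assumption taken from the page title, and 28.4 M is treated as a lower bound, so the result is only an approximation.

```python
# Rough sequencing-depth estimate for the E. coli MG1655 data set.
GENOME_LENGTH_BP = 4_639_675   # chromosome length stated on this page
NUM_READS = 28_400_000         # "more than 28.4 M reads", used as a lower bound
READ_LENGTH_BP = 100           # assumption: read length taken from the page title


def estimated_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Return the average depth: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length


coverage = estimated_coverage(NUM_READS, READ_LENGTH_BP, GENOME_LENGTH_BP)
print(f"Estimated coverage: at least {coverage:.0f}x")  # about 612x
```

Under these assumptions the data set is very deep for a genome of this size.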

Sequence assembly

Software    Version  Parameters                     Download
ABySS       1.3.0    k=75                           Abyss
Velvet      1.1.04   VelvetOptimiser --s 59 --e 97  Velvet
Edena       3        m=75                           Edena
SOAPdenovo  1.05     k=75 M=3 avg_ins=215           SOAPdenovo

Merged File: E100_Contigs

Contig integrator

Integrator  Download
CISA        CISA
MAIA        maia_ecoli_100bp
minimus2    minimus2(AEVS), minimus2(ASEV), minimus2(ASVE), minimus2(ESAV), minimus2(ESVA), minimus2(SEAV), minimus2(SEVA), minimus2(SVAE), minimus2(VASE), minimus2(VEAS)
GAA         GAA(AESV), GAA(AEVS), GAA(ASEV), GAA(EASV), GAA(EAVS), GAA(ESAV), GAA(EVAS), GAA(EVSA), GAA(VAES), GAA(VASE)

Because minimus2 and GAA merge only two assemblies at a time, we iteratively integrated the four assemblies in random order.
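The iterative procedure can be sketched as follows. This is only an illustration under several assumptions: the letters A, E, S, and V are taken to stand for ABySS, Edena, SOAPdenovo, and Velvet; each four-letter label is read as a left-to-right merge order; and `merge_pair` is a hypothetical placeholder for one pairwise minimus2 or GAA run, not a call into either tool.

```python
import itertools
import random
from functools import reduce

# Assumed meaning of the one-letter codes used in the download labels.
ASSEMBLIES = {"A": "ABySS", "E": "Edena", "S": "SOAPdenovo", "V": "Velvet"}


def merge_pair(left: str, right: str) -> str:
    """Hypothetical stand-in for one pairwise merge.

    A real run would invoke the integrator on two contig files; here the
    function only records which inputs were merged, as a nested string.
    """
    return f"({left}+{right})"


def integrate(order: str) -> str:
    """Fold the assemblies left to right: ((first+second)+third)+fourth."""
    return reduce(merge_pair, order)


def sample_orders(n_orders: int, seed: int = 0) -> list:
    """Draw n_orders distinct merge orders from the 24 possible permutations."""
    all_orders = ["".join(p) for p in itertools.permutations(sorted(ASSEMBLIES))]
    return random.Random(seed).sample(all_orders, n_orders)


orders = sample_orders(10)
for order in orders:
    print(order, integrate(order))
```

For example, `integrate("AEVS")` yields `(((A+E)+V)+S)`: the first two assemblies are merged, and each later assembly is merged into the running result.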

Evaluation

  • Benchmark genome
Escherichia coli K12 MG1655
  • Evaluated by Mauve Assembly Metrics
How to score genome assemblies using the Mauve system
  • Evaluated by Blast with Features
  • Evaluated by GAGE
GAGE
  • Score with Mauve metrics:
Name NumContigs NumAssemblyBases DCJ_Distance NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength Blast_IntactCDS Units(>200) N50 cor.Units cor.N50 Errors,(Indel>=5,Inv,Rel)
Abyss 190 4642915 119 125 75 147 128 50280 1.0837 23618 0.5087 76 4244 66424 20680 222412 4243 135 68544 134 68545 3,(0,0,3)
Edena 421 4584984 331 22 0 281 271 86700 1.8687 6753 0.1473 145 4175 24375 6689 104739 4048 377 24375 373 24375 3,(1,1,1)
SOAPdenovo 560 4596003 272 100 0 245 256 112014 2.4143 6408 0.1394 111 4209 31788 7901 105615 4105 356 31837 358 31837 3,(1,0,2)
Velvet 264 4569720 191 127 33 204 210 102966 2.2193 3936 0.0861 92 4228 46975 11116 161713 4159 228 46975 241 44024 16,(4,7,5)
CISA 110 4641820 106 156 64 153 120 44382 0.9566 39138 0.8432 69 4251 77896 25720 222524 4248 106 79212 118 70571 12,(3,4,5)
MAIA (split3) 3 4732065 3 109 10520 152 150 412160 8.8834 507027 10.7147 94 4226 1420571 1420571 1861168 3886 - - - - -
MAIA (split3&n) 263 4351338 239 129 12 242 236 422260 9.1011 20759 0.4771 94 4226 47744 11114 161713 3882 232 45905 206 44024 1,(0,0,1)
GAA* 311 4636486 222 105 39 216 197 59416 1.2806 22776 0.4906 107 4213 48108 13772 162326 4152 261 50871 248 47458 6,(3,1,2)
minimus2* 94 4588207 88 233 0 129 103 322633 6.9538 209679 4.5169 76 4244 86379 27964 225809 4077 94 88183 107 68430 13,(6,3,4)

[#] GAA and minimus2 were designed to merge two assemblies at a time; we therefore performed all runs and took the average scores.
[*] The scores of minimus2 and GAA are averages over ten random combinations.
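Several columns of the score table can be sanity-checked with a few lines of code. The sketch below rests on assumptions: `nx` implements the common N50/N90 definition, which may differ in detail from what Mauve or GAGE compute, and the denominators used for %Missed (reference length) and %Extra (assembly bases) are inferred from the fact that they reproduce the table values, not from the tools' documentation.

```python
REFERENCE_LENGTH_BP = 4_639_675  # E. coli K12 MG1655 chromosome length from this page


def nx(contig_lengths, fraction=0.5):
    """Common Nx definition: the contig length L such that contigs of
    length >= L together cover at least `fraction` of the assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    threshold = fraction * sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running >= threshold:
            return length
    raise ValueError("contig_lengths must not be empty")


def percent(part, whole):
    """Return part / whole as a percentage rounded to four decimals."""
    return round(100.0 * part / whole, 4)


def errors_consistent(field):
    """Check that an 'Errors,(Indel>=5,Inv,Rel)' cell such as '16,(4,7,5)'
    has a total equal to the sum of its three components."""
    total, parts = field.split(",", 1)
    return int(total) == sum(int(x) for x in parts.strip("()").split(","))


# ABySS row: TotalBasesMissed = 50280, ExtraBases = 23618, NumAssemblyBases = 4642915
print(percent(50280, REFERENCE_LENGTH_BP))  # %Missed, table value 1.0837
print(percent(23618, 4_642_915))            # %Extra, table value 0.5087
print(errors_consistent("16,(4,7,5)"))      # Velvet row

# Toy contig lengths (not from the table) to show the Nx computation.
print(nx([10, 8, 6, 4, 2]), nx([10, 8, 6, 4, 2], fraction=0.9))
```

With the toy lengths, half of the 30 assembled bases is reached at the second contig, so N50 is 8, and 90% is reached at the fourth contig, so N90 is 4.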