S. aureus 101 bp

Staphylococcus aureus strain USA300_TCH1516. The S. aureus USA300_TCH1516 genome consists of a circular chromosome of 2,872,915 bp and two plasmids of 3,125 bp and 27,041 bp, respectively.

Read source

The paired-end Illumina read data for S. aureus, with a median insert size of 180 bp, were downloaded from GAGE. More than 1.2 M reads

Sequence assembly

Software    Version  Parameters
ABySS       1.3.0    k=41
Velvet      1.1.04   VelvetOptimiser --s 29 --e 97
Edena       3        m=41
SOAPdenovo  1.05     k=41 M=3 avg_ins=170

Name of the merged file: Merged_ctg.fa

Contig integrator

All Contigs
Because minimus2 and GAA merge two assemblies at a time, we iteratively integrate the four assemblies in random order.
minimus2: A_E_V_S, E_V_S_A, E_S_V_A, E_V_A_S, S_V_E_A, S_A_V_E, S_E_A_V, V_A_S_E, V_E_A_S, V_S_E_A
GAA: A_S_E_V, A_V_E_S, A_E_V_S, E_A_V_S, E_V_S_A, S_E_A_V, S_V_A_E, V_S_A_E, V_A_E_S, V_A_S_E
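The order labels above can be read as a left-to-right chain of pairwise merges. The following is a minimal sketch of that idea, not the scripts actually used: it assumes the letters A, E, S and V stand for ABySS, Edena, SOAPdenovo and Velvet, that each label is merged left to right, and it uses a hypothetical `fake_merge` function in place of a real minimus2 or GAA run.

```python
import itertools
import random

# Assumed meaning of the one-letter labels used in the order names.
ASSEMBLIES = {"A": "ABySS", "E": "Edena", "S": "SOAPdenovo", "V": "Velvet"}


def sample_orders(n, seed=0):
    """Pick n distinct merge orders (e.g. 'A_E_V_S') out of the 24 possible."""
    all_orders = ["_".join(p) for p in itertools.permutations(sorted(ASSEMBLIES))]
    return sorted(random.Random(seed).sample(all_orders, n))


def integrate(order, merge_pair):
    """Fold the assemblies named in `order` with a two-input merge function."""
    labels = order.split("_")
    merged = labels[0]
    for nxt in labels[1:]:
        merged = merge_pair(merged, nxt)  # output of one step feeds the next
    return merged


def fake_merge(left, right):
    """Hypothetical stand-in for one pairwise merger run; only records the nesting."""
    return f"({left}+{right})"


orders = sample_orders(10)
print(len(orders))                       # 10
print(integrate("A_E_V_S", fake_merge))  # (((A+E)+V)+S)
```

In a real run, `merge_pair` would call the merger on two contig files and return the path of the merged output.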

The split references for MAIA and the integrated results can be downloaded here: maia_staphy_100.

Evaluation

  • Benchmark genome
S. aureus USA300_TCH1516
  • Evaluated by Mauve Assembly Metrics to calculate the values in the columns to the left of "N50, Blast_IntactCDS"
How to score genome assemblies using the Mauve system (mauve_linux_snapshot_2011-08-31)
  • Evaluated by Blast with Features
  • Evaluated by GAGE to calculate the values in the columns to the right of "Blast_IntactCDS"
GAGE
  • Score with Mauve Assembly Metrics, N50, Blast and GAGE:
Name NumContigs NumAssemblyBases DCJ_Distance NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50^ ContigN90 MaxContigLength N50^ Blast_IntactCDS Units(>200) N50^ cor.Units cor.N50^ Errors,(Indel>=5,Inv,Rel)
ABySS 659 2854631 590 132 6 436 499 70014 2.4117 7077 0.2479 207 2486 9223 2512 35459 9229 2305 548 9154 554 9115 7,(2,0,5)
Edena 3287 2557545 3143 224 1 2957 3033 390614 13.4552 15975 0.6246 784 1909 1256 359 8680 1256 1053 2679 1073 2680 1072 4,(1,0,3)
SOAPdenovo 674 2872327 522 85 1 482 437 71040 2.4471 10463 0.3643 154 2539 9626 3069 47607 9762 2361 509 9626 511 9626 3,(2,0,1)
Velvet 502 2858949 432 153 12 377 386 69466 2.3928 7682 0.2687 137 2556 12962 3811 54726 13005 2421 405 12685 422 12217 19,(10,4,5)
CISA 347 2866024 330 266 15 316 278 60323 2.0779 11406 0.398 121 2572 14992 4664 54747 15327 2482 322 14916 343 14743 20,(6,5,9)
GAA# 1287 2798306 1166 191 6 1069 1084 144098 4.9636 12985 0.4691 322 2371 8319 2459 36637 8374 2045 1035 8209 1040 8095 9,(4,1,4)
GAA* 1150 2827068 1022 219 7 970 952 123336 4.24845 17517 0.6216 292 2401 8977 2614 38358 9026 2123 919 8943 925 8835 10,(4,2,4)
MAIA (split4) 4 2924771 6 127 6767 504 505 87938 3.0291 115881 3.9621 146 2547 1426408 5502 1464437 1464437 2390 - - - - -
MAIA (split4&n) 505 2859291 498 105 120 478 407 103565 3.5674 24695 0.8637 141 2552 12570 3840 52790 12800 2376 404 12469 401 11838 2,(1,0,1)
minimus2# 421 2863142 399 206 1 359 343 70296 2.4214 15467 0.5400 142 2552 13049 3951 50951 13159 2425 396 13042 408 12725 13,(5,2,6)
minimus2* 302 2852733 302 276 1 308 272 90585 3.12 26239 0.92 114 2579 16577 5117 54766 16835 2468 299 16473 319 15528 22,(6,7,9)
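As a reading aid for the two percentage columns: the ABySS row is numerically consistent with %Missed being TotalBasesMissed divided by the total reference length (chromosome plus both plasmids) and %Extra being ExtraBases divided by NumAssemblyBases. This is an inference from the numbers in the table, not a documented Mauve formula, and the helper name `percent` is made up for this sketch.

```python
# Total reference length: chromosome plus the two plasmids.
REFERENCE_BP = 2_872_915 + 3_125 + 27_041


def percent(part, whole):
    """Return part/whole as a percentage rounded to four decimals."""
    return round(100 * part / whole, 4)


# Values copied from the ABySS row of the table.
print(percent(70_014, REFERENCE_BP))  # 2.4117 (matches %Missed)
print(percent(7_077, 2_854_631))      # 0.2479 (matches %Extra)
```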

[^] Please note that the ContigN50 calculated by Mauve Assembly Metrics is incorrect (off-by-one error). To calculate the N50 values, we followed the definition of N50: a contig N50 is calculated by first ordering every contig by length from longest to shortest. Next, starting from the longest contig, the lengths of the contigs are summed until this running sum reaches or exceeds one-half of the total length of all contigs in the assembly. The contig N50 of the assembly is the length of the shortest contig in this list (ref). As stated in GAGE, GAGE's N50 was calculated using the total reference genome length rather than the sum of the contig lengths. GAGE's cor.N50 values were computed after correcting contigs by breaking them at each error.
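The N50 definition in the note above can be sketched in a few lines. This is an illustrative implementation, not the code used to produce the table; the function name `n50`, the optional `total` argument and the toy contig lengths are assumptions made for the example. Passing the reference genome length as `total` gives a GAGE-style N50.

```python
def n50(lengths, total=None):
    """Return the N50 of a list of contig lengths.

    `total` is the length whose half must be covered. It defaults to the
    sum of the contig lengths; pass the reference genome length instead
    for a GAGE-style N50. Returns 0 if the contigs never cover half of it.
    """
    if total is None:
        total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):  # longest contig first
        running += length
        if 2 * running >= total:  # running sum has reached half of total
            return length         # shortest contig included so far
    return 0


contigs = [80, 70, 50, 40, 30, 20]  # toy lengths, not taken from the table
print(n50(contigs))             # 70: half of 290 is 145, and 80 + 70 = 150
print(n50(contigs, total=400))  # 50: half of 400 is 200, and 80 + 70 + 50 = 200
```

The comparison uses "reaches or exceeds" because the running sum rarely lands exactly on one-half of the total.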

[#] Please note that GAA and minimus2 were designed to merge two assemblies at a time; we therefore performed all runs and took the average scores.

[*] Please note that the scores of minimus2 and GAA were taken from the average of ten random combinations (details).