S. aureus 35 bp

Staphylococcus aureus strain MW2. The S. aureus MW2 genome consists of a circular chromosome of 2.82 Mb and a plasmid of 20.7 kb.


Read source

The sequence reads were taken from the supplemental data (supp data) of the publication by David Hernandez et al. (Genome Research 2008). The data set consists of 3.86 million 35-bp reads.

Sequence assembly

Software     Version   Parameters
ABySS        1.3.0     k=23
Velvet       1.1.04    k=23 min_contig_lgth=100 scaffolding=no
Edena        3         m=21
SOAPdenovo   1.05      K=23 M=3

Name of the merged file: Merged_ctg.fa

Contig integrator

All Contigs
Because minimus2 and GAA merge two assemblies at a time, we iteratively integrated the four assemblies in random order (A = ABySS, E = Edena, S = SOAPdenovo, V = Velvet).
minimus2: A_S_E_V, A_S_V_E, A_V_S_E, A_E_V_S, E_V_S_A, E_A_V_S, S_A_V_E, S_V_E_A, S_E_V_A, V_A_E_S
GAA: A_S_E_V, A_S_V_E, A_E_S_V, A_V_S_E, E_S_V_A, E_V_S_A, S_V_A_E, V_E_S_A, V_E_A_S, V_A_S_E
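The iterative pairwise integration described above can be sketched as follows. This is only an illustration, not the script used for the benchmark: the function names `sample_orders` and `merge_in_order`, the seed, and the `merge_two` callback (a stand-in for one minimus2 or GAA run) are hypothetical, and the letters are read as assembler initials.

```python
import itertools
import random
from functools import reduce

# Assumed reading of the order labels: each letter is an assembler initial.
ASSEMBLERS = {"A": "ABySS", "E": "Edena", "S": "SOAPdenovo", "V": "Velvet"}


def sample_orders(n, seed=0):
    """Pick n distinct merge orders out of the 4! = 24 possible permutations."""
    all_orders = ["_".join(p) for p in itertools.permutations(sorted(ASSEMBLERS))]
    return random.Random(seed).sample(all_orders, n)


def merge_in_order(order, merge_two):
    """Fold a two-input merger over the assemblies named in a label such as 'A_S_E_V'.

    merge_two(left, right) is a placeholder for a single pairwise merge;
    the result of each merge becomes the left input of the next one.
    """
    names = [ASSEMBLERS[letter] for letter in order.split("_")]
    return reduce(merge_two, names)


if __name__ == "__main__":
    # Show the nesting produced by one order, using a string-building stand-in.
    print(merge_in_order("A_S_E_V", lambda a, b: f"({a}+{b})"))
    print(sample_orders(10))
```

With a two-input merger, an order such as A_S_E_V therefore corresponds to three successive merges: ABySS with SOAPdenovo, that result with Edena, and that result with Velvet.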

The split references for MAIA and the integrated results can be downloaded here: Maia_Stapy_35.

Evaluation

  • Benchmark genome
Staphylococcus aureus strain MW2
  • Evaluated by Mauve Assembly Metrics, which provides the values in the columns to the left of "N50, Blast_IntactCDS"
How to score genome assemblies using the Mauve system (mauve_linux_snapshot_2011-08-31)
  • Evaluated by Blast with Features
  • Evaluated by GAGE, which provides the values in the columns to the right of "Blast_IntactCDS"
Gage
  • Score with Mauve Assembly Metrics, N50, Blast and GAGE:
Name NumContigs NumAssemblyBases DCJ_Distance NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50^ ContigN90 MaxContigLength N50^ Blast_IntactCDS Units(>200) N50^ cor.Units cor.N50^ Errors,(Indel>=5,Inv,Rel)
Abyss 929 2769174 898 41 0 796 601 95493 3.3611 8641 0.312 171 2480 7793 1635 32717 7810 2337 738 7467 740 7463 2,(0,0,2)
Edena 931 2757686 882 9 0 705 711 99275 3.4942 6966 0.2526 188 2463 6962 1672 37100 6969 2293 746 6778 748 6774 2,(0,0,2)
Soap 944 2781524 917 47 0 853 597 79607 2.802 11796 0.4241 166 2485 6386 1614 26967 6427 2348 780 6360 782 6348 3,(0,0,3)
Velvet 1152 2775301 1124 49 0 1010 866 89319 3.1438 15087 0.5436 230 2421 5337 1329 22892 5348 2238 941 5312 943 5312 2,(0,0,2)
CISA 665 2776108 635 50 0 571 388 79806 2.809 5651 0.2036 100 2551 10533 2460 42008 10605 2446 534 10423 536 10423 2,(0,0,2)
GAA# 1015 2783335 959 75 0 860 677 82390 2.8999 12532 0.4502 187 2464 6601 1547 29922 6634 2312 805 6489 807 6482 2,(0,0,2)
GAA* 1046 2794625 956 78 0 870 693 78662 2.76872 14929 0.53414 189 2463 6708 1521 30079 6750 2314 808 6598 810 6597 2,(0,0,2)
MAIA (split3) 3 2885095 3 2 16801 482 482 86882 3.058 122056 4.2306 180 2471 20697 20697 1435644 1428754 2403 - - - - -
MAIA (split3&n) 769 2776022 767 9 440 690 618 103881 3.6563 19111 0.6884 177 2474 8590 1929 51874 8610 2360 620 8489 619 8130 1,(0,0,1)
minimus2# 739 2770378 723 48 0 675 505 87738 3.0881 8900 0.3212 135 2516 8941 2047 35867 9006 2401 633 8770 635 8760 2,(0,0,2)
minimus2* 568 2769000 560 74 0 530 403 85757 3.0184 7241 0.26149 103 2548 10672 2586 42022 11094 2450 516 10549 518 10467 2,(0,0,2)

[^] Please note that the ContigN50 calculated by Mauve Assembly Metrics is incorrect (off-by-one error). To calculate the N50 values, we followed the definition of N50 (A contig N50 is calculated by first ordering every contig by length from longest to shortest. Next, starting from the longest contig, the lengths of each contig are summed, until this running sum equals one-half of the total length of all contigs in the assembly. The contig N50 of the assembly is the length of the shortest contig in this list. ref). As stated in GAGE, GAGE's N50 was calculated using the total reference genome length rather than the sum of the contig lengths. GAGE's cor.N50 values were computed after correcting contigs by breaking them at each error.

[#] Please note that GAA and minimus2 were designed to merge two assemblies at a time; we therefore performed all runs and took the average scores.

[*] Please note that the scores of minimus2 and GAA were taken from the average of ten random combinations (details).
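
The N50 definition quoted in note [^] can be sketched in a few lines. This is a minimal illustration rather than the code used to produce the table: the function name `n50` and its `total` parameter are hypothetical, and the running sum is treated as having reached the target once it equals or exceeds one-half of the total. Passing the reference genome length as `total` gives the GAGE-style variant described in the note.

```python
def n50(contig_lengths, total=None):
    """Return the N50 of a list of contig lengths.

    total: length against which the half-way point is measured.
           Defaults to the sum of the contig lengths; pass the reference
           genome length for a GAGE-style N50 (assumed interpretation).
    """
    lengths = sorted(contig_lengths, reverse=True)  # longest first
    target = (sum(lengths) if total is None else total) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= target:
            # Shortest contig needed to reach half of the total.
            return length
    return 0  # the contigs cover less than half of `total`


if __name__ == "__main__":
    contigs = [10, 5, 3, 2]
    print(n50(contigs))            # measured against the sum of contig lengths
    print(n50(contigs, total=40))  # measured against a hypothetical reference length
```

Because the reference length is usually larger than the assembled length, the reference-based value can only be equal to or smaller than the assembly-based one for the same set of contigs.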