H. volc

Revision as of 09 May 2012 19:08 by admin

Haloferax volcanii DS2

Contigs source

Three assemblies are available from the page "How to score genome assemblies using the Mauve system".


Sequence assembly

Assembly Description Download §
volc454 Sequenced by Roche on a GS FLX Titanium instrument using 454 pyrosequencing, yielding 25x read coverage. Reads were assembled into contigs with Newbler by Roche. assembly1.fasta
volcV Sequenced to 25x coverage using Illumina 100 nt read pairs with 500 nt inserts, plus 15x coverage of 50 nt Illumina mate-pairs with 6.5 kbp inserts. Both data types were generated by BGI. The assembly was constructed with Velvet using the insert size estimates given above and default parameters. No read error correction or quality trimming steps were performed. assembly2.fasta
volcIDBA Sequenced to 80x coverage using 76 nt read pairs with 300 nt inserts on an Illumina GAIIx instrument at UC Davis Genome Center, plus 2x coverage of 50 nt mate-pairs with 6.5 kbp inserts, sequenced at BGI. The reads were error corrected with REPTILE using default parameters, contigs were assembled with IDBA using the custom parameters --mink 33 --maxk 78 and everything else default, and scaffolding was done with SSPACE using the custom parameter -a 0.5 and everything else default. assembly3.fasta
§ The files contain contigs only; scaffolds were split wherever a run of more than 10 Ns occurs.
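The splitting rule in the footnote can be sketched as follows. This is a minimal illustration, not the script used to produce the downloads: the function name split_scaffold is hypothetical, and it assumes "more than 10 Ns" means a run of at least 11 consecutive N characters (upper or lower case).

```python
import re

# Assumption: a gap is a run of MORE than 10 Ns, i.e. 11 or more in a row.
GAP = re.compile(r"[Nn]{11,}")

def split_scaffold(seq):
    """Cut a scaffold sequence at every long run of Ns and return the contigs."""
    # re.split can yield empty strings at the ends; drop them.
    return [piece for piece in GAP.split(seq) if piece]

# A run of exactly 10 Ns is kept inside the contig; a run of 11 splits it.
scaffold = "ACGT" + "N" * 10 + "GGCC" + "N" * 11 + "TTAA"
for contig in split_scaffold(scaffold):
    print(len(contig), contig)
```

Shorter runs of Ns stay inside a contig, which is one reason NumUnCalled can be non-zero for the split files.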

Merged File: Mauve_Contigs.fa

  • Scored with Mauve metrics:
Name NumContigs NumAssemblyBases NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength
volc454 157 3920004 90 0 141 128 119928 2.9886 1818 0.0464 30 3908 123582 11735 217295
volcV 1394 4394403 925 30431 1848 1739 161388 4.0217 503214 11.4512 505 3433 843300 57 1354539
volcIDBA 367 3880100 1209 5884 999 991 155857 3.8839 22465 0.579 442 3496 19349 5537 99636
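For readers unfamiliar with the ContigN50 and ContigN90 columns, the sketch below computes them under the common definition: the length of the contig at which the cumulative length, taken from longest to shortest, first reaches 50% or 90% of the total. Whether Mauve uses exactly this definition is an assumption here, and the function name nxx and the toy lengths are illustrative only.

```python
def nxx(lengths, percent):
    """Return the Nxx contig length (e.g. percent=50 for N50).

    Contigs are visited from longest to shortest; the result is the length
    of the contig at which the running total first reaches `percent` of
    the summed length. Integer arithmetic avoids rounding surprises.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length
    raise ValueError("no contigs given")

# Toy contig lengths (not taken from the assemblies above).
toy = [80, 70, 50, 40, 30, 20, 10]
print("N50:", nxx(toy, 50))
print("N90:", nxx(toy, 90))
```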

Contig integrator

Integrator Download
CISA CISA
minimus2 minimus2 (1,2,3)
minimus2 minimus2 (1,3,2)
minimus2 minimus2 (2,1,3)
minimus2 minimus2 (2,3,1)
minimus2 minimus2 (3,1,2)
minimus2 minimus2 (3,2,1)

Since minimus2 can only merge two assemblies at a time, we applied it iteratively to integrate more than two assemblies. For H. volc we tested every merge order, which was feasible because only three assemblies were available.
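The iterative scheme can be sketched as a left fold over a pairwise merge, repeated for every ordering of the inputs. The function merge_pair below is a hypothetical stand-in that only records the merge order; it does not call minimus2. The sketch also assumes that a label such as (1,2,3) means "merge 1 with 2, then merge the result with 3".

```python
from functools import reduce
from itertools import permutations

def merge_pair(a, b):
    """Hypothetical placeholder for one pairwise minimus2 run.

    It returns a string describing the merge instead of running any tool.
    """
    return f"({a}+{b})"

def merge_in_order(order):
    """Fold the pairwise merge over the assemblies in the given order."""
    return reduce(merge_pair, order)

assemblies = [1, 2, 3]
orders = list(permutations(assemblies))  # 3! = 6 orderings

for order in orders:
    print(order, "->", merge_in_order(order))
```

With three inputs this yields the six orderings listed in the integrator table; the number of orderings grows factorially, so exhaustive testing quickly becomes impractical for more assemblies.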

Evaluation

  • Benchmark genome
Haloferax_volcanii_DS2.gbk
  • Evaluate by Mauve Assembly Metrics
How to score genome assemblies using the Mauve system
  • Scored with Mauve metrics:
Name NumContigs NumAssemblyBases DCJ_Distance NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength Blast_IntactCDS
assembly1 157 3920004 121 90 0 141 128 119928 2.9886 1818 0.0464 30 3908 123582 11735 217295 3894
assembly2 1555 3855484 1680 762 1527 1541 1542 197907 4.9318 16765 0.4348 450 3488 9037 1862 55518 3124
assembly3 580 3871717 611 988 548 1080 989 162341 4.0455 20132 0.52 438 3500 12787 3473 53121 3362
CISA 72 4041410 76 187 39 140 129 199738 4.9774 119648 2.9606 42 3896 107315 23789 222317 3867
minimus2(1,2) 88 4044409 94 571 545 231 121 86680 2.16 112937 2.7924 128 3810 137837 20480 289094 3915
minimus2(1,3) 85 3942969 87 160 860 445 162 158935 3.9606 85484 2.168 214 3724 150060 17200 312972 3861
minimus2(2,1) 87 4044428 91 569 547 227 124 89124 2.2209 112937 2.7924 133 3805 137837 20480 289095 3913
minimus2(2,3) 363 4517347 407 1180 1335 963 700 397765 9.9122 862299 19.0886 434 3504 22506 6499 72168 3622
minimus2(3,1) 85 3942775 91 202 834 447 166 159470 3.9739 87592 2.2216 215 3723 150060 17200 312956 3859
minimus2(3,2) 363 4517334 405 952 1323 951 695 405157 10.0964 869696 19.2524 427 3511 22508 6499 72168 3622
minimus2(1,2,3) 65 4087988 73 530 1041 496 153 95469 2.3791 155563 3.8054 253 3685 171050 22079 342018 3907
minimus2(1,3,2) 71 4178001 80 551 964 531 170 199565 4.9731 337543 8.0791 258 3680 169924 21429 341963 3897
minimus2(2,1,3) 65 4089672 74 455 1061 505 157 97620 2.4327 171785 4.2005 259 3679 171050 21825 342018 3907
minimus2(2,3,1) 75 4296848 77 387 483 370 137 250571 6.2441 468713 10.9083 142 3796 146572 27186 312727 3854
minimus2(3,1,2) 71 4178049 82 574 927 533 175 198653 4.9504 337599 8.0803 262 3676 169924 21429 342039 3896
minimus2(3,2,1) 78 4341081 84 762 493 390 165 244751 6.0991 495987 11.4254 145 3793 137147 24300 312741 3858
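As a reading aid for the %Extra column, the values in the table above appear consistent with ExtraBases divided by NumAssemblyBases, expressed as a percentage. This relationship is inferred from the numbers and is not stated on the page, so treat it as an assumption. The sketch recomputes it for four rows copied from the table; the function name percent_extra is illustrative.

```python
# (ExtraBases, NumAssemblyBases, reported %Extra), copied from the table above.
rows = {
    "assembly1": (1818, 3920004, 0.0464),
    "assembly2": (16765, 3855484, 0.4348),
    "assembly3": (20132, 3871717, 0.52),
    "CISA": (119648, 4041410, 2.9606),
}

def percent_extra(extra_bases, assembly_bases):
    """Assumed definition: extra bases as a percentage of the assembly size."""
    return 100.0 * extra_bases / assembly_bases

for name, (extra, total, reported) in rows.items():
    computed = percent_extra(extra, total)
    print(f"{name}: computed {computed:.4f}, reported {reported}")
```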


NumContigs NumAssemblyBases DCJ_Distance NumDCJBlocks NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed PercBasesMissed ExtraBases PercExtraBases BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength
117 4008739 79 81 114 2962 175 153 417408 10.4017 152882 3.8137 62 3876 146504 23521 338557
56 6860449 78 83 565 12302 442 258 818184 20.3888 3517672 51.2747 194 3744 230346 44977 1354541