H. volc

Revision as of 11 September 2012 03:10 by admin

Haloferax volcanii DS2

Contigs source

Three assemblies are available from the page "How to score genome assemblies using the Mauve system".


Sequence assembly

Assembly Description
volc454 Sequenced by Roche using 454 pyrosequencing on a GS FLX Titanium instrument to 25x read coverage. Roche assembled the reads into contigs with Newbler.
volcV Sequenced to 25x coverage using Illumina 100 nt read pairs with 500 nt inserts, plus 15x coverage of 50 nt Illumina mate-pairs with 6.5 kbp inserts. Both data types were generated by BGI. The assembly was constructed with Velvet using the insert size estimates given above and default parameters. No read error correction or quality trimming was performed.
volcIDBA Sequenced to 80x coverage with 76 nt read pairs with 300 nt inserts on an Illumina GAIIx instrument at the UC Davis Genome Center, plus 2x coverage of 50 nt mate-pairs with 6.5 kbp inserts sequenced at BGI. The reads were error corrected with REPTILE using default parameters, contigs were assembled with IDBA using the custom parameters --mink 33 --maxk 78 and everything else default, and scaffolds were built with SSPACE using the custom parameter -a 0.5 and everything else default.
  • Scored with Mauve metrics:
Name NumContigs NumAssemblyBases NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength
volc454 157 3920004 90 0 141 128 119928 2.9886 1818 0.0464 30 3908 123582 11735 217295
volcV 1394 4394403 925 30431 1848 1739 161388 4.0217 503214 11.4512 505 3433 843300 57 1354539
volcIDBA 367 3880100 1209 5884 999 991 155857 3.8839 22465 0.579 442 3496 19349 5537 99636
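
To compare the three assemblies on a single number, the table above can be read programmatically. The following is a minimal sketch, assuming the table is kept as whitespace-delimited text exactly as shown; the function names `parse_metrics` and `intact_cds_fraction` are illustrative and are not part of Mauve.

```python
# Rows copied verbatim from the Mauve metrics table above.
TABLE = """\
Name NumContigs NumAssemblyBases NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength
volc454 157 3920004 90 0 141 128 119928 2.9886 1818 0.0464 30 3908 123582 11735 217295
volcV 1394 4394403 925 30431 1848 1739 161388 4.0217 503214 11.4512 505 3433 843300 57 1354539
volcIDBA 367 3880100 1209 5884 999 991 155857 3.8839 22465 0.579 442 3496 19349 5537 99636
"""


def parse_metrics(text):
    """Parse a whitespace-delimited metrics table into {name: {column: value}}."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(fields)}")
        # The first column is the assembly name; the rest are numeric.
        rows[fields[0]] = {
            key: float(value) for key, value in zip(header[1:], fields[1:])
        }
    return rows


def intact_cds_fraction(row):
    """Fraction of scored CDS that are intact: IntactCDS / (IntactCDS + BrokenCDS)."""
    return row["IntactCDS"] / (row["IntactCDS"] + row["BrokenCDS"])


metrics = parse_metrics(TABLE)
for name, row in metrics.items():
    print(f"{name}: {intact_cds_fraction(row):.4f} of CDS intact")
```

In all three rows BrokenCDS and IntactCDS sum to the same total, so the fraction is directly comparable across assemblies.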

Contig integrator

All Contigs

Because minimus2 can merge only two assemblies at a time, we applied it iteratively to integrate more than two. Only three assemblies were available for H. volcanii, so we tested all six merge orders with minimus2.
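
The enumeration of merge orders can be sketched as follows. This is an illustration only: `merge_pair` is a hypothetical placeholder that records the merge order and does not run minimus2, and reading a label such as minimus2(1,2,3) as "merge 1 with 2, then merge the result with 3" is an assumption rather than something this page states.

```python
from functools import reduce
from itertools import permutations


def merge_pair(left, right):
    """Hypothetical stand-in for one pairwise merge.

    A real pipeline would run the merger on two contig sets here;
    this placeholder only records which inputs were combined.
    """
    return f"merge({left},{right})"


def iterative_merge(order):
    """Fold a sequence of assemblies left to right, two at a time."""
    return reduce(merge_pair, order)


assemblies = ["1", "2", "3"]

# Three assemblies give 3! = 6 possible merge orders.
orders = list(permutations(assemblies))
plans = {order: iterative_merge(order) for order in orders}

for order, plan in plans.items():
    print(",".join(order), "->", plan)
```

With more assemblies the number of orders grows factorially, which is why exhaustive testing is practical only for small sets such as this one.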

Evaluation

  • Benchmark genome
Haloferax_volcanii_DS2.gbk
  • Evaluate by Mauve Assembly Metrics
How to score genome assemblies using the Mauve system
  • Scored with Mauve metrics:
Name NumContigs NumAssemblyBases DCJ_Distance NumMisCalled NumUnCalled NumGapsRef NumGapsAssembly TotalBasesMissed %Missed ExtraBases %Extra BrokenCDS IntactCDS ContigN50 ContigN90 MaxContigLength Blast_IntactCDS Units(>200) N50 cor.Units cor.N50 Errors,(Indel>=5,Inv,Rel)
Hvolc.454 157 3920004 121 90 0 141 128 119928 2.9886 1818 0.0464 30 3908 123582 11735 217295 3953 145 123582 137 121280 8,(3,0,5)
Hvolc.V 1555 3855484 1680 762 1527 1541 1542 197907 4.9318 16765 0.4348 450 3488 9037 1862 55518 3144 997 8440 1302 5773 201,(154,1,46)
Hvolc.IDBA 580 3871717 611 988 548 1080 989 162341 4.0455 20132 0.52 438 3500 12787 3473 53121 3411 580 12333 1100 6229 499,(479,9,11)
CISA 72 4041410 76 187 39 140 129 199738 4.9774 119648 2.9606 42 3896 107315 23789 222317 3908 69 127585 107 85026 37,(30,3,4)
minimus2(1,2,3) 65 4087988 73 530 1041 496 153 95469 2.3791 155563 3.8054 253 3685 171050 22079 342018 3977 65 182445 284 29652 213,(205,2,6)
minimus2(1,3,2) 71 4178001 80 551 964 531 170 199565 4.9731 337543 8.0791 258 3680 169924 21429 341963 3956 71 171745 293 29632 216,(205,6,5)
minimus2(2,1,3) 65 4089672 74 455 1061 505 157 97620 2.4327 171785 4.2005 259 3679 171050 21825 342018 3974 65 182445 290 28788 218,(210,2,6)
minimus2(2,3,1) 75 4296848 77 387 483 370 137 250571 6.2441 468713 10.9083 142 3796 146572 27186 312727 3915 75 150030 204 47330 122,(114,4,4)
minimus2(3,1,2) 71 4178049 82 574 927 533 175 198653 4.9504 337599 8.0803 262 3676 169924 21429 342039 3955 71 171745 294 29652 217,(206,6,5)
minimus2(3,2,1) 78 4341081 84 762 493 390 165 244751 6.0991 495987 11.4254 145 3793 137147 24300 312741 3929 78 150030 216 48253 131,(122,4,5)
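
The last column packs four numbers into one field, for example 37,(30,3,4). A minimal sketch for unpacking it is given below, assuming the field always has the form total,(indel,inv,rel) as the header suggests; the function names and dictionary keys are illustrative.

```python
import re

# Matches fields such as "37,(30,3,4)": total,(indel>=5,inv,rel)
ERRORS_RE = re.compile(r"^(\d+),\((\d+),(\d+),(\d+)\)$")


def parse_errors(field):
    """Split the combined Errors field into its four integer parts."""
    match = ERRORS_RE.match(field)
    if match is None:
        raise ValueError(f"unrecognised Errors field: {field!r}")
    total, indel, inv, rel = map(int, match.groups())
    return {"total": total, "indel_ge5": indel, "inv": inv, "rel": rel}


def parse_row(line):
    """Return (assembly name, parsed Errors field) for one table row."""
    fields = line.split()
    return fields[0], parse_errors(fields[-1])


# Two rows copied verbatim from the table above.
ROWS = [
    "CISA 72 4041410 76 187 39 140 129 199738 4.9774 119648 2.9606 42 3896 "
    "107315 23789 222317 3908 69 127585 107 85026 37,(30,3,4)",
    "minimus2(2,3,1) 75 4296848 77 387 483 370 137 250571 6.2441 468713 10.9083 "
    "142 3796 146572 27186 312727 3915 75 150030 204 47330 122,(114,4,4)",
]

for row in ROWS:
    name, errors = parse_row(row)
    # In these rows the total equals the sum of the three sub-counts.
    parts = errors["indel_ge5"] + errors["inv"] + errors["rel"]
    print(name, errors, "sum of parts:", parts)
```

Splitting on whitespace works here because neither the assembly names nor the Errors field contain spaces.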