Version Differences for Home

(Completing microbial genome assemblies: strategy and performance comparisons)
Line 1:
- = Completing microbial genome assemblies: strategy and performance comparisons =
+ = Completing bacterial genome assemblies: strategy and performance comparisons =
       
  <font size=3>    <font size=3> 
- Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have generated rapid and cost-effective genome assemblies, such assemblies are often unfinished, fragmented draft genomes as a result of the short read lengths and long repeats present in multiple copies. Several methods, such as ''' ALLPATH-LG, hybrid and non-hybrid approaches''', have been proposed to utilize the third-generation sequencing long reads that can span many thousands of bases to facilitate the assembly of complete microbial genomes. However, standardized procedures that aim at evaluating and comparing these approaches are currently insufficient in the context of assembly large libraries of genomic data.
+ Determining the genomic sequences of microorganisms is a prerequisite for understanding their biology and for their functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have made genome assembly rapid and cost-effective, such assemblies are often unfinished, fragmented draft genomes as a result of short read lengths and long repeats present in multiple copies. Third-generation PacBio sequencing technologies address this problem by greatly increasing read length. Hybrid and non-hybrid approaches have therefore been proposed to utilize PacBio long reads, which can span many thousands of bases, to facilitate the assembly of complete microbial genomes. However, standardized procedures for evaluating and comparing these approaches are currently insufficient.
  </font>    </font> 
       
Line 8:
- In this article, we provide a comprehensive review of the above-motioned methods and collect datasets for the comparative assessment of the non-hybrid approaches—hierarchical genome-assembly process (HGAP) and self-correction approach (SCA). In addition to offering explicit and useful recommendations to practitioners, the review aims to aid in the design of a paradigm positioned to complete microbial genome assembly. Following a special methodology proposed by ALLPATHS-LG, the algorithm is supplied with three pre-prepared libraries—fragment, jump and long reads. ALLPATHS-LG is subsequently able to complete microbial genomes as the sequencing coverage is controlled at 100X. Although the hybrid approach could improve continuity over the assembly produced by second-generation sequencing reads, we remained unsuccessful in the completion of a complete genome using this approach. Both non-hybrid approaches—HGAP and SCA—are able to produce complete genomes provided that the third generation sequencing reads are adequately long and complete.
+ We therefore provide a comprehensive comparison, collecting datasets for a comparative assessment of these approaches. In addition to offering explicit and useful recommendations to practitioners, this study aims to inform the design of a paradigm for completing bacterial genome assemblies.
       
+ Following the methodology proposed for ALLPATHS-LG, the algorithm is supplied with three pre-prepared libraries: fragment, jump and long reads. ALLPATHS-LG is then able to complete bacterial genomes when the sequencing coverage of the fragment library is controlled at 100X. Although other hybrid approaches (including the PacBio corrected reads pipeline, SPAdes and SSPACE-LongRead) could greatly improve continuity over the assembly produced from second-generation sequencing reads, we have demonstrated that such hybrid approaches are not an efficient way to complete bacterial genomes. Both non-hybrid approaches, the hierarchical genome-assembly process and the PacBio corrected reads pipeline via self-correction, are able to produce complete genomes provided that the third-generation sequencing reads are adequately long and complete.
  </font>    </font> 
  = Datasets employed in this study =