Version Differences for SPAdes - hybrid

Revision as of 15 August 2014 02:37

 								We first assembled the [[Data|Dataset 4]] raw data with SPAdes, and then scaffolded the resulting contigs with SSPACE-LongRead, using different depths of subreads from [[Data|Dataset 5]] and [[Pacbio Data|Dataset 9]].
 								We arbitrarily chose one to four SMRT cells:<br>
 								One single SMRT cell: m120208_071634<br>
 								Two SMRT cells: m120228_210845 + m120208_122534<br>
 								Three SMRT cells: m120228_115504 + m120228_152936 + m120228_100807<br>
 								Four SMRT cells: m120228_171636 + m120228_223624 + m120228_100807 + m120228_190630 <br>
 								 spades.py -1 reads_1.fastq -2 reads_2.fastq -o output
 								 SSPACE-LongRead.pl -c contig.fasta -p filter_subreads.fasta -b output
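The scaffolding step is repeated once per group of SMRT cells, each time with the subreads of that group pooled into one file. A minimal sketch of that bookkeeping follows. It assumes one FASTA file of filtered subreads per cell, named <code>&lt;cell_id&gt;.fasta</code>; that file layout, the <code>scaffold_</code> output prefix, and the helper names are hypothetical and not taken from the runs described above. Only the cell IDs and the <code>-c</code>, <code>-p</code>, <code>-b</code> flags come from this page.

```python
from pathlib import Path

# SMRT cell groups as listed above.
CELL_SETS = {
    "1cell": ["m120208_071634"],
    "2cell": ["m120228_210845", "m120208_122534"],
    "3cell": ["m120228_115504", "m120228_152936", "m120228_100807"],
    "4cell": ["m120228_171636", "m120228_223624",
              "m120228_100807", "m120228_190630"],
}


def pool_subreads(cells, subreads_dir, pooled_path):
    """Concatenate per-cell FASTA files (assumed name: <cell_id>.fasta)."""
    with open(pooled_path, "w") as out:
        for cell in cells:
            text = (Path(subreads_dir) / f"{cell}.fasta").read_text()
            # Make sure each file ends with a newline so records do not merge.
            if text and not text.endswith("\n"):
                text += "\n"
            out.write(text)
    return pooled_path


def sspace_command(contigs, pooled_subreads, label):
    """Build (but do not run) the SSPACE-LongRead call for one cell group."""
    return [
        "SSPACE-LongRead.pl",
        "-c", str(contigs),
        "-p", str(pooled_subreads),
        "-b", f"scaffold_{label}",  # hypothetical output name
    ]
```

Each returned list could be passed to <code>subprocess.run</code> once the pooled file exists; giving every group its own <code>-b</code> name keeps the runs from overwriting each other.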
 								= Evaluation =
 								We evaluated the assemblies with [http://bioinf.spbau.ru/en/quast QUAST 2.3] (reference genome [ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Escherichia_coli_K_12_substr__MG1655_uid57779/ NC_000913] and [[Media: Ec_gene_result.ncbi |  Ec_gene_list]]). [[more detail]]
 								{| {{table}} border="1"
 								| align="center" style="background:#f0f0f0;"|'''Statistics without reference'''
 								| align="center" style="background:#f0f0f0;"|'''Miseq_only'''
 								| align="center" style="background:#f0f0f0;"|'''Miseq_1cell'''
 								| align="center" style="background:#f0f0f0;"|'''Miseq_2cell'''
 								| align="center" style="background:#f0f0f0;"|'''Miseq_3cell'''
 								| align="center" style="background:#f0f0f0;"|'''Miseq_4cell'''
 								| align="center" style="background:#f0f0f0;"|'''Miseq_17cell'''
 								| align="center" style="background:#f0f0f0;"|'''Miseq_d9'''
 								|-
 								| # contigs||86||15||18||16||15||17||14
 								|-
 								| Largest contig||285889||2497845||1260980||2501081||3194637||1954649||3392211
 								|-
 								| Total length||4577132||4632009||4633058||4636174||4638657||4633857||4632677
 								|-
 								| N50||139882||2497845||1238868||2501081||3194637||1238635||3392211
 								|-
 								| style="background:#f0f0f0;"| Misassemblies||||||||||||||
 								|-
 								| # misassemblies||2||9||10||10||10||8||8
 								|-
 								| Misassembled contigs length||215581||3193893||3244631||2705788||3657574||3243566||4050696
 								|-
 								| style="background:#f0f0f0;"| Mismatches||||||||||||||
 								|-
 								| # mismatches per 100 kbp||3.02||7.15||6.2||6.9||7.17||6.05||6.43
 								|-
 								| # indels per 100 kbp||0.46||1.06||1||1.32||1.27||0.95||1.06
 								|-
 								| # N's per 100 kbp||0||97.89||67.67||77.33||77.37||91.03||123.88
 								|-
 								| style="background:#f0f0f0;"| Genome statistics||||||||||||||
 								|-
 								| Genome fraction (%)||98.451||99.498||99.483||99.664||99.748||99.432||99.587
 								|-
 								| Duplication ratio||1.001||1.002||1.002||1.002||1.002||1.003||1.001
 								|-
 								| # genes||4399 + 32 part||4467 + 14 part||4465 + 13 part||4476 + 11 part||4477 + 11 part||4467 + 11 part||4470 + 13 part
 								|-
 								| NGA50||133059||571664||425173||852639||1039467||1039472||1039654
 								|-
 								|}
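
As a reading aid for the N50 row, the sketch below computes N50 from a list of contig lengths using the usual definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. It is an illustration only and not QUAST's implementation; QUAST may exclude short contigs before computing its statistics, and the function name and toy lengths are made up for the example.

```python
def n50(lengths):
    """Return the N50 of the given contig lengths (0 for an empty input)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half the total
        # length defines N50.
        if running >= half_total:
            return length
    return 0


# Toy lengths, not taken from the assemblies in the table.
print(n50([2, 3, 4, 5, 6]))  # total 20, half 10: 6 + 5 = 11, so N50 is 5
```

Under this definition, N50 equals the largest contig whenever that contig alone covers at least half of the total length, which is the pattern seen in the Miseq_1cell, Miseq_3cell, Miseq_4cell and Miseq_d9 columns.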