Dataset2

Revision as of 21 August 2013 20:39 by admin
Data

Short reads

Paired-end reads are available from Illumina MiSeq (Mate1, Mate2).

Read length: 151 bp

Number of reads: 5,729,470 × 2

Insert size: ~300 bp

We combined MiSeq_Ecoli_MG1655_110721_PF_R1.fastq and MiSeq_Ecoli_MG1655_110721_PF_R2.fastq into MiSeq_PE.fastq.
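The page does not say how the two files were combined. A minimal sketch, assuming the mates were interleaved (each R1 record followed by its R2 mate) and that every FASTQ record takes exactly four lines; the function names `read_fastq` and `interleave` are hypothetical:

```python
import io
from itertools import islice, zip_longest


def read_fastq(handle):
    """Yield FASTQ records as lists of four newline-terminated lines."""
    while True:
        record = [line.rstrip("\r\n") + "\n" for line in islice(handle, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(r1_handle, r2_handle, out_handle):
    """Write mate 1 and mate 2 records alternately; return the number of pairs."""
    pairs = 0
    for rec1, rec2 in zip_longest(read_fastq(r1_handle), read_fastq(r2_handle)):
        if rec1 is None or rec2 is None:
            raise ValueError("R1 and R2 contain different numbers of reads")
        out_handle.writelines(rec1)
        out_handle.writelines(rec2)
        pairs += 1
    return pairs


# Tiny in-memory example; real use would pass open file handles instead.
r1 = io.StringIO("@read1/1\nACGT\n+\nIIII\n")
r2 = io.StringIO("@read1/2\nTTGA\n+\nIIII\n")
combined = io.StringIO()
n_pairs = interleave(r1, r2, combined)
```

For the real data, the handles would come from `open()` on the R1 and R2 files, with MiSeq_PE.fastq opened for writing. If the files were simply concatenated instead, the mate file passed to convert-fasta-to-v2.pl would be what carries the pairing.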

java convertFastqToFastaAndQual MiSeq_PE.fastq MiSeq_PE.fna MiSeq_PE.qual
convert-fasta-to-v2.pl -mean 297 -stddev 35 -m Mate_info -l Illumia_Ecoli -s MiSeq_PE.fna -q MiSeq_PE.qual > MiSeq_PE.frg

The data in frg format were downloaded from Miseq100X.frg

We trimmed the reads so that the error probability is below 0.05. A read pair was discarded if either read was shorter than 150 bp after trimming.
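The trimming tool and its exact rule are not named here, so the following is only a sketch of one possible rule. It assumes Phred+33 quality encoding and cuts each read at the first base whose error probability reaches 0.05, then applies the 150 bp pair filter. The helper names are hypothetical.

```python
def error_probability(quality_char, offset=33):
    """Convert one quality character to an error probability (assumes Phred+33)."""
    q = ord(quality_char) - offset
    return 10 ** (-q / 10)


def trim_read(seq, qual, max_error=0.05):
    """Cut the read at the first base whose error probability is >= max_error.

    Assumption: the original trimming rule may differ (e.g. a windowed or
    cumulative criterion); this is the simplest per-base variant.
    """
    for i, ch in enumerate(qual):
        if error_probability(ch) >= max_error:
            return seq[:i], qual[:i]
    return seq, qual


def keep_pair(trimmed1, trimmed2, min_length=150):
    """Keep a pair only if both trimmed reads are at least min_length bases long."""
    (seq1, _), (seq2, _) = trimmed1, trimmed2
    return len(seq1) >= min_length and len(seq2) >= min_length


# Example: a clean 151 bp read paired with a mate that degrades at base 100.
good = trim_read("A" * 151, "I" * 151)
bad = trim_read("C" * 151, "I" * 99 + "#" + "I" * 51)
pair_is_kept = keep_pair(good, bad)
```

With a 0.05 threshold, bases of quality Q13 or lower are cut and Q14 or higher are kept, since Q13 corresponds to an error probability of about 0.0501.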

This left 1,839,935 high-quality read pairs (~118X coverage, tMiSeq_PE.frg) for further analysis.
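The coverage figure can be checked with a short calculation. This sketch assumes a genome length of about 4.64 Mb for E. coli K-12 MG1655 (the constant below is not taken from this page) and treats every retained read as 150 bp long:

```python
# Assumption: approximate length of the E. coli K-12 MG1655 reference genome.
ECOLI_MG1655_GENOME_BP = 4_641_652


def coverage(n_pairs, read_length, genome_size=ECOLI_MG1655_GENOME_BP):
    """Mean depth of coverage: total sequenced bases divided by genome length."""
    return 2 * n_pairs * read_length / genome_size


raw_depth = coverage(5_729_470, 151)      # all reads before trimming
trimmed_depth = coverage(1_839_935, 150)  # pairs kept after trimming

print(f"raw: {raw_depth:.1f}X, trimmed: {trimmed_depth:.1f}X")
```

Under these assumptions the trimmed set comes out at roughly 119X, in line with the ~118X quoted above.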

Long reads

17 SMRT Cells for E. coli MG1655 were downloaded (details in Data and Read Depths) and filtered with smrtpipe.

Assemblies with short reads only

ABySS, Edena, SPAdes, SOAPdenovo2, Velvet, CISA (notes at: 20130718_MiSeq_with_verious_Assemblers), MaSuRCA (notes at: MaSuRCA for MiSeq Data)

Assemblies with hybrid methods

Assembly of corrected long reads

runCA (details in Read Depths, pacBioToCA and runCA), MIRA3

Hybrid assembly from pre-assembled contigs and long reads

AHA, PBJelly, Cerulean, Patch