Dataset1

Revision as of 21 August 2013 02:31 by admin

Data

Short reads

Paired-end Illumina reads of E. coli, with a median insert size of 214 bp, were downloaded from Illumina. The dataset contains more than 28.4 M reads.

java -jar SamToFastq.jar INPUT=Ecoli_MG1655_s_6_1_bfast.bam FASTQ=Ecoli_MG1655_s1.fastq
java -jar SamToFastq.jar INPUT=Ecoli_MG1655_s_6_2_bfast.bam FASTQ=Ecoli_MG1655_s2.fastq
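After the conversion, a quick read count helps confirm that the FASTQ files are complete. The following is a minimal sketch, not part of the original workflow; the function name count_fastq_reads is hypothetical, and it assumes strict 4-line FASTQ records with no wrapped sequence lines.

```shell
# Hypothetical helper (not from the original notes): count records in a FASTQ file.
# Assumes every record occupies exactly four lines (header, sequence, '+', quality).
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}
```

For example, count_fastq_reads Ecoli_MG1655_s1.fastq prints the number of reads in the first mate file, which can be compared against the read count quoted above.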

We combined MiSeq_Ecoli_MG1655_110721_PF_R1.fastq and MiSeq_Ecoli_MG1655_110721_PF_R2.fastq into MiSeq_PE.fastq.
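The notes do not say how the two files were combined. One common choice is to interleave the mates record by record; the sketch below assumes that approach and strict 4-line FASTQ records. The function name interleave_fastq is hypothetical, and a plain concatenation would be an equally plausible reading of the step above.

```shell
# Hypothetical helper (assumption: the files are interleaved, not concatenated).
# Prints one record from the first file, then the matching record from the second.
# Usage: interleave_fastq R1.fastq R2.fastq > combined.fastq
interleave_fastq() {
    awk -v mate="$2" '
    { print }                                   # pass through each line of mate 1
    NR % 4 == 0 {                               # a mate-1 record is complete
        for (i = 0; i < 4; i++)                 # copy the next 4-line record of mate 2
            if ((getline line < mate) > 0) print line
    }' "$1"
}
```

Under that assumption, the combined file would be produced with interleave_fastq MiSeq_Ecoli_MG1655_110721_PF_R1.fastq MiSeq_Ecoli_MG1655_110721_PF_R2.fastq > MiSeq_PE.fastq.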

java convertFastqToFastaAndQual MiSeq_PE.fastq MiSeq_PE.fna MiSeq_PE.qual
convert-fasta-to-v2.pl -mean 214 -stddev 21 -m Mate_info -l Illumia_Ecoli -s Ecoli_MG1655_PE.fna -q Ecoli_MG1655_PE.qual > Ecoli_MG1655_PE.frg
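To make the FASTQ to FASTA/QUAL step concrete, here is a minimal sketch of what such a conversion does. It is an illustrative stand-in, not the implementation of convertFastqToFastaAndQual; the function name fastq_to_fasta_qual is hypothetical, and it assumes 4-line records with Phred+33 quality encoding.

```shell
# Hypothetical stand-in for the conversion step (assumes Phred+33, 4-line records).
# Usage: fastq_to_fasta_qual in.fastq out.fna out.qual
fastq_to_fasta_qual() {
    awk -v fasta="$2" -v qual="$3" '
    BEGIN {
        # Lookup table from printable ASCII character to its code point.
        for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
    }
    NR % 4 == 1 {                               # header: replace leading "@" with ">"
        hdr = ">" substr($0, 2)
        print hdr > fasta
        print hdr > qual
    }
    NR % 4 == 2 { print > fasta }               # sequence line goes to the FASTA file
    NR % 4 == 0 {                               # quality line: decode to numeric scores
        line = ""
        for (i = 1; i <= length($0); i++)
            line = line (i > 1 ? " " : "") (ord[substr($0, i, 1)] - 33)
        print line > qual
    }' "$1"
}
```

The QUAL output holds one space-separated Phred score per base under the same header as the sequence.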

Long reads

One SMRT Cell of 10 kbp continuous long reads (CLR) for Escherichia coli K12 MG1655 was downloaded from this link.


Assemblies with short reads only

ABySS, Edena, SPAdes, SOAPdenovo2, Velvet, CISA (note at: 20130807_MG1655_s1s2_with_verious_Assemblers); MaSuRCA (note at: MaSuRCA assembler)

PacBio corrected reads (PBcR)

pacBioToCA -length 1000 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq Ecoli_MG1655_PE.frg 1>PacBio_Illumia_Ecoli.out 2>error.out
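Before running the correction, it can be useful to see how many long reads meet the length cutoff. The sketch below assumes that -length 1000 sets the minimum PacBio read length considered, and that the input is 4-line FASTQ; the function name count_reads_min_length is hypothetical and not part of pacBioToCA.

```shell
# Hypothetical helper: report how many reads are at least MIN bases long.
# Usage: count_reads_min_length reads.fastq MIN
# Prints two numbers: reads passing the cutoff, then total reads.
count_reads_min_length() {
    awk -v min="$2" '
    NR % 4 == 2 {                               # sequence line of each record
        total++
        if (length($0) >= min) kept++
    }
    END { print kept + 0, total + 0 }' "$1"
}
```

For example, count_reads_min_length PacBio_10kb_CLR.fastq 1000 prints the number of reads of at least 1000 bp followed by the total read count.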

Assemblies with hybrid methods

Assemble corrected long reads

runCA, MIRA3


Hybrid assemble from pre-assembled contigs and long reads

AHA, PBJelly, Cerulean, Patch