Dataset1

Revision as of 21 August 2013 02:45 by admin

Contents
1 Data
1.1 Short reads
1.2 Long reads
2 Assemblies with short reads only
3 PacBio corrected reads (PBcR)
4 Assemblies with hybrid methods
4.1 Assemble corrected long reads
4.2 Hybrid assemble from pre-assembled contigs and long reads

Data

Short reads

Paired-end Illumina reads of E. coli were downloaded from Illumina. The library has a median insert size of 214 bp and contains more than 28.4 M reads.

We converted the BAM files to FASTQ, the required format, using Picard tools (ref).

java -jar SamToFastq.jar INPUT=Ecoli_MG1655_s_6_1_bfast.bam FASTQ=Ecoli_MG1655_s1.fastq
java -jar SamToFastq.jar INPUT=Ecoli_MG1655_s_6_2_bfast.bam FASTQ=Ecoli_MG1655_s2.fastq

We combined MiSeq_Ecoli_MG1655_110721_PF_R1.fastq and MiSeq_Ecoli_MG1655_110721_PF_R2.fastq into a single file, MiSeq_PE.fastq.
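The page does not record how the two files were combined. The following is a minimal Python sketch that assumes the reads were interleaved (each R1 record followed directly by its R2 mate) rather than simply concatenated. The function names are hypothetical and are not part of any tool used on this page.

```python
from itertools import islice, zip_longest


def fastq_records(lines):
    """Yield FASTQ records as lists of four lines (header, sequence, '+', qualities)."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave_fastq(r1_lines, r2_lines):
    """Yield lines that alternate one R1 record with its R2 mate.

    Assumption: both inputs list the mates in the same order.
    """
    pairs = zip_longest(fastq_records(r1_lines), fastq_records(r2_lines))
    for rec1, rec2 in pairs:
        if rec1 is None or rec2 is None:
            raise ValueError("R1 and R2 contain different numbers of reads")
        yield from rec1
        yield from rec2


def combine_files(r1_path, r2_path, out_path):
    """Write the interleaved reads of two FASTQ files to out_path."""
    with open(r1_path) as r1, open(r2_path) as r2, open(out_path, "w") as out:
        out.writelines(interleave_fastq(r1, r2))
```

Under that assumption, calling combine_files("MiSeq_Ecoli_MG1655_110721_PF_R1.fastq", "MiSeq_Ecoli_MG1655_110721_PF_R2.fastq", "MiSeq_PE.fastq") would produce the combined file. If the downstream converter expects the mates in a different layout, the pairing loop is the only part that would need to change.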

java convertFastqToFastaAndQual MiSeq_PE.fastq MiSeq_PE.fna MiSeq_PE.qual
convert-fasta-to-v2.pl -mean 214 -stddev 21 -m Mate_info -l Illumia_Ecoli -s Ecoli_MG1655_PE.fna -q Ecoli_MG1655_PE.qual > Ecoli_MG1655_PE.frg
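To illustrate what the FASTQ to FASTA/QUAL conversion step produces, here is a minimal Python sketch. It is not the convertFastqToFastaAndQual program itself. It assumes four-line FASTQ records and Phred+33 quality encoding. Older Illumina pipelines used an offset of 64, so the offset should be checked against the actual data. The function name is hypothetical.

```python
from itertools import islice


def fastq_to_fasta_qual(fastq_lines, offset=33):
    """Yield (fasta_entry, qual_entry) strings for each FASTQ record.

    Assumptions: four lines per record, qualities encoded as Phred+offset.
    """
    # Drop blank lines so a trailing newline does not look like a record.
    it = (line.rstrip("\n") for line in fastq_lines if line.strip())
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4 or not record[0].startswith("@"):
            raise ValueError("malformed FASTQ record")
        header, seq, _plus, quals = record
        if len(seq) != len(quals):
            raise ValueError("sequence and quality lengths differ")
        name = header[1:]
        # QUAL files hold space-separated integer scores under the same header.
        scores = " ".join(str(ord(ch) - offset) for ch in quals)
        yield f">{name}\n{seq}\n", f">{name}\n{scores}\n"
```

Each yielded pair can be written to the .fna and .qual files respectively, which keeps the two outputs in the same read order.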

Long reads

One SMRT Cell of 10 kbp continuous long reads (CLR) for Escherichia coli K12 MG1655 was downloaded from this link.

The file PacBio_10kb_CLR.fastq contains ~21X coverage of E. coli CLR reads from a 10 kb library, filtered using the standard PacBio thresholds (minimum RQ=0.75, RL=50bp) (http://files.pacb.com/datasets/secondary-analysis/e-coli-k12-de-novo/1.3.0/README.txt ref).
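The ~21X figure can be checked directly from the FASTQ file. The sketch below sums the read lengths and divides by the genome size. The genome size of 4,641,652 bp is an assumption (roughly the published MG1655 reference length), not a value taken from this page, and the function name is hypothetical. With a genome of about 4.64 Mbp, 21X corresponds to roughly 97 Mbp of sequence.

```python
# Assumed MG1655 reference length in bp; adjust to the reference actually used.
ASSUMED_GENOME_SIZE = 4_641_652


def fold_coverage(fastq_lines, genome_size=ASSUMED_GENOME_SIZE):
    """Return total bases divided by genome size for four-line FASTQ input."""
    total_bases = 0
    for index, line in enumerate(fastq_lines):
        # In a four-line record the sequence is the second line.
        if index % 4 == 1:
            total_bases += len(line.strip())
    return total_bases / genome_size
```

For a real file this would be called as fold_coverage(open("PacBio_10kb_CLR.fastq")). The function assumes sequences are not wrapped across multiple lines.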

Assemblies with short reads only

ABySS, Edena, SPAdes, SOAPdenovo2, Velvet, CISA (note at: 20130807_MG1655_s1s2_with_verious_Assemblers)

MaSuRCA (note at: MaSuRCA assembler)

PacBio corrected reads (PBcR)

pacBioToCA -length 1000 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq Ecoli_MG1655_PE.frg 1>PacBio_Illumia_Ecoli.out 2>error.out
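Before running the correction, it can help to see how much of the long-read data a length cutoff would retain. The sketch below assumes that -length 1000 acts as a minimum sequence length in bp. That interpretation should be confirmed against the pacBioToCA documentation. The function name is hypothetical and the code does not call pacBioToCA.

```python
def length_summary(fastq_lines, min_length=1000):
    """Return (total_reads, kept_reads, kept_bases) for four-line FASTQ input.

    A read is counted as kept when its length is at least min_length.
    """
    total_reads = kept_reads = kept_bases = 0
    for index, line in enumerate(fastq_lines):
        # In a four-line record the sequence is the second line.
        if index % 4 != 1:
            continue
        total_reads += 1
        read_length = len(line.strip())
        if read_length >= min_length:
            kept_reads += 1
            kept_bases += read_length
    return total_reads, kept_reads, kept_bases
```

Called on PacBio_10kb_CLR.fastq, the third value divided by the genome size gives the coverage that would remain above the cutoff.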

Assemblies with hybrid methods

Assemble corrected long reads

Hybrid assemble from pre-assembled contigs and long reads