Paired-end reads are available from Illumina MiSeq (Mate1, Mate2).
Read length: 151 bp
Number of reads: 5,729,470 x 2
Insert size: ~300 bp
We combined MiSeq_Ecoli_MG1655_110721_PF_R1.fastq and MiSeq_Ecoli_MG1655_110721_PF_R2.fastq into MiSeq_PE.fastq.
java convertFastqToFastaAndQual MiSeq_PE.fastq MiSeq_PE.fna MiSeq_PE.qual
convert-fasta-to-v2.pl -mean 297 -stddev 35 -m Mate_info -l Illumia_Ecoli -s MiSeq_PE.fna -q MiSeq_PE.qual > MiSeq_PE.frg
The data in frg format were downloaded from Miseq100X.frg
We trimmed the sequence reads to an error probability below 0.05. A read pair was discarded if either of its reads was shorter than 150 bp after trimming.
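The exact trimming tool and algorithm are not recorded here, so the following is only a minimal sketch of one possible reading of the rule: keep the longest stretch of each read in which every base has an error probability below 0.05, then drop the pair if either stretch is shorter than 150 bp. The function names, the Phred+33 quality encoding, and the "longest stretch" strategy are all assumptions made for illustration.

```python
from typing import Optional, Tuple

MAX_ERROR_PROB = 0.05  # threshold stated in the text
MIN_LENGTH = 150       # minimum read length stated in the text
PHRED_OFFSET = 33      # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding

Read = Tuple[str, str]  # (sequence, quality string)


def error_prob(qual_char: str, offset: int = PHRED_OFFSET) -> float:
    """Convert one FASTQ quality character to an error probability."""
    q = ord(qual_char) - offset
    return 10 ** (-q / 10)


def best_segment(qual: str, max_p: float = MAX_ERROR_PROB) -> Tuple[int, int]:
    """Return (start, end) of the longest run in which every base has p < max_p."""
    best = (0, 0)
    start = 0
    for i, ch in enumerate(qual):
        if error_prob(ch) >= max_p:
            start = i + 1  # a low-quality base ends the current run
        elif i + 1 - start > best[1] - best[0]:
            best = (start, i + 1)
    return best


def trim_pair(r1: Read, r2: Read, min_len: int = MIN_LENGTH) -> Optional[Tuple[Read, Read]]:
    """Trim both mates; return None if either mate ends up shorter than min_len."""
    trimmed = []
    for seq, qual in (r1, r2):
        s, e = best_segment(qual)
        if e - s < min_len:
            return None
        trimmed.append((seq[s:e], qual[s:e]))
    return trimmed[0], trimmed[1]
```

With this threshold a base passes only if its Phred score is at least 14, since Q13 corresponds to an error probability of about 0.0501.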
After trimming, 1,839,935 high-quality paired-end reads (~118X, tMiSeq_PE.frg) remained for further analysis.
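As a sanity check on the coverage figure, the arithmetic can be sketched as below. The genome size is an assumption (the 4,641,652 bp E. coli K-12 MG1655 reference, GenBank U00096.3), and the trimmed reads are assumed to be 150 bp each, the minimum length kept after trimming.

```python
GENOME_SIZE_BP = 4_641_652  # assumption: E. coli K-12 MG1655 reference, U00096.3


def coverage(n_pairs: int, read_len: int, genome_size: int = GENOME_SIZE_BP) -> float:
    """Fold coverage for n_pairs read pairs with two reads of read_len each."""
    return 2 * n_pairs * read_len / genome_size


raw = coverage(5_729_470, 151)      # all MiSeq reads before trimming
trimmed = coverage(1_839_935, 150)  # pairs kept after trimming, 150 bp assumed

print(f"raw: {raw:.1f}X, trimmed: {trimmed:.1f}X")
```

Under these assumptions the trimmed set comes out just under 119X, in line with the ~118X quoted above; the small difference depends on the genome size and read length used.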
17 SMRT Cells for E. coli MG1655 were downloaded (details in Data).
ABySS, Edena, SPAdes, SOAPdenovo2, Velvet, CISA (note at: 20130807_MG1655_s1s2_with_verious_Assemblers)
MaSuRCA (note at: MaSuRCA assembler)
To correct the PacBio CLR with raw short reads:
pacBioToCA -length 1000 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq Ecoli_MG1655_PE.frg 1>PacBio_Illumia_Ecoli.out 2>error.out
To correct the PacBio CLR with 100X high-quality reads (p < 0.05, paired-end read length >= 100 bp):
pacBioToCA -length 500 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq 100X_Ecoli_PE.frg
runCA (20130807_MG1655_s1s2_with_verious_Assemblers for 100X, 20120628_20120628_PacBio_With_CA(Celera Assembler)_Wgs), MIRA3
AHA, PBJelly, Cerulean, Patch