Paired-end reads are available at Illumina MiSeq (Mate1, Mate2)
Read length: 151bp
Number of reads: 5,729,470 × 2
Insert size ~ 300bp
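As a rough sanity check, the read statistics above can be turned into an expected sequencing depth. The sketch below assumes a genome size of 4,641,652 bp for E. coli K-12 MG1655 (the length of GenBank record U00096.3); that figure and the function name `sequencing_depth` are not part of these notes.

```python
# Assumption: reference length of E. coli K-12 MG1655 (GenBank U00096.3).
# This value is not stated in the notes above.
GENOME_SIZE_BP = 4_641_652


def sequencing_depth(read_pairs: int, read_length_bp: int,
                     genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Return the average depth: total sequenced bases divided by genome size."""
    total_bases = read_pairs * 2 * read_length_bp  # two mates per pair
    return total_bases / genome_size_bp


# Raw MiSeq data: 5,729,470 pairs of 151 bp reads.
raw_depth = sequencing_depth(5_729_470, 151)
print(f"raw depth: {raw_depth:.1f}X")
```

Under that assumption the raw data correspond to a depth of roughly 370X, well above what most assemblers need.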
We combined MiSeq_Ecoli_MG1655_110721_PF_R1.fastq and MiSeq_Ecoli_MG1655_110721_PF_R2.fastq into MiSeq_PE.fastq.
java convertFastqToFastaAndQual MiSeq_PE.fastq MiSeq_PE.fna MiSeq_PE.qual
convert-fasta-to-v2.pl -mean 297 -stddev 35 -m Mate_info -l Illumia_Ecoli -s MiSeq_PE.fna -q MiSeq_PE.qual > MiSeq_PE.frg
The data in frg format were downloaded from Miseq100X.frg
We trimmed the reads so that the error probability is below 0.05. A read pair was discarded if either read was shorter than 150bp after trimming.
This yielded 1,839,935 high-quality read pairs (~118X, tMiSeq_PE.frg) for further analysis.
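The notes do not say which tool or trimming rule was used, so the following is only a minimal sketch of one plausible reading: trailing bases whose per-base error probability is 0.05 or higher are removed, and a pair is dropped if either mate ends up shorter than 150 bp. It assumes Phred+33 quality encoding; the function names and the 3'-end-only trimming rule are illustrative assumptions, not the procedure actually used.

```python
from typing import List, Optional, Tuple

MAX_ERROR_PROB = 0.05  # threshold quoted in the notes
MIN_LENGTH_BP = 150    # minimum read length quoted in the notes
PHRED_OFFSET = 33      # assumption: Phred+33 encoded FASTQ

Read = Tuple[str, str]  # (sequence, quality string)


def error_probabilities(quality: str, offset: int = PHRED_OFFSET) -> List[float]:
    """Convert a FASTQ quality string to per-base error probabilities.

    Phred score Q and error probability p are related by p = 10 ** (-Q / 10).
    """
    return [10 ** (-(ord(ch) - offset) / 10) for ch in quality]


def trim_3prime(seq: str, quality: str) -> Read:
    """Drop trailing bases whose error probability is >= MAX_ERROR_PROB.

    Assumption: simple 3'-end trimming; the real pipeline may differ.
    """
    probs = error_probabilities(quality)
    end = len(probs)
    while end > 0 and probs[end - 1] >= MAX_ERROR_PROB:
        end -= 1
    return seq[:end], quality[:end]


def filter_pair(mate1: Read, mate2: Read) -> Optional[Tuple[Read, Read]]:
    """Trim both mates; return None if either is shorter than MIN_LENGTH_BP."""
    t1 = trim_3prime(*mate1)
    t2 = trim_3prime(*mate2)
    if len(t1[0]) < MIN_LENGTH_BP or len(t2[0]) < MIN_LENGTH_BP:
        return None
    return t1, t2


# Toy reads: 'I' encodes Q40 (p = 0.0001), '#' encodes Q2 (p is about 0.63).
good = ("A" * 151, "I" * 151)
bad = ("A" * 151, "I" * 140 + "#" * 11)  # trims down to 140 bp

kept = filter_pair(good, good)
dropped = filter_pair(good, bad)
```

Here `kept` holds both mates unchanged, while `dropped` is `None` because the second mate falls below 150 bp once its low-quality tail is removed. An error probability of 0.05 corresponds to a Phred score of about 13.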
17 SMRT Cells for E. coli MG1655 were downloaded (details in Data and Read Depths) and filtered with smrtpipe.
ABySS, Edena, SPAdes, SOAPdenovo2, Velvet, CISA (note at: 20130718_MiSeq_with_verious_Assemblers), MaSuRCA (note at: MaSuRCA for MiSeq Data)
To correct the PacBio CLR with raw short reads:
pacBioToCA -length 1000 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq Ecoli_MG1655_PE.frg 1>PacBio_Illumia_Ecoli.out 2>error.out
To correct the PacBio CLR with 100X high-quality reads (p < 0.05, length of paired-end read >= 100bp):
pacBioToCA -length 500 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq 100X_Ecoli_PE.frg
runCA (20130807_MG1655_s1s2_with_verious_Assemblers for 100X, 20120628_20120628_PacBio_With_CA(Celera Assembler)_Wgs), MIRA3
AHA, PBJelly, Cerulean, Patch