The sequence data for E. coli K12 MG1655 are available at http://www.cbcb.umd.edu/software/PBcR/closure/index.html. The reference paper, "Reducing assembly complexity of microbial genomes with single-molecule sequencing", is posted at http://arxiv.org/abs/1304.3752.
Paired-end reads are available as Illumina MiSeq data (Mate1, Mate2).
Read length: 151 bp
Number of reads: 5,729,470 × 2
Insert size: ~300 bp
The data in frg format were downloaded from Miseq100X.
We trimmed the sequence reads to an error probability below 0.05. A read pair was discarded if either of its reads was shorter than 150 bp after trimming.
This left 1,839,935 high-quality paired-end reads (~118X coverage) for further analysis.
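The filtering rule and the coverage figure above can be illustrated with a minimal Python sketch. It is not the tool used for the actual trimming, which is not named here. The function names are hypothetical, and the genome length (4,641,652 bp, GenBank U00096.3) is an assumption that is not stated in the text.

```python
import math

# Assumption: genome length of E. coli K-12 MG1655 (GenBank U00096.3).
# This value is not given in the text above.
GENOME_SIZE_BP = 4_641_652


def phred_from_error_prob(p):
    """Convert an error probability to a Phred quality score, Q = -10 * log10(p)."""
    return -10.0 * math.log10(p)


def keep_pair(mate1, mate2, min_len=150):
    """Keep a read pair only if both trimmed mates are at least min_len bases long."""
    return len(mate1) >= min_len and len(mate2) >= min_len


def coverage(n_pairs, read_len, genome_size=GENOME_SIZE_BP):
    """Estimate sequencing depth as total sequenced bases divided by genome size."""
    return n_pairs * 2 * read_len / genome_size


if __name__ == "__main__":
    # An error probability of 0.05 corresponds to a Phred score of about 13.
    print(f"Phred cutoff for p = 0.05: {phred_from_error_prob(0.05):.1f}")
    # Lower-bound estimate: every retained read is at least 150 bp long.
    print(f"Coverage at 150 bp per read: {coverage(1_839_935, 150):.1f}X")
```

With 150 bp as a lower bound on read length, this estimate comes out just under 119X, close to the ~118X quoted above. The exact figure depends on the genome length used and on the trimmed read lengths.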
Although the PacBio sequence reads are available at SRA, we could not handle the adapters correctly using fastq-dump. We therefore requested the h5 files from the NCBI help desk.