We re-ran correction and assembly using the data provided in the PBcR closure project.
We corrected the long-read sequence data (200X) with Illumina short reads (100X) twice: once without specifying the genome size and once with genomeSize=4650000.
pacBioToCA -l viaMiseq -s pacbio.spec -t 10 -partitions 200 fastqFile=filtered_subreads.200X.fastq.bz2 miseq.100X.frg.bz2
pacBioToCA -l viaMiseq -s pacbio.spec -t 10 -partitions 200 fastqFile=filtered_subreads.200X.fastq.bz2 genomeSize=4650000 miseq.100X.frg.bz2
Metric | 200X filtered long reads | Without genomeSize | genomeSize=4650000 |
Number of sequences | 383482 | 332880 | 37879 |
Average length (bp) | 2422.877720 | 2260.68262 | 4927.683492 |
Total bases (Mb) | 929.13 | 752.54 | 186.66 |
Depth (X) | 199.81 | 161.84 | 40.14 |
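
The total and depth figures in the table follow from the sequence counts and average lengths. The sketch below reproduces that arithmetic. It assumes that depth is total bases divided by the genome size, and that the 4,650,000 bp value from the genomeSize argument applies to all three columns. The function name `summarize` and the dictionary labels are illustrative only and are not part of pacBioToCA.

```python
# Assumption: depth = total bases / genome size, using the value passed
# as genomeSize=4650000 for every column of the table.
GENOME_SIZE = 4_650_000


def summarize(num_seqs, avg_len, genome_size=GENOME_SIZE):
    """Return (total in Mb, depth in X), both rounded to 2 decimals."""
    total_bases = num_seqs * avg_len
    return round(total_bases / 1e6, 2), round(total_bases / genome_size, 2)


# (number of sequences, average length) copied from the table.
rows = {
    "200X filtered long reads": (383482, 2422.877720),
    "Without genomeSize": (332880, 2260.68262),
    "genomeSize=4650000": (37879, 4927.683492),
}

for label, (num_seqs, avg_len) in rows.items():
    total_mb, depth = summarize(num_seqs, avg_len)
    print(f"{label}: total {total_mb} Mb, depth {depth}X")
```

Under these assumptions the computed values should agree with the total and depth rows of the table, which suggests the same genome size was used to report depth in all three columns.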