When running pacBioToCA, we found that the amount of PBcR (PacBio corrected reads) produced depended on whether the genomeSize parameter was set.
Short reads: 118X; long reads: reads from one to four SMRT cells; correction run with and without genomeSize.
| Name | Metric | 200X filtered long reads | m120228_192221 | m120228_210845 | Two SMRT cells | Three SMRT cells | Four SMRT cells |
|---|---|---|---|---|---|---|---|
| Filtered_subreads | Number of sequences | 383482 | 38542 | 44794 | 77117 | 113284 | 136333 |
| Filtered_subreads | Average length (bp) | 2422.877720 | 2322.679985 | 2334.414140 | 2184.208709 | 2333.977711 | 2386.664674 |
| Filtered_subreads | Total (Mb) | 929.13 | 89.52 | 104.57 | 168.44 | 264.40 | 325.38 |
| Filtered_subreads | Depth (X) | 199.81 | 19.25 | 22.49 | 36.22 | 56.86 | 69.97 |
| Without genomeSize | Number of sequences | 332880 | 35199 | 40811 | 64201 | 99285 | 120296 |
| Without genomeSize | Average length (bp) | 2260.68262 | 2095.143186 | 2086.568670 | 2150.165184 | 2221.782394 | 2252.656963 |
| Without genomeSize | Total (Mb) | 752.54 | 73.75 | 85.15 | 138.04 | 220.59 | 270.99 |
| Without genomeSize | Depth (X) | 161.84 | 15.86 | 18.31 | 29.69 | 47.44 | 58.28 |
| genomeSize=4650000 | Number of sequences | 37879 | 34852 | 40486 | 63411 | 70468 | 56298 |
| genomeSize=4650000 | Average length (bp) | 4927.683492 | 2130.841559 | 2120.237712 | 2198.455315 | 2815.903020 | 3495.604515 |
| genomeSize=4650000 | Total (Mb) | 186.66 | 74.26 | 85.84 | 139.41 | 198.43 | 196.80 |
| genomeSize=4650000 | Depth (X) | 40.14 | 15.97 | 18.46 | 29.98 | 42.67 | 42.32 |
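
The total and depth figures in the table appear to follow from the sequence count, the average length, and a genome size of 4650000 bp. The following is a minimal sketch of that arithmetic, not the script actually used to build the table; the function names are made up for illustration.

```python
# Sketch: reproduce the "total" and "depth" rows from count and average length.
# GENOME_SIZE_BP is the genomeSize value quoted in this document; the helper
# names below are hypothetical.
GENOME_SIZE_BP = 4_650_000


def total_mb(num_seqs, avg_len_bp):
    """Total bases in megabases (1 Mb = 1e6 bp)."""
    return num_seqs * avg_len_bp / 1e6


def depth_x(total_bases_mb, genome_size_bp=GENOME_SIZE_BP):
    """Coverage depth: total bases divided by genome size."""
    return total_bases_mb * 1e6 / genome_size_bp


if __name__ == "__main__":
    # 200X filtered long reads, Filtered_subreads column
    total = total_mb(383482, 2422.877720)
    print(f"total: {total:.2f} Mb, depth: {depth_x(total):.2f}X")
```

The same two functions can be applied to any other column of the table to check its total and depth.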
runCA unitigger=bogart merSize=14 ovlMinLen=<ovl value> utgErrorRate=0.015 utgGraphErrorRate=0.015 utgGraphErrorLimit=0 utgMergeErrorRate=0.03 utgMergeErrorLimit=0 -p asm -d asm viaMiseq.frg
The <ovl value> parameter was set to approximately 40% of the average corrected sequence length (ref). As a general rule, choose it from the average corrected length as follows: below 2.5 Kbp, use 1000; from 2.5 Kbp to below 3 Kbp, use 1500; from 3 Kbp to below 5.5 Kbp, use 2000; from 5.5 Kbp to 6.5 Kbp, use 2500; above 6.5 Kbp, use 3000.
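
The rule of thumb above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and the handling of values exactly on a threshold (for example exactly 5.5 Kbp or 6.5 Kbp) is an assumption, since the text does not say which side they fall on.

```python
def choose_ovl_min_len(avg_corrected_len_bp):
    """Pick an ovlMinLen value from the average corrected read length (bp).

    Thresholds follow the rule of thumb in the text; treatment of values
    exactly on a boundary is an assumption.
    """
    if avg_corrected_len_bp < 2500:
        return 1000
    if avg_corrected_len_bp < 3000:
        return 1500
    if avg_corrected_len_bp < 5500:
        return 2000
    if avg_corrected_len_bp <= 6500:
        return 2500
    return 3000


if __name__ == "__main__":
    # Average corrected lengths taken from the genomeSize=4650000 rows
    for avg_len in (2198.455315, 2815.903020, 3495.604515, 4927.683492):
        print(avg_len, choose_ovl_min_len(avg_len))
```

The returned value would be substituted for `<ovl value>` in the runCA command.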
The PBcR were filtered down to 25X depth and then assembled with runCA.
| genomeSize=4650000, 25X | Two SMRT cells | Three SMRT cells | Four SMRT cells |
|---|---|---|---|
| Number of sequences | 40382 | 24448 | 21641 |
| Average length (bp) | 2878.787529 | 4754.996196 | 5371.762719 |
| Total (Mb) | 116.25 | 116.25 | 116.25 |
| Depth (X) | 25.00 | 25.00 | 25.00 |
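
The text does not say how the reads were filtered to 25X. One possibility, consistent with the average length rising as more SMRT cells are pooled, is that the longest corrected reads were kept until the target depth was reached. The sketch below illustrates that assumed strategy only; the function name is hypothetical and this is not necessarily the procedure used here.

```python
def select_longest_to_depth(read_lengths, genome_size_bp, target_depth):
    """Keep the longest reads until their summed length reaches the target.

    Assumed strategy (not confirmed by the text): sort by length, longest
    first, and stop once total bases >= genome_size_bp * target_depth.
    Returns the list of kept read lengths.
    """
    target_bases = genome_size_bp * target_depth
    kept = []
    total = 0
    for length in sorted(read_lengths, reverse=True):
        if total >= target_bases:
            break
        kept.append(length)
        total += length
    return kept


if __name__ == "__main__":
    # Toy example: a 100 bp "genome" at 8X needs 800 bases
    kept = select_longest_to_depth([100, 500, 300, 200], 100, 8)
    print(kept, sum(kept))
```

For the data in this document the target would be 25 x 4650000 bp = 116.25 Mb, which matches the total shown in every column of the table.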