[Summary]
pacBioToCA estimates the genome size itself when the genomeSize parameter is not given. With only one or two SMRT cells, the results are nearly identical with or without genomeSize. With more than two SMRT cells, passing genomeSize helps pacBioToCA work from the correct genome size, and the corrected output changes noticeably.
pacBioToCA -l viaMiseq -s pacbio.spec -t 10 -partitions 200 fastqFile=m120228_192221.fastq genomeSize=4650000 ../../tMiSeq_PE.frg
m120228_192221_42129_c100298890010000001523009207231260_s1_p0.fastq
seqs amount:38542
seq avg len:2322.679985
total:89.52 Mb
depth: 19.25X
(without genomeSize)PacBio_Illumia.fasta
seqs amount:34981
seq avg len:2133.783826
total:74.64 Mb
depth: 16.05X
(with genomeSize=4650000)viaMiseq.fasta
seqs amount:34852
seq avg len:2130.841559
total:74.26 Mb
depth: 15.97X => With only one SMRT cell, genomeSize makes little difference.
pacBioToCA -l viaMiseq -s pacbio.spec -t 10 -partitions 200 fastqFile=Filtered_two.fastq genomeSize=4650000 ../tMiSeq_PE.frg
Filtered_two.fastq
seqs amount:77117
seq avg len:2184.208709
total:168.44 Mb
depth: 36.22X
(without genomeSize)PacBio_Illumia.fasta
seqs amount:63760
seq avg len:2199.845561
total:140.26 Mb
depth: 30.16X
(with genomeSize=4650000)viaMiseq.fasta
seqs amount:63411
seq avg len:2198.455315
total:139.41 Mb
depth: 29.98X => With only two SMRT cells, genomeSize makes little difference.
pacBioToCA -l viaMiseq -s pacbio.spec -t 10 -partitions 200 fastqFile=Filtered_three.fastq genomeSize=4650000 ../tMiSeq_PE.frg
Filtered_three.fastq
seqs amount:113284
seq avg len:2333.977711
total:264.40 Mb
depth: 56.86X
(without genomeSize)PacBio_Illumia.fasta
seqs amount:98165
seq avg len:2286.482249
total:224.45 Mb
depth: 48.27X
(with genomeSize=4650000)viaMiseq.fasta
seqs amount:70468
seq avg len:2815.903020
total:198.43 Mb
depth: 42.67X => genomeSize appears to have an effect only with three or more SMRT cells.
pacBioToCA -l viaMiseq -s pacbio.spec -t 10 -partitions 200 fastqFile=Filtered_four.fastq genomeSize=4650000 ../tMiSeq_PE.frg
Filtered_four.fastq
seqs amount:136333
seq avg len:2386.664674
total:325.38 Mb
depth: 69.97X
(without genomeSize)PacBio_Illumia.fasta
seqs amount:118901
seq avg len:2320.548322
total:275.92 Mb
depth: 59.34X
(with genomeSize=4650000)viaMiseq.fasta
seqs amount:56298
seq avg len:3495.604515
total:196.80 Mb
depth: 42.32X => genomeSize appears to have an effect only with three or more SMRT cells.
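The depth values above are consistent with total bases divided by the 4,650,000 bp genome size (for example, 89.52 Mb / 4.65 Mb ≈ 19.25X). A minimal sketch of how such a summary could be produced for a FASTA file is shown here. The function name fasta_stats is hypothetical, and this is not necessarily the script used for these notes. It assumes FASTA input and a non-zero genome size; FASTQ input would need four-line record parsing instead.

```shell
# Hypothetical helper: summarize a FASTA stream read from stdin.
# Usage: fasta_stats GENOME_SIZE_IN_BP < reads.fasta
fasta_stats() {
    awk -v gsize="$1" '
        /^>/ { n++; next }            # header line: count one sequence
        { total += length($0) }       # sequence line: add its bases
        END {
            printf "seqs amount:%d\n", n
            printf "seq avg len:%.6f\n", (n ? total / n : 0)
            printf "total:%.2f Mb\n", total / 1e6
            printf "depth: %.2fX\n", total / gsize   # depth = total bases / genome size
        }'
}
```

For example: fasta_stats 4650000 < viaMiseq.fasta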
1. Basic usage: runCA with asm.spec
runCA -p asm -d asm -s asm.spec PBcR.viaMiseq.frg > asm.out 2>&1
2. From the web: runCA with parameters (Celera_Assembler_Parameters):
runCA unitigger=bogart merSize=14 ovlMinLen=2000 utgErrorRate=0.015 utgGraphErrorRate=0.015 utgGraphErrorLimit=0 utgMergeErrorRate=0.03 utgMergeErrorLimit=0 -p asm -d asm asm.overCov.frg
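3. From the paper's script (asmCorrected.sh): runCA with asm.spec and parameters. The argument order used in asmCorrected.sh did not work. Reordering the arguments to follow the runCA usage string (usage: runCA -d <dir> -p <prefix> [options] <frg>) fixed it. Use the asm.spec shipped with wgs-package, found under D:\Boss Jade\201306\20130614_Hybrid assembly to_do_list\Filter_good_long_read\wgs-package\doc, but remember to set the grid options to 0 and turn off the sge settings (comment them out).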
runCA -p asm -d asm -s asm.spec unitigger=bogart utgErrorRate=0.015 ovlMinLen=2000 ovlErrorRate=0.03 cgwErrorRate=0.10 cnsErrorRate=0.10 utgGraphErrorLimit=0 utgGraphErrorRate=0.015 utgMergeErrorLimit=0 utgMergeErrorRate=0.03 asm.overCov.frg
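Since argument order was the cause of the failure, a small wrapper can keep the order fixed. The sketch below only prints a command line in the order given by the usage string (-d <dir> -p <prefix> [options] <frg>); it does not run runCA. The function name runca_cmd is hypothetical, and every option passed to it comes from the commands above.

```shell
# Hypothetical helper: print a runCA command line in usage order:
#   runCA -d <dir> -p <prefix> [options] <frg>
# Usage: runca_cmd DIR PREFIX FRG [options...]
runca_cmd() {
    dir=$1
    prefix=$2
    frg=$3
    shift 3
    # Rebuild the argument list: fixed flags first, then the options, then the frg file last.
    set -- runCA -d "$dir" -p "$prefix" "$@" "$frg"
    echo "$*"
}
```

For example: runca_cmd asm asm asm.overCov.frg -s asm.spec unitigger=bogart ovlMinLen=2000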