Version Differences for Dataset2

Line 1:
    + = Data =  
    + == Short reads ==  
    + Paired reads are available at [http://www.illumina.com/systems/miseq/scientific_data.ilmn Illumina Miseq] ([ftp://webdata:webdata@ussd-ftp.illumina.com/Data/SequencingRuns/MG1655/MiSeq_Ecoli_MG1655_110721_PF_R1.fastq.gz Mate1], [ftp://webdata:webdata@ussd-ftp.illumina.com/Data/SequencingRuns/MG1655/MiSeq_Ecoli_MG1655_110721_PF_R2.fastq.gz Mate2])  
       
    + Read length: 151bp  
       
    + Number of reads: 5,729,470 × 2  
       
    + Insert size ~ 300bp  
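As a rough check on the figures above, the expected sequencing depth can be estimated from the read count and read length. The sketch below assumes a genome size of 4,641,652 bp for E. coli K-12 MG1655, which is not stated on this page. The function name `coverage` is hypothetical.

```shell
# Estimate paired-end sequencing depth: 2 mates * pairs * read length / genome size.
# genome_size is an assumption (E. coli K-12 MG1655 reference length), not from this page.
genome_size=4641652
read_pairs=5729470
read_length=151

coverage() {
    # $1 = number of read pairs, $2 = read length (bp), $3 = genome size (bp)
    LC_ALL=C awk -v n="$1" -v l="$2" -v g="$3" 'BEGIN { printf "%.1f\n", 2 * n * l / g }'
}

coverage "$read_pairs" "$read_length" "$genome_size"   # prints 372.8
```

Under that assumption the full MiSeq run corresponds to roughly 373X of raw coverage, before any quality filtering.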
       
    + We converted the reads to the required format using Picard tools ([http://picard.sourceforge.net/command-line-overview.shtml ref]).  
       
    + java -jar SamToFastq.jar INPUT=Ecoli_MG1655_s_6_1_bfast.bam FASTQ=Ecoli_MG1655_s1.fastq  
    + java -jar SamToFastq.jar INPUT=Ecoli_MG1655_s_6_2_bfast.bam FASTQ=Ecoli_MG1655_s2.fastq  
       
    + We combined MiSeq_Ecoli_MG1655_110721_PF_R1.fastq and MiSeq_Ecoli_MG1655_110721_PF_R2.fastq into MiSeq_PE.fastq.  
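The page does not say how the two mate files were combined. One common convention is to interleave them so that each mate 1 record is followed by its mate 2 record. The following is a minimal sketch of that approach, assuming both files list reads in the same order and use plain four-line FASTQ records. The function name `interleave_fastq` is hypothetical and is not the command used by the authors.

```shell
# Interleave two FASTQ files record by record (assumes 4 lines per record,
# identical read order in both files, and no wrapped sequence lines).
interleave_fastq() {
    # $1 = mate 1 FASTQ, $2 = mate 2 FASTQ; the result is written to stdout
    awk -v r2="$2" '{
        print                                   # current line of the mate 1 record
        if (NR % 4 == 0) {                      # mate 1 record is complete
            for (i = 0; i < 4; i++) {           # emit the matching mate 2 record
                if ((getline line < r2) > 0) print line
            }
        }
    }' "$1"
}

# Example usage (file names taken from the text above):
# interleave_fastq MiSeq_Ecoli_MG1655_110721_PF_R1.fastq MiSeq_Ecoli_MG1655_110721_PF_R2.fastq > MiSeq_PE.fastq
```

If the second file has fewer records than the first, the remaining mate 1 records are written without a partner, so comparing the line counts of both inputs beforehand is a sensible precaution.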
       
    + java convertFastqToFastaAndQual MiSeq_PE.fastq MiSeq_PE.fna MiSeq_PE.qual  
    + convert-fasta-to-v2.pl -mean 214 -stddev 21 -m Mate_info -l Illumia_Ecoli -s Ecoli_MG1655_PE.fna -q Ecoli_MG1655_PE.qual > Ecoli_MG1655_PE.frg  
    + == Long reads ==  
    + One SMRT Cell of 10 kbp continuous long reads (CLR) for Escherichia coli K12 MG1655 was downloaded from [https://github.com/PacificBiosciences/DevNet/wiki/E%20coli%20K12%20MG1655%20Hybrid%20Assembly this link].  
       
    + The file PacBio_10kb_CLR.fastq contains ~21X coverage of E. coli CLR reads from a 10 kb library, filtered using the standard PacBio thresholds (minimum RQ=0.75, RL=50bp) ([http://files.pacb.com/datasets/secondary-analysis/e-coli-k12-de-novo/1.3.0/README.txt ref]).  
    + = Assemblies with short reads only =  
    + ABySS, Edena, SPAdes, SOAPdenovo2, Velvet, CISA (note at: 20130807_MG1655_s1s2_with_verious_Assemblers)  
    + MaSuRCA (note at: MaSuRCA assembler)  
       
    + = PacBio corrected reads (PBcR) =  
    + To correct the PacBio CLR with raw short reads:  
    + pacBioToCA -length 1000 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq Ecoli_MG1655_PE.frg 1>PacBio_Illumia_Ecoli.out 2>error.out  
       
    + To correct the PacBio CLR with 100X of high-quality reads (p<0.05, paired-end read length >= 100bp):  
    + pacBioToCA -length 500 -partitions 200 -l PacBio_Illumia -s pacbio.spec -fastq PacBio_10kb_CLR.fastq 100X_Ecoli_PE.frg  
    + = Assemblies with hybrid methods =  
       
    + == Assembly of corrected long reads ==  
    + runCA (20130807_MG1655_s1s2_with_verious_Assemblers for 100X, 20120628_20120628_PacBio_With_CA(Celera Assembler)_Wgs), MIRA3  
       
    + == Hybrid assembly from pre-assembled contigs and long reads ==  
       
    + AHA, PBJelly, Cerulean, Patch