E. coli

Revision as of 17 October 2013 02:40 by admin (Comments | Contribs)

The Illumina sequencing data are available from the ALLPATHS-LG website. For details, see "Finished bacterial genomes from shotgun sequence data", Genome Research, 2012.

Website data

The Illumina and PacBio data were downloaded from the ALLPATHS-LG website: ecoli_data_alt.tar.gz

Fragment library

Reads length : 101bp
Reads amount : 1186190 X2
Insert size : 180bp
Coverage : 46.02X

Jumping library 1

Reads length : 93bp
Reads amount : 1615702 X2
Insert size : 3000bp

Jumping library 2

Reads length : 93bp
Reads amount : 362199 X2
Insert size : 3000bp

PacBio reads

Reads average length : 1514.24bp
Reads amount : 409304
Coverage : 133.58X

Raw data

The raw data from which the website data were drawn are available from the Sequence Read Archive (SRA).

Fragment library

Accession : SRX131033
Reads length : 101bp
Reads amount : 13457571 X2
Insert size : 180bp
Coverage : 522.1X
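
The coverage figures for the fragment library can be cross-checked from the read counts. The following is a minimal sketch, not part of the original workflow: the `coverage` helper is hypothetical, and the genome size of about 5.2 Mb is an assumption back-calculated from the numbers on this page (it is not stated in the source).

```shell
#!/bin/sh
# coverage PAIRS READ_LEN GENOME_SIZE
# Hypothetical helper: depth of a paired-end library, printed to two decimals.
# Depth = read pairs * 2 reads per pair * read length / genome size.
coverage() {
    LC_ALL=C awk -v pairs="$1" -v len="$2" -v genome="$3" \
        'BEGIN { printf "%.2f\n", pairs * 2 * len / genome }'
}

# ASSUMPTION: genome size in bp, back-calculated from the page's own figures.
GENOME_SIZE=5206700

coverage 13457571 101 "$GENOME_SIZE"   # raw fragment library, prints 522.10
coverage 1186190 101 "$GENOME_SIZE"    # website fragment library, prints 46.02
```

Both results agree with the coverage values listed for the raw and website fragment libraries, which suggests the two figures were computed against the same genome size.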

Jumping library 1

Accession : SRX117481

Jumping library 2

Accession : SRR492488

PacBio reads

Accession : SRX109917, SRX109901 (SRR386913, SRR387092, SRR386907, SRR387035), SRX109936

Self-fraction data

We used prepare.sh to randomly select, from the fragment library of the raw data, the same fraction of reads as in the website data.

Fraction = 1186190 / 13457571 = 0.088

PrepareAllPathsInputs.pl \
  DATA_DIR=$PWD/test.genome/data \
  PLOIDY=1 \
  FRAG_FRAC=0.088 \
  IN_GROUPS_CSV=in_groups.csv \
  IN_LIBS_CSV=in_libs.csv \
  OVERWRITE=True \
  | tee prepare.out

100X fragment reads

We used prepare.sh to randomly select reads amounting to 100X coverage from the fragment library of the raw data.

Fraction = 100 / 522.1 = 0.192

PrepareAllPathsInputs.pl \
  DATA_DIR=$PWD/test.genome/data \
  PLOIDY=1 \
  FRAG_FRAC=0.192 \
  IN_GROUPS_CSV=in_groups.csv \
  IN_LIBS_CSV=in_libs.csv \
  OVERWRITE=True \
  | tee prepare.out
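
The FRAG_FRAC values used on this page can be recomputed from the coverage figures. This is a minimal sketch under stated assumptions: `frag_frac` is a hypothetical helper, not part of ALLPATHS-LG or of prepare.sh, and it only divides the target coverage by the raw fragment-library coverage (522.1X).

```shell
#!/bin/sh
# frag_frac TARGET_COVERAGE RAW_COVERAGE
# Hypothetical helper: fraction of raw reads to keep, printed to three decimals.
frag_frac() {
    LC_ALL=C awk -v target="$1" -v raw="$2" \
        'BEGIN { printf "%.3f\n", target / raw }'
}

frag_frac 100 522.1     # 100X subset, prints 0.192
frag_frac 46.02 522.1   # same depth as the website data, prints 0.088
```

The printed value could then be passed as FRAG_FRAC to PrepareAllPathsInputs.pl, as in the commands above.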