OVERWRITE=True\
| tee prepare.out

== Evaluation ==
* '''Benchmark genome'''
: [ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Escherichia_coli_K_12_substr__MG1655_uid57779/ E. coli MG1655]
* '''Evaluated by QUAST'''
: [http://bioinf.spbau.ru/en/quast/ QUAST] v2.2
: Running QUAST requires [[Media:R_sphaeroides.ncbi.gz|gene]] and [[Media:R_sphaeroides.fna.gz|sequence]] information for the reference genome. The gene file lists 4497 genes in total.
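
A QUAST run with the two files above can be sketched as follows. This is a minimal illustration, not the exact command used for the tables below: the assembly file name <code>assembly.fasta</code>, the output directory <code>quast_out</code>, and the unpacked file names are assumptions, and the helper only builds the command line without executing it. The <code>-R</code> (reference), <code>-G</code> (genes) and <code>-o</code> (output directory) options are QUAST's standard flags.

```python
import shlex


def build_quast_command(contigs, reference, genes, out_dir="quast_out"):
    """Return the argument list for a QUAST run; nothing is executed here."""
    # -R: reference sequence, -G: gene annotation, -o: output directory
    return ["quast.py", "-R", reference, "-G", genes, "-o", out_dir, *contigs]


cmd = build_quast_command(
    contigs=["assembly.fasta"],        # hypothetical assembly to evaluate
    reference="R_sphaeroides.fna",     # assumed name after unpacking the .gz
    genes="R_sphaeroides.ncbi",        # assumed name after unpacking the .gz
)
print(shlex.join(cmd))
```

The resulting list can be handed to <code>subprocess.run</code> once QUAST is installed and the file names match the local setup.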
*'''Score with QUAST: With PacBio Long Reads''' more detail
{| {{table}} border="1"
| align="left" style="background:#f0f0f0;"|'''Basic statistics'''
| align="left" style="background:#f0f0f0;"|'''Raw Data'''
| align="left" style="background:#f0f0f0;"|'''Website Data'''
| align="left" style="background:#f0f0f0;"|'''Self-fraction Data'''
| align="left" style="background:#f0f0f0;"|'''100 Coverage'''
|-
| # contigs||14||1||1||1
|-
| Largest contig||4625005||4638970||4638970||4638970
|-
| Total length||4652215||4638970||4638970||4638970
|-
| N50||4625005||4638970||4638970||4638970
|-
| align="left" style="background:#f0f0f0;"|'''Misassemblies'''
|-
| # misassemblies||5||1||1||1
|-
| Misassembled contigs length||4625005||4638970||4638970||4638970
|-
| align="left" style="background:#f0f0f0;"|'''Mismatches'''
|-
| # mismatches per 100kbp||1.06||0.11||0.06||0.06
|-
| # indels per 100kbp||0.61||0.09||0.09||0.11
|-
| # N's per 100kbp||282.94||0||0||0
|-
| align="left" style="background:#f0f0f0;"|'''Genome statistics'''
|-
| Genome fraction (%)||99.418||99.983||99.983||99.983
|-
| Duplication ratio||1.013||1||1||1
|-
| # genes||4471 + 2 part||4494 + 1 part||4494 + 1 part||4495 + 0 part
|-
| NGA50||2714032||4638970||3763133||4032768
|}
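
The N50 row can be read with the usual definition in mind: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly length. The sketch below illustrates that definition only. It is not QUAST's own code, and the function name and the toy lengths are made up for the example.

```python
def n50(contig_lengths):
    """Length of the contig at which the cumulative sum of contigs,
    sorted longest first, reaches at least half of the total length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


# Toy example: total 290, half 145; 80 + 70 = 150 reaches it, so N50 is 70.
toy_n50 = n50([80, 70, 50, 40, 30, 20])

# A single-contig assembly has N50 equal to its only contig.
single_n50 = n50([4638970])
```

This matches the table: whenever the largest contig alone covers more than half of the total length, N50 equals the largest contig. NGA50 applies the same idea to aligned blocks (contigs broken at misassemblies) and measures against the reference length, which is why it can be smaller than N50.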
*'''Score with QUAST: Without PacBio Long Reads''' more detail
{| {{table}} border="1"
| align="left" style="background:#f0f0f0;"|'''Basic statistics'''
| align="left" style="background:#f0f0f0;"|'''Raw Data'''
| align="left" style="background:#f0f0f0;"|'''Website Data'''
| align="left" style="background:#f0f0f0;"|'''Self-fraction Data'''
| align="left" style="background:#f0f0f0;"|'''100 Coverage'''
|-
| # contigs||1||2||5||3
|-
| Largest contig||4633080||4631220||4575759||4560636
|-
| Total length||4633080||4633146||4698903||4713335
|-
| N50||4633080||4631220||4575759||4590636
|-
| align="left" style="background:#f0f0f0;"|'''Misassemblies'''
|-
| # misassemblies||7||3||8||8
|-
| Misassembled contigs length||4633080||4631220||4577746||4711603
|-
| align="left" style="background:#f0f0f0;"|'''Mismatches'''
|-
| # mismatches per 100kbp||1.42||1.19||2.84||1.89
|-
| # indels per 100kbp||0.83||1.13||3.26||0.61
|-
| # N's per 100kbp||1545.02||533.22||698.87||801.7
|-
| align="left" style="background:#f0f0f0;"|'''Genome statistics'''
|-
| Genome fraction (%)||98.343||99.345||99.265||99.284
|-
| Duplication ratio||1.016||1.012||1.021||1.028
|-
| # genes||4395 + 31 part||4465 + 11 part||4460 + 14 part||4459 + 9 part
|-
| NGA50||687701||3180483||654008||1295677
|}
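
The "# genes" cells can be put in relation to the 4497 genes mentioned above. The following is a small sketch under that assumption: it parses a cell of the form "N + M part" and reports the share of genes recovered completely. The function names are hypothetical, and the two cells are copied from the table without PacBio long reads.

```python
import re

TOTAL_GENES = 4497  # total number of genes stated in this section


def parse_gene_cell(cell):
    """Split a cell such as '4395 + 31 part' into (complete, partial)."""
    match = re.fullmatch(r"\s*(\d+)\s*\+\s*(\d+)\s*part\s*", cell)
    if match is None:
        raise ValueError(f"unexpected gene cell: {cell!r}")
    return int(match.group(1)), int(match.group(2))


def percent_complete(cell, total=TOTAL_GENES):
    """Percentage of all genes that were recovered completely."""
    complete, _partial = parse_gene_cell(cell)
    return 100.0 * complete / total


for label, cell in [("Raw Data", "4395 + 31 part"),
                    ("Website Data", "4465 + 11 part")]:
    print(f"{label}: {percent_complete(cell):.2f}% of genes complete")
```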