About MrBac

MrBac facilitates genome-scale metabolic network reconstruction for bacteria through genome comparison by
    (i) performing reciprocal BLAST between query and reference bacterial ORFs of the user's choice,
    (ii) retrieving gene annotations extracted from metabolism-related databases under the specified BLAST parameter settings, and
    (iii) reconstructing the metabolic network of the query species from a reference network distributed in Systems Biology Markup Language (SBML).

By reusing published metabolic network reconstructions in SBML, MrBac provides an automated service for generating draft genome-scale metabolic networks. It benefits the systems biology research community by minimizing the time needed to go from genome ORFs to a draft genome-scale metabolic network.

The user selects a bacterial species of interest from the menu, paired with one reference species at a time, and chooses an e-value and a percent identity as the thresholds for the reciprocal blastp best ORF hits. Further parameters can be adjusted to filter the best hits before retrieving the metabolic gene annotations extracted from KEGG, NCBI COG and TransportDB. Draft metabolic network generation requires a list of reciprocal BLAST best hits as well as the reference SBML file. The draft can be saved in SBML format as the final output and then curated and edited further in flux balance analysis software.
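
The reciprocal best-hit selection described above can be sketched as follows. This is a minimal illustration, not MrBac's actual implementation: the hit tuple layout, the function names, and the default thresholds are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit record: (query_id, subject_id, percent_identity, evalue, bitscore)
Hit = Tuple[str, str, float, float, float]


def best_hits(hits: Iterable[Hit], max_evalue: float, min_identity: float) -> Dict[str, str]:
    """Map each query to its highest-scoring subject among hits passing both thresholds."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        query, _subject, identity, evalue, bitscore = hit
        if evalue > max_evalue or identity < min_identity:
            continue  # hit fails the user-selected thresholds
        if query not in best or bitscore > best[query][4]:
            best[query] = hit
    return {query: hit[1] for query, hit in best.items()}


def reciprocal_best_hits(
    forward: Iterable[Hit],
    backward: Iterable[Hit],
    max_evalue: float = 1e-5,    # assumed default, not taken from MrBac
    min_identity: float = 30.0,  # assumed default, not taken from MrBac
) -> List[Tuple[str, str]]:
    """Keep (query ORF, reference ORF) pairs that are each other's best hit."""
    fwd = best_hits(forward, max_evalue, min_identity)   # query -> reference
    bwd = best_hits(backward, max_evalue, min_identity)  # reference -> query
    return sorted((q, r) for q, r in fwd.items() if bwd.get(r) == q)
```

A pair is kept only when the best hit of the query ORF in the reference genome points back to the same query ORF, so tightening either threshold can only shrink the list of pairs.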

Flow Chart

Annotation Sources

KEGG KO & Enzyme

We retrieved 855 bacterial organisms from the KEGG database.
The KEGG Orthology (KO) database consists of manually defined ortholog groups that correspond to KEGG pathway nodes and BRITE hierarchy nodes.
The KEGG Enzyme database is an implementation of the Enzyme Nomenclature (EC numbers) of the IUBMB/IUPAC biochemical nomenclature committee.
The metabolic reactions catalyzed by the enzymes whose EC numbers are assigned to the genes can give the user clues about the functions present in the organism.

NCBI COG

The database of Clusters of Orthologous Groups of proteins (COGs) classifies functions of proteins encoded in available whole genome sequences through comparison of protein sequences.
The COGs functional categories can be divided into: INFORMATION STORAGE AND PROCESSING, CELLULAR PROCESSES AND SIGNALING, METABOLISM, and POORLY CHARACTERIZED.
For metabolic network reconstruction, the user can thus focus on the ORFs in the METABOLISM category.
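
Restricting the ORFs to the METABOLISM category can be sketched as below. The one-letter codes are the standard COG functional category letters grouped under METABOLISM (C, G, E, F, H, I, P, Q); the input mapping from ORF identifier to category string, the function name, and the sample identifiers are hypothetical.

```python
from typing import Dict, List

# COG one-letter categories grouped under METABOLISM:
# C energy, G carbohydrate, E amino acid, F nucleotide,
# H coenzyme, I lipid, P inorganic ion, Q secondary metabolites.
METABOLISM_CODES = frozenset("CGEFHIPQ")


def metabolic_orfs(orf_to_categories: Dict[str, str]) -> List[str]:
    """Return ORFs with at least one METABOLISM category letter.

    A COG may carry several letters (for example "EH"), so any overlap counts.
    """
    return sorted(
        orf
        for orf, categories in orf_to_categories.items()
        if METABOLISM_CODES & set(categories)
    )


# Hypothetical input: ORF identifier -> COG category letters.
example = {"orf_001": "C", "orf_002": "J", "orf_003": "EH", "orf_004": "S"}
print(metabolic_orfs(example))  # ['orf_001', 'orf_003']
```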

TransportDB

TransportDB is a database of information on cytoplasmic membrane transporters and outer membrane channels in organisms. The annotations are based on experimental and bioinformatics evidence, and classifications are made according to the mode of transport, bioenergetics, molecular phylogeny and substrate specificities.
We retrieved 274 bacterial organisms (excluding 14 organisms whose GeneIDs do not match RefSeq).
Organism names from TransportDB that do not exactly match the RefSeq database were edited manually (for example, Thermodesulfovibrio yellowstonii DSM11347 vs Thermodesulfovibrio yellowstonii DSM 11347); for organisms with a slightly different GeneID format, the GeneIDs were matched to the corresponding GeneIDs in RefSeq (for example, FTT_0001 vs FTT0001).
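
The text describes manual editing; the sketch below shows one way such near-matches could be found automatically, by comparing identifiers after removing underscores and whitespace. It is an illustration under that assumption, not the procedure used for MrBac, and the function names are hypothetical. Aggressive normalization can merge identifiers that are genuinely distinct, so the unmatched list and any collisions would still need manual review.

```python
import re
from typing import Dict, Iterable, List, Tuple


def canonical(identifier: str) -> str:
    """Lowercase and drop underscores and whitespace, so that
    'FTT_0001' and 'FTT0001' (or 'DSM11347' and 'DSM 11347') compare equal."""
    return re.sub(r"[\s_]+", "", identifier).lower()


def match_to_refseq(
    transportdb_ids: Iterable[str], refseq_ids: Iterable[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (TransportDB ID -> RefSeq ID) plus the IDs with no counterpart."""
    index = {canonical(r): r for r in refseq_ids}
    matched: Dict[str, str] = {}
    unmatched: List[str] = []
    for tid in transportdb_ids:
        rid = index.get(canonical(tid))
        if rid is None:
            unmatched.append(tid)
        else:
            matched[tid] = rid
    return matched, unmatched
```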

SBML Sources

The Systems Biology Markup Language (SBML) is a machine-readable format for describing qualitative and quantitative models of biochemical networks. It is applicable to simulations of metabolism, cell signaling, and many other topics.

Currently, over 1000 bacterial genomes have been fully sequenced, whereas only about 20 organism-specific genome-scale metabolic models have been constructed. A list of currently available genome-scale metabolic reconstructions can be found. Among the reconstructions, the models published in SBML format can be used as reference species for the draft metabolic reconstruction of related species.

Organism                                  | Version | Publication        | Metabolic network file | Note
Escherichia coli str. K-12 substr. MG1655 | iAF1260 | Feist et al.       | SBML                   |
Mycoplasma genitalium                     | iPS189  | Suthers et al.     | SBML                   | MG### -> MG_###
Thermotoga maritima MSB8                  |         | Zhang et al.       | SBML                   | TM_#### -> TM####
Salmonella                                | iMA945  | AbuOun et al.      | SBML                   | jbc.M109.005868-7.txt -> iMA945.xml
Staphylococcus aureus N315                | iSB619  | Becker and Palsson | SBML                   | Exported from BiGG
Helicobacter pylori strain 26695          | iIT341  | Thiele et al.      | SBML                   | Exported from BiGG
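
The gene identifier changes listed in the Note column (MG### -> MG_### and TM_#### -> TM####) can be expressed as regular-expression substitutions. The sketch below assumes that each # stands for exactly one digit and that the change is applied to the text of the SBML file; both points are interpretations of the table, and the function names are hypothetical.

```python
import re


def mg_add_underscore(text: str) -> str:
    """Rewrite MG### as MG_### (three digits); existing MG_### is left alone."""
    return re.sub(r"\bMG(\d{3})\b", r"MG_\1", text)


def tm_drop_underscore(text: str) -> str:
    """Rewrite TM_#### as TM#### (four digits); existing TM#### is left alone."""
    return re.sub(r"\bTM_(\d{4})\b", r"TM\1", text)


print(mg_add_underscore("(MG001 and MG_002)"))   # (MG_001 and MG_002)
print(tm_drop_underscore("TM_0001 or TM0002"))   # TM0001 or TM0002
```

The word boundaries keep the patterns from touching longer identifiers that merely contain the same prefix and digits.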

Recently, a web-based resource, Model SEED, has been built for generating genome-scale metabolic models for more than 100 bacterial organisms. These metabolic models can be used as references in MrBac to construct metabolic networks that do not yet exist for bacterial species whose genomes have already been sequenced. Please note that the small corrections listed below must be made to SBML models downloaded from Model SEED before importing them into MrBac.
GENE_ASSOCIATION -> SEED_ASSOCIATION
GENE_LOCUS_TAG -> GENE ASSOCIATION
&amp; -> &
'' -> "
&apos; -> '
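
The two label renames at the top of the list can be applied as ordered string replacements. The following is a minimal sketch: the function name `apply_rules` and the sample text are hypothetical, and the character-level corrections from the list are not built in but can be passed as additional (old, new) pairs.

```python
from typing import Iterable, Tuple

# The two label renames from the list above, in the listed order.
LABEL_RULES: Tuple[Tuple[str, str], ...] = (
    ("GENE_ASSOCIATION", "SEED_ASSOCIATION"),
    ("GENE_LOCUS_TAG", "GENE ASSOCIATION"),
)


def apply_rules(text: str, rules: Iterable[Tuple[str, str]] = LABEL_RULES) -> str:
    """Apply each (old, new) replacement to the text, in order."""
    for old, new in rules:
        text = text.replace(old, new)
    return text


# Hypothetical fragment of a notes section, for illustration only.
sample = "<p>GENE_ASSOCIATION:peg.1</p><p>GENE_LOCUS_TAG:b0001</p>"
print(apply_rules(sample))
# <p>SEED_ASSOCIATION:peg.1</p><p>GENE ASSOCIATION:b0001</p>
```

To correct a whole model, read the downloaded file as text, pass its contents through `apply_rules`, and write the result to a new file so that the original download stays untouched.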


Division of Biostatistics and Bioinformatics,
Institute of Population Health Sciences,
National Health Research Institutes, Zhunan, Taiwan
All Rights Reserved
Last Updated: 04/27/2011