Mathematics and Computer Science Division
Computational Biology Group
 
  WIT2 
WIT2 is an integrated system for genetic sequence analysis and comparative analysis of sequenced genomes; it also supports metabolic reconstructions from the sequence data. WIT2 now contains data from 38 completely or almost completely sequenced genomes. It provides access to thoroughly annotated genomes within a framework of metabolic reconstructions connected to the sequence data, together with protein alignments, phylogenetic trees, and data on gene clusters, potential operons, and functional domains.
EMP
 
EMP is the largest publicly available database of Enzymes and Metabolic Pathways in the world. It is a collaborative effort between Argonne National Laboratory and The Institute of Theoretical Biophysics (Russian Academy of Sciences, Puschino, Russia), led by Prof. E. Selkov. It is a key resource for developing detailed metabolic reconstructions for newly sequenced genomes far more rapidly than researchers even a few years ago would have thought possible. The EMP/MPW database currently contains 28,100 records, which encode the full factual content of 17,500 publications describing more than 8,000 organisms. The database contains information on 3,900 enzymes, including data on enzyme specificity, enzymological constants, purification protocols, regulation, inhibitors, and activators.
MPW
MPW is Prof. Selkov's collection of metabolic and functional diagrams. The MPW collection contains more than 3,000 metabolic and functional diagrams representing functionality in 211 different organisms.
PUMA2 
(a prototype)  
PUMA2 is an environment for comparative analysis of metabolic subsystems and automated reconstruction of the metabolism of microbial consortia and individual organisms from sequence data.
Analyses in PUMA2 will be based on a collection of metabolic modules connected to sequence data. The results of such analyses will be presented in graphical form, based on a hierarchical representation of the functional subsystems and annotated with sequence data and literature information.
 SENTRA
SENTRA is a database of sensory signal transduction proteins. It includes information about sensory transduction proteins in the 38 WIT genomes, as well as annotated data from the SwissProt and EMBL databases.
Other databases
A Database of Translation-Associated Proteins -- This project involves analysis of all translation factors in complete archaeal genomes.
A Database of Transcription-Associated Proteins -- This project involves analysis of all transcription factors in complete archaeal genomes. These projects were carried out by Dr. Nikos Kyrpides in collaboration with UIUC.
Database of Phenotypes -- We have started developing a database of phenotypic data for the sequenced genomes. We believe that the availability of phenotypic data will enhance annotation of the genomes and will aid in the curation and development of metabolic reconstructions in PUMA.
 PatScan  
PatScan is a publicly accessible web application that enables users to search for patterns in nucleotide and protein sequences. User-defined patterns (e.g. potential stem-loop structures in nucleotide sequences, or motifs in protein sequences) can be searched against protein sequence databases such as SWISS-PROT, TrEMBL, and PDB, as well as the EMBL nucleotide sequence database.
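As a rough illustration of what searching for a protein motif involves, the following is a minimal Python sketch built on ordinary regular expressions. It does not use PatScan's own pattern syntax or implementation. The function name `find_motif` and the example sequences are hypothetical; the motif shown is the widely used N-glycosylation site pattern N-{P}-[ST]-{P}, rewritten as a regular expression.

```python
import re

# N-glycosylation site motif N-{P}-[ST]-{P} as a regular expression.
# The lookahead wrapper lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(sequence, pattern=MOTIF):
    """Return (1-based start position, matched text) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical protein sequences, used only for illustration.
sequences = {
    "seq1": "MKNGSAQLNPTW",
    "seq2": "MAAAPLLV",
}

for name, seq in sequences.items():
    print(name, find_motif(seq))
```

Nucleotide patterns such as stem-loops need base-pairing constraints between two parts of the match, which plain regular expressions do not express well; that is one reason to use a dedicated pattern language for them.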
 
 

Computational Biology Group, Mathematics and Computer Science Division, Argonne National Laboratory
dsouza@mcs.anl.gov, maltsev@mcs.anl.gov, evgeni@mcs.anl.gov, selkovjr@mcs.anl.gov