Using Metacomputing Tools to Facilitate
Large-Scale Analyses of Biological Databases

Allison Waugh, Glenn A. Williams, Liping Wei, and Russ B. Altman
Stanford Medical Informatics
Stanford University

Pacific Symposium on Biocomputing, January, 2001


We use a distributed computing environment, LEGION, to enable large-scale computations on the
Protein Data Bank (PDB). In particular, we employ the FEATURE program to scan all protein structures
in the PDB in search of unrecognized potential cation binding sites. FEATURE is a site-characterization
and recognition system that identifies functional or structural sites of interest in a query protein.
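The workload described above has a simple shape: each structure can be scanned independently of every other, so the job splits into one task per PDB entry. The following is a minimal local sketch of that fan-out pattern using only the Python standard library. It is a stand-in for illustration, not the LEGION interface; `scan_all` and the `scan_one` callable (which would wrap a single FEATURE run) are hypothetical names introduced here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, Iterable, Tuple


def scan_all(
    pdb_ids: Iterable[str],
    scan_one: Callable[[str], Any],
    max_workers: int = 4,
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run scan_one on every PDB ID independently.

    scan_one is a hypothetical caller-supplied function that scans one
    structure. Returns (results, failures): results maps each PDB ID to
    whatever scan_one returned; failures maps each PDB ID whose scan
    raised to a description of the error.
    """
    results: Dict[str, Any] = {}
    failures: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One task per structure; no task depends on any other.
        futures = {pool.submit(scan_one, pid): pid for pid in pdb_ids}
        for future in as_completed(futures):
            pid = futures[future]
            try:
                results[pid] = future.result()
            except Exception as exc:
                # Record the failure and keep going, so one bad entry
                # does not abort the whole run.
                failures[pid] = repr(exc)
    return results, failures
```

Recording failures per entry rather than stopping at the first error suits a run over many thousands of structures, where a few malformed files are to be expected.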


The major steps in preparing for a LEGION run are:

  1. Downloading, unpacking, and installing the LEGION binaries and setup scripts
    (see http://www.legion.virginia.edu/ for further instructions).

  2. Writing a LEGION makefile. This file is used to compile and register the code for various architectures.

  3. Creating a LEGION specfile. This short file specifies where to receive input and where to deposit output.



The output from a FEATURE scan on a query protein consists of a list of the (x,y,z) location
and score for each of the predicted sites.
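A list of this kind is straightforward to post-process. The sketch below assumes, purely for illustration, that each predicted site is written as one line of four whitespace-separated numbers (`x y z score`) and that lines starting with `#` are comments; the actual FEATURE output layout is not specified here and may differ. The names `PredictedSite`, `parse_sites`, and `top_sites` are hypothetical.

```python
from typing import List, NamedTuple


class PredictedSite(NamedTuple):
    x: float
    y: float
    z: float
    score: float


def parse_sites(text: str) -> List[PredictedSite]:
    """Parse lines of the ASSUMED form 'x y z score' into records."""
    sites = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
        x, y, z, score = (float(f) for f in fields)
        sites.append(PredictedSite(x, y, z, score))
    return sites


def top_sites(sites: List[PredictedSite], cutoff: float) -> List[PredictedSite]:
    """Return sites scoring at or above cutoff, highest score first."""
    kept = [s for s in sites if s.score >= cutoff]
    return sorted(kept, key=lambda s: s.score, reverse=True)
```

A caller would read one output file into a string, pass it to `parse_sites`, and then use `top_sites` with a score cutoff of their choosing to rank the candidate sites.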

Output from our FEATURE scan of the entire PDB for cation binding sites can be viewed by
  1. Entering the PDB ID of the structure of interest

    or

  2. Choosing a file from the list of all available output files.