Probabilistic Methods for Structure Computation and Display

This project, supported by NIH NLM-05652, is part of the Helix Group at Stanford School of Medicine. Please address inquiries to russ.altman@stanford.edu.



1. Summary of Project Goals

The importance of good structural models in biomedicine cannot be overstated. At the organ level, models can be used for surgical or radiation therapy planning. At the molecular level, models can be used for drug design or functional analysis. In either case, the application program needs access both to the certainty with which the structure is known and to the important correlations and covariances between individual structural parameters. Before a radiation oncologist can irradiate a sensitive region, a model of that region--along with an understanding of its accuracy--is critical for effective delivery of radiation. Similarly, a static molecular structure is less useful for drug design than a structure in which the regions of mobility or structural uncertainty are clearly defined.

One long-term goal of the Helix Group is to develop a methodology for both representing and manipulating biological structural information, especially with respect to the uncertainty within individual structures and the variation across related biological structures. We have targeted biological macromolecules as the domain in which to investigate these issues. Protein structure and nucleic acid structure are critical in many areas of biomedicine--ranging from understanding cellular metabolism to designing new pharmaceuticals. The anticipated explosion in the availability of structures produced by both experimental and predictive technologies makes it critically important to represent and manipulate these structures in a uniform manner. As these structures become available, we cannot predict the array of uses to which they will be put. The technologies used to define structure are not perfect, and will produce structures in which the reliability of subsegments varies greatly. It is therefore important to have technologies for representing and manipulating these structures at the proper level of precision.

See the Helix Group page for a summary of the group. The current Helix Group is a descendant of the Helix project directed by Bruce Buchanan at Stanford from 1984 to 1988.

2. Project Personnel

3. Shared Software and Data from Project

We published a paper in J. Mol. Graphics describing the program proteanD, which is designed to display various representations of structural uncertainty for macromolecules, including overlapping stick drawings, ellipsoids of uncertainty, and secondary structure accessible volumes. The program runs on Silicon Graphics (SGI) machines. Sample input files and binary executable code are available at ftp://ftp-smi.stanford.edu/pub/altman/tar.proteand.
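
To make the idea of positional uncertainty concrete, the following is a minimal sketch, not code from the display program. It assumes that an atom's uncertainty can be summarized by the mean and 3x3 covariance of its position across an ensemble of superimposed models; the function name and the toy ensemble are hypothetical. An ellipsoid-style display could then be drawn from such a covariance matrix.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def mean_and_covariance(samples: Sequence[Point]) -> Tuple[Point, List[List[float]]]:
    """Return the mean position and the 3x3 sample covariance matrix
    of one atom's coordinates across an ensemble of models.

    Assumption: the models have already been superimposed in a common frame.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed for a covariance")

    # Mean of each coordinate (x, y, z).
    mean = tuple(sum(p[i] for p in samples) / n for i in range(3))

    # Unbiased sample covariance: divide by n - 1.
    cov = [
        [
            sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in samples) / (n - 1)
            for j in range(3)
        ]
        for i in range(3)
    ]
    return mean, cov


if __name__ == "__main__":
    # Hypothetical positions of one atom in four models (angstroms).
    ensemble = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
    mean, cov = mean_and_covariance(ensemble)
    print("mean:", mean)
    for row in cov:
        print(row)
```

The diagonal entries give the variance along each axis, and the off-diagonal entries capture correlations between axes. The same construction extends to covariances between different atoms.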
 

4. Overview of Studies and Results

Previously, we have designed and built a hierarchical, probabilistic system for representing structure. We have shown its performance in the determination of protein structure from NMR data, and in the prediction of RNA structure from statistical, correlative data. The work has been based on the hypothesis that structural representations that are stated as probability distributions can be used for a wide variety of applications--independent of the technology used to determine the structure. Currently, we are further testing this hypothesis with a combination of (1) basic investigation into the mathematical and implementational issues of computing with biological structure, and (2) collaborative applications to test the real-world utility of new representations and algorithms on problems of macromolecular structure. Specific areas of current progress include:
  1. We are developing a method for computing structures using van der Waals constraints implicitly. Packing constraints are the most important and most difficult to satisfy in molecular structural computations. We are developing optimization methods that perform their search only in van der Waals-legal regions of space, so that all resulting structures satisfy the specified packing distances. Contact gaw@smi.stanford.edu for details on this project.

  2. The hierarchical organization of molecular computations. We have shown that significant computational savings can be realized by decomposing a molecular computation into a hierarchy based both on the natural structural divisions and on the data. This is reported in J. Comp. Bio. by Chen et al., 1998 (see references below).

  3. Using imperfect secondary structure predictions to improve the quality of computed structures. We have developed a method for processing imperfect predictions so that erroneous predictions do not significantly detract from the quality of computed structures. This is reported in Bioinformatics, 1999, by Cheng, Singh and Altman (see references below).

  4. Using surface information to improve the quality of computed structures. We have developed a new metric for the surface-ness of an atom within a molecule, and have shown that this metric can be used to constrain structural computations and improve upon a baseline set of data. This is reported in ISMB, 1998, by Schmidt et al. (see references below).

  5. Using volume information to improve the quality of computed structures. We have shown that information about the size of a molecule can be used to constrain a structural computation and improve the result of the computation compared to a baseline data set. This is reported in CABIOS, 1996, by Chen et al. (see references below).

  6. Using probabilistic algorithms to model the structure of RNA. In collaboration with Tod Klingler and Doug Brutlag, we published a three-dimensional structure of tRNA that is calculated solely from the predicted secondary structure and base-pair covariations that indicate physical proximity. This work demonstrated the applicability of our algorithms to RNA (their applicability to proteins having already been demonstrated). The CAMIS resource has provided significant support, both in hardware for computation and in meeting the unique graphic display needs of our probabilistic structure representations.

  7. Using probabilistic representations for analyzing biological macromolecules. We have reported an automatic procedure for finding the structurally invariant core of a set of related proteins. We have shown that the representations used by our structure determination algorithm are also useful in problems of structure analysis. See the project on protein core structures.

  8. Application to ribosomal RNA. In collaboration with Dr. Harry Noller of UCSC, we have begun to use our algorithms for uncertain structure determination to help determine the structure of the 16S ribosomal subunit. See the project on ribosomal structure.

  9. Algorithmic convergence progress. We have published work on the probabilistic algorithm which shows that it is relatively robust to local minima and is able to converge to proper solutions given a wide range of input data. This work is important because it demonstrates the general applicability of the algorithm to a variety of constraint satisfaction problems, in addition to the determination of molecular structure.

  10. Parallel implementation of algorithms. We have implemented the algorithm on parallel processors in order to provide prototype high-speed servers capable of quickly performing the experimental computations necessary in our collaborative work.

  11. Application of our algorithms to other biomedical problems. In collaboration with Dr. James Brinkley of the University of Washington, we have published a report on the use of our algorithm for probabilistic constraint satisfaction in the analysis of CT images (edge detection). We have also applied the method to radiosurgery dose planning. These are described elsewhere.

  12. Development of graphical rendering tools for our representations. We have developed a program, proteanD, for the display of structural uncertainty using a variety of rendering techniques. The paper describing proteanD has been published in the Journal of Molecular Graphics. You can get proteanD binaries (running on SGI platforms) by anonymous ftp at ftp://ftp-smi.stanford.edu/pub/altman/tar.proteand.
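
The first item in the list above refers to van der Waals-legal regions of space. As a rough illustration of what a packing legality test might look like, here is a minimal sketch. It is not the project's optimization method, which restricts the search itself. It assumes a simplified hard-sphere model in which every pair of atoms must be at least the sum of their radii apart; the function names, the `tolerance` parameter, and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def packing_violations(
    coords: Sequence[Point],
    radii: Sequence[float],
    tolerance: float = 0.0,
) -> List[Tuple[int, int, float, float]]:
    """List every atom pair that sits closer than the sum of its radii.

    Each entry is (i, j, actual_distance, minimum_allowed_distance).
    Simplification: all pairs are checked; a real molecular model would
    exclude covalently bonded neighbors, which legitimately sit closer.
    """
    if len(coords) != len(radii):
        raise ValueError("coords and radii must have the same length")

    violations = []
    for i, j in combinations(range(len(coords)), 2):
        distance = math.dist(coords[i], coords[j])
        minimum = radii[i] + radii[j] - tolerance
        if distance < minimum:
            violations.append((i, j, distance, minimum))
    return violations


def is_packing_legal(
    coords: Sequence[Point],
    radii: Sequence[float],
    tolerance: float = 0.0,
) -> bool:
    """True when no pair of atoms overlaps under the hard-sphere model."""
    return not packing_violations(coords, radii, tolerance)


if __name__ == "__main__":
    # Three hypothetical atoms; the first and third overlap.
    coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    radii = [1.5, 1.5, 1.0]
    print(is_packing_legal(coords, radii))
    print(packing_violations(coords, radii))
```

A test like this only rejects illegal structures after the fact. A search confined to legal regions, as the item describes, would avoid generating them in the first place.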

5. References

  1. PubMed references to papers from this project.
Last update June 14, 2000.
Crosby@smi.stanford.edu