Probabilistic Methods for Structure Computation and Display
This project, supported by NIH NLM-05652, is part of the Helix
Group at Stanford School of Medicine. Please address inquiries to
russ.altman@stanford.edu.
1. Summary of Project Goals
The importance of good structural models in biomedicine cannot be overestimated.
At the organ level, models can be used for surgical or radiation therapy
planning. At the molecular level, models can be used for drug design or
functional analysis. In either case, it is critically important that the
application program have access to both the certainty with which the structure
is known, and the important correlations and covariances between individual
structural parameters. Before a radiation oncologist can irradiate a sensitive
region, a model of that region--along with an understanding of its accuracy--is
critical for effective delivery of radiation. Similarly, a static molecular
structure is less useful for the process of drug design than a structure
in which the regions of mobility or structural uncertainty are clearly
defined.
One long-term goal of the Helix Group is to develop a methodology for
both representing and manipulating biological structural information, especially
with respect to the uncertainty within individual structures, and the variation
across related biological structures. We have targeted biological macromolecules
as the domain in which these issues will be investigated. Protein structure
and nucleic acid structure are critical in many areas within biomedicine--ranging
from understanding cellular metabolism to designing new pharmaceuticals.
The anticipated explosion in the availability of structures produced by
both experimental and predictive technologies makes the issue of representing
and manipulating these structures in a uniform manner critically important.
As these structures are made available, we cannot predict the array of
uses to which they will be put. The technologies used to define structure
are not perfect, and will produce structures in which the reliability of
subsegments varies greatly. It is therefore important to have technologies
for representing and manipulating these structures at the proper level
of precision.
For a summary of the group, see the Helix
Group page. The current Helix Group is a descendant of the Helix project
directed by Bruce Buchanan at Stanford from 1984 to 1988.
2. Project Personnel
3. Shared Software and Data from Project
We published a paper in J. Mol. Graphics describing the program
proteanD, which is designed to display various representations of structural
uncertainty for macromolecules, including overlapping stick drawings, ellipsoids
of uncertainty, and secondary structure accessible volumes. The program runs
on Silicon Graphics (SGI) machines; sample input files and binary executable
code are available at ftp://ftp-smi.stanford.edu/pub/altman/tar.proteand.
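One way to read the "ellipsoids of uncertainty" mentioned above: if an atom's position is summarized by a mean and a 3x3 covariance matrix, the ellipsoid's axes follow the eigenvectors of that matrix, and its semi-axis lengths scale with the square roots of the eigenvalues. The Python sketch below illustrates only that calculation. It is not proteanD's code, and the names jacobi_eigen, uncertainty_ellipsoid, and n_sigma, as well as the Gaussian summary itself, are assumptions made for illustration.

```python
import math


def jacobi_eigen(sym, sweeps=50, tol=1e-18):
    """Eigen-decompose a small symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, vectors); eigenvector k is column k of `vectors`.
    """
    n = len(sym)
    a = [list(row) for row in sym]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4.0, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q of A and V
                    a[k][p], a[k][q] = (c * a[k][p] - s * a[k][q],
                                        s * a[k][p] + c * a[k][q])
                    v[k][p], v[k][q] = (c * v[k][p] - s * v[k][q],
                                        s * v[k][p] + c * v[k][q])
                for k in range(n):  # rotate rows p and q of A
                    a[p][k], a[q][k] = (c * a[p][k] - s * a[q][k],
                                        s * a[p][k] + c * a[q][k])
    return [a[i][i] for i in range(n)], v


def uncertainty_ellipsoid(cov, n_sigma=1.0):
    """Return (semi_axis_lengths, axis_directions), longest axis first.

    cov     -- 3x3 positional covariance of one atom (hypothetical input)
    n_sigma -- how many standard deviations the ellipsoid surface represents
    """
    values, vectors = jacobi_eigen(cov)
    order = sorted(range(3), key=lambda k: values[k], reverse=True)
    lengths = [n_sigma * math.sqrt(max(values[k], 0.0)) for k in order]
    directions = [[vectors[i][k] for i in range(3)] for k in order]
    return lengths, directions


# Illustrative covariance: x and y errors are correlated, z is tighter.
example_cov = [[2.0, 1.0, 0.0],
               [1.0, 2.0, 0.0],
               [0.0, 0.0, 0.5]]
example_lengths, example_axes = uncertainty_ellipsoid(example_cov)
print(example_lengths)
print(example_axes[0])
```

In this example the longest axis lies along the x = y diagonal, which is how a positive correlation between two coordinates appears in such a display.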
4. Overview of Studies and Results
Previously, we have designed and built a hierarchical, probabilistic system
for representing structure. We have shown its performance in the determination
of protein structure from NMR data, and in the prediction of RNA structure
from statistical, correlative data. The work has been based on the hypothesis
that structural representations that are stated as probability distributions
can be used for a wide variety of applications--independent of the technology
used to determine the structure. Currently, we are further testing this
hypothesis with a combination of (1) basic investigation into the mathematical
and implementational issues of computing with biological structure, and
(2) collaborative applications to test the real-world utility of new representations
and algorithms on problems of macromolecular structure. Specific areas
of current progress include:
-
We are developing a method for computing structures using van der Waals
constraints implicitly. Packing constraints are the most important
and most difficult to satisfy in molecular structural computations.
We are developing optimization methods that search only in
van der Waals-legal regions of space, so that all resulting structures
satisfy the specified packing distances. Contact gaw@smi.stanford.edu
for details on this project.
-
The hierarchical organization of molecular computations. We
have shown that significant computational savings can be realized by decomposing
a molecular computation into a hierarchy based on both the natural structural
divisions and the data. This is reported in J. Comp.
Bio. by Chen et al., 1998 (see references below).
-
Using imperfect secondary structure predictions to improve the quality
of computed structures. We have developed a method for processing
imperfect predictions so that erroneous predictions do not significantly
detract from the quality of computed structures. This is reported
in Bioinformatics, 1999, by Cheng, Singh and Altman (see references
below).
-
Using surface information to improve the quality of computed structures.
We have developed a new metric for surface-ness of an atom within a molecule,
and have shown that this metric can be used to constrain structural computations
and improve upon a baseline set of data. This is reported in ISMB,
1998 by Schmidt et al. (see references below).
-
Using volume information to improve the quality of computed structures.
We have shown that information about the size of a molecule can be used to
constrain a structural computation, and improve the result of the computation
compared to a baseline data set. This is reported in CABIOS,
1996 by Chen et al. (see references below).
-
Using probabilistic algorithms to model the structure of RNA.
We published a three-dimensional structure of tRNA (in collaboration with
Tod Klingler and Doug Brutlag) that is calculated solely from the
predicted secondary structure and base-pair covariations that indicate
physical proximity. This work demonstrated the applicability of our algorithms
to RNA (their applicability to proteins having already been demonstrated).
The CAMIS resource has provided significant support, both in hardware for
computation and for the unique graphic display needs of
our probabilistic structure representations.
-
Using probabilistic representations for analyzing biological macromolecules.
We
have reported an automatic procedure for finding the structurally
invariant core of a set of related proteins. We have shown that the
representations used by our structure determination algorithm are also
useful in problems of structure analysis. See the project
on protein core structures.
-
Application to Ribosomal RNA. In collaboration with Dr. Harry Noller
of UCSC, we have begun to use our algorithms for uncertain structure determination
to help determine the structure
of the 16S ribosomal RNA. See the project
on ribosomal structure.
-
Algorithmic convergence progress.
We have published work on the
probabilistic algorithm which shows that it is relatively robust to local
minima, and is able to converge to proper solutions given a wide range
of input data. This work is important because it demonstrates the general
applicability of the algorithm to a variety of constraint satisfaction
problems, in addition to the determination of molecular structure.
-
Parallel implementation of algorithms.
We have implemented the algorithm
on parallel processors in order to provide prototype high-speed
servers capable of quickly performing the experimental
computations necessary in our collaborative work.
-
Application of our algorithms to other biomedical problems.
In collaboration
with Dr. James Brinkley, of the University of Washington, we have published
a report of the use of our algorithm for probabilistic constraint satisfaction
in the analysis of CT images (edge detection). We have also applied the
method to radiosurgery dose planning. These are described elsewhere.
-
Development of graphical rendering tools for our representations.
We have developed a program, proteanD, for the display of structural uncertainty
using a variety of rendering techniques. The manuscript describing proteanD
is in press at Journal of Molecular Graphics. You can get proteanD binaries
(running on SGI platforms) by anonymous ftp at ftp://ftp-smi.stanford.edu/pub/altman/tar.proteand.
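Several of the items above rest on treating a structure as a probability distribution that is tightened as constraints are introduced. This page does not give the update rule, so the following Python sketch substitutes a standard linearized Gaussian (Kalman-style) update for a single noisy distance measurement between two atoms. The function names, the flat coordinate layout, and the example numbers are all assumptions made for illustration, not the project's implementation.

```python
import math


def distance(x, i, j):
    """Euclidean distance between atoms i and j in the flat coordinate list x."""
    return math.sqrt(sum((x[3 * i + k] - x[3 * j + k]) ** 2 for k in range(3)))


def apply_distance_constraint(mean, cov, i, j, target, variance):
    """Condition a Gaussian structure estimate on one noisy distance.

    mean     -- flat list [x0, y0, z0, x1, y1, z1, ...] of expected coordinates
    cov      -- symmetric covariance matrix over those coordinates
    target   -- measured distance between atoms i and j
    variance -- variance of that measurement
    Returns a new (mean, cov); the inputs are left unchanged.
    """
    n = len(mean)
    d = distance(mean, i, j)
    if d == 0.0:
        raise ValueError("coincident atoms: distance gradient is undefined")
    # Linearize: gradient of the distance with respect to every coordinate.
    h = [0.0] * n
    for k in range(3):
        g = (mean[3 * i + k] - mean[3 * j + k]) / d
        h[3 * i + k] = g
        h[3 * j + k] = -g
    # ch = C h^T, s = h C h^T + measurement variance, gain = ch / s.
    ch = [sum(cov[r][c] * h[c] for c in range(n)) for r in range(n)]
    s = sum(h[r] * ch[r] for r in range(n)) + variance
    gain = [value / s for value in ch]
    new_mean = [m + g * (target - d) for m, g in zip(mean, gain)]
    # C' = C - gain * (h C); because C is symmetric, h C is ch transposed.
    new_cov = [[cov[r][c] - gain[r] * ch[c] for c in range(n)] for r in range(n)]
    return new_mean, new_cov


# Two atoms with independent unit-variance coordinates, initially 1 unit apart.
start_mean = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
start_cov = [[1.0 if r == c else 0.0 for c in range(6)] for r in range(6)]
# Hypothetical measurement: the atoms are about 3 units apart.
post_mean, post_cov = apply_distance_constraint(start_mean, start_cov, 0, 1, 3.0, 0.01)
print(distance(post_mean, 0, 1))
print(post_cov[0][0], post_cov[0][3])
```

After the update, the variance along the line joining the two atoms shrinks, and a nonzero covariance appears between their x coordinates. This is the kind of correlation between structural parameters that Section 1 argues an application needs access to. Because the distance function is nonlinear, a practical method would typically repeat such updates rather than apply each constraint once.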
5. References
-
PubMed references to papers
from this project.
Last update June 14, 2000.
Crosby@smi.stanford.edu