Accession Number:

ADA503639

Title:

Parallel Computing in Protein Structure Topology Determination

Descriptive Note:

Conference paper

Corporate Author:

NEW MEXICO STATE UNIV LAS CRUCES DEPT OF COMPUTER SCIENCE

Personal Author(s):

Report Date:

2008-12-01

Pagination or Media Count:

8

Abstract:

Knowledge of 3-dimensional virus structures is essential to understanding the mechanism of viral pathogenesis. It also provides insight into the stabilizing mechanisms of nano-sized particles, since many viruses are less than 100 nanometers in diameter. This paper reports results toward the development of a scalable parallel code for predicting the structure of virus particles ab initio using geometrical constraints. One of the critical steps in the computational derivation of a protein structure is to reduce the huge number of topologies of the secondary structures, such as helices and strands, of a protein chain. In this paper, we study a particular question that emerged from experimental data carrying the geometrical relationships of the secondary structures: whether the native topology is likely to be identified among the large set of all possible topologies. The secondary structure topology in this paper refers to the order and the directionality of the secondary structures. For a given protein sequence with N helices and M beta-strands, the number of possible secondary structure topologies is N! × 2^N × M! × 2^M, a huge number to compute even when N and M are small. We have developed a computational method and its parallel code to generate all the possible topologies and to evaluate the energy of each topology. Mutating the residue side chains of the secondary structures switches the connection order and creates a new topology. The large number of permutations is partitioned and distributed to different CPUs. We compared the speedup of two approaches to distributing the work: even distribution and dynamic distribution. Our current parallel algorithms can handle the computation when N is less than 7 on a small-scale cluster used for testing the algorithm. A larger cluster is needed to extend the scale of the computation.
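
To make the counting formula in the abstract concrete, the following minimal Python sketch computes N! × 2^N × M! × 2^M, enumerates the (order, direction) combinations for one class of secondary structure, and splits a set of work items evenly among workers. It is an illustration only, not the authors' code: the function names `topology_count`, `enumerate_topologies`, and `even_partition` are hypothetical, the encoding of direction as +1 or -1 is an assumption, and energy evaluation is omitted.

```python
from itertools import permutations, product
from math import factorial


def topology_count(n_helices, m_strands):
    """Number of topologies per the abstract: N! * 2^N * M! * 2^M."""
    return (factorial(n_helices) * 2 ** n_helices
            * factorial(m_strands) * 2 ** m_strands)


def enumerate_topologies(n_elements):
    """Yield (order, directions) pairs for one class of secondary structure.

    order is a permutation of the element indices; directions assigns
    +1 (forward) or -1 (reversed) to each position. Both encodings are
    assumptions made for this sketch.
    """
    for order in permutations(range(n_elements)):
        for directions in product((1, -1), repeat=n_elements):
            yield order, directions


def even_partition(total, n_workers):
    """Split range(total) into n_workers contiguous, near-equal chunks."""
    base, extra = divmod(total, n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers each take one additional item.
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    print(topology_count(6, 0))                    # helices only, N = 6
    print(sum(1 for _ in enumerate_topologies(3)))
    print([len(c) for c in even_partition(10, 3)])
```

An even split fixes each worker's share before the run starts. A dynamic scheme would instead hand out smaller chunks as workers become free, which can balance the load better when the cost of evaluating a topology varies.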

Subject Categories:

  • Biochemistry
  • Numerical Mathematics
  • Computer Systems

Distribution Statement:

APPROVED FOR PUBLIC RELEASE