An Image Processing Approach to Computing Distances Between RNA Secondary Structures Dot Plots (PREPRINT)
MINNESOTA UNIV MINNEAPOLIS INST FOR MATHEMATICS AND ITS APPLICATIONS
Background: Computing the distance between two RNA secondary structures can contribute to understanding the functional relationship between them. When used repeatedly, such a procedure may lead to finding a query RNA structure of interest in a database of structures. Several methods are available for computing distances between RNAs represented as strings or graphs, but none utilize the dot plot representation of RNA. Since dot plots are essentially digital images, there is a clear motivation to devise an algorithm for computing the distance between dot plots based on image processing methods.

Results: We have developed a new metric, dubbed DoPloCompare, which compares two RNA structures. The method is based on comparing the dot plot diagrams that represent the secondary structures. Motivated by image processing, the distance between two diagrams is based on a combination of histogram correlations and a geometrical distance measure. We illustrate the procedure with an application that uses this metric on RNA sequences to locate peculiar point mutations that induce significant structural alterations relative to the predicted secondary structure of the wild type. The method was tested on several RNA sequences with known secondary structures to affirm their prediction, as well as on a data set of ribosomal pieces. These pieces were computationally cut from a ribosome for which an experimentally derived secondary structure is available, and on each piece the prediction is similar to the experimental result. The new algorithm shows a benefit when compared to standard methods used for assessing the distance, or similarity, between two RNA secondary structures.

Conclusions: Inspired by image processing, we have provided a conceptually new and potentially beneficial metric for comparing two RNA secondary structures, and illustrated it with an application that uses the measurement to detect conformational rearranging point mutations in an RNA sequence.
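The abstract does not specify how the histogram correlations and the geometrical distance are computed or combined, so the following Python sketch is only an illustration of the general idea, not the DoPloCompare algorithm itself. It assumes two equal-length structures given in dot-bracket notation, treats each base pair (i, j) as one dot in a binary dot plot, uses row and column projection counts as the histogram, and uses a mean nearest-dot distance as the geometric term. All function names, the choice of projections, and the `weight` parameter are hypothetical.

```python
import math


def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a balanced dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))  # raises IndexError if unbalanced
    return pairs


def projection_histogram(pairs, n):
    """Count dots per row (first n bins) and per column (last n bins)."""
    hist = [0] * (2 * n)
    for i, j in pairs:
        hist[i] += 1
        hist[n + j] += 1
    return hist


def pearson(x, y):
    """Pearson correlation of two equal-length lists; constant inputs handled explicitly."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 1.0 if x == y else 0.0
    return cov / math.sqrt(vx * vy)


def geometric_distance(a, b, n):
    """Symmetric mean distance from each dot to the nearest dot in the other plot."""
    if not a and not b:
        return 0.0
    if not a or not b:
        return math.hypot(n, n)  # assumed penalty: the plot diagonal length

    def directed(src, dst):
        return sum(
            min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in dst) for p in src
        ) / len(src)

    return max(directed(a, b), directed(b, a))


def dot_plot_distance(s1, s2, weight=0.1):
    """Illustrative combined distance: (1 - histogram correlation) + weight * geometric term."""
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    n = len(s1)
    p1, p2 = base_pairs(s1), base_pairs(s2)
    corr = pearson(projection_histogram(p1, n), projection_histogram(p2, n))
    return (1.0 - corr) + weight * geometric_distance(p1, p2, n)
```

Under these assumptions, `dot_plot_distance("((..))", "((..))")` is zero and the value grows as the two dot plots diverge. Scanning for disruptive point mutations, as in the application described above, would additionally require a secondary structure predictor to fold each mutant, which is outside the scope of this sketch.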
- Statistics and Probability