Accession Number : ADA546617


Title :   Interactive Visualization Systems and Data Integration Methods for Supporting Discovery in Collections of Scientific Information


Descriptive Note : Doctoral thesis


Corporate Author : DREXEL UNIV PHILADELPHIA PA


Personal Author(s) : Pellegrino, Jr, Donald A


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a546617.pdf


Report Date : May 2011


Pagination or Media Count : 120


Abstract : Technological developments have been enabling additional sharing and reuse of scientific information. Current indexing methods support query-based search and filtering; however, they do not support overviews and exploration. Due to these limitations of existing indexing methods, it is challenging to discover records and connections that relate information in new and potentially insightful ways. We developed prototype systems and computational methods for integrating collections from multiple sources within a domain into a single, unified graph data structure. Graph-theoretic measures and visualizations were then applied to identify relations and records that support discovery tasks. Three collections of molecular information were studied: (1) influenza protein sequences from the National Center for Biotechnology Information, (2) Open Notebook Science notebooks and databases from Drexel University and other academic chemical research laboratories, and (3) project data from drug discovery projects at Pfizer R&D. We designed methods for data integration within these collections. We then analyzed the integrated collections to design interactive visual tools and computational methods that could systematically identify relations and records that have a high potential to lead to novel discoveries in these areas. We conducted interviews with domain experts to evaluate the effectiveness of these designs. These studies demonstrate the feasibility of using the new indexing methods to improve the discoverability of novel connections across multiple collections within a domain.
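
Illustrative sketch (not taken from the thesis): the abstract describes merging collections from several sources into one graph and then applying graph-theoretic measures to surface records and relations of interest. The Python below is a minimal illustration of that general idea only. The function names, the record/attribute node scheme, the sample data, and the choice of node degree as the measure are all assumptions made for this example; the record does not state which data model or measures the thesis actually used.

```python
from collections import defaultdict


def build_unified_graph(collections):
    """Merge records from several sources into one undirected graph.

    `collections` maps a source name to a list of (record_id, attributes)
    pairs. Each record and each attribute value becomes a node; a record is
    linked to every attribute it carries. This node scheme is hypothetical.
    """
    graph = defaultdict(set)    # node -> set of neighbouring nodes
    sources = defaultdict(set)  # node -> set of sources it was seen in
    for source, records in collections.items():
        for record_id, attributes in records:
            # Include the source in the record node so ids cannot collide.
            record_node = ("record", source, record_id)
            sources[record_node].add(source)
            for attr in attributes:
                attr_node = ("attr", attr)
                graph[record_node].add(attr_node)
                graph[attr_node].add(record_node)
                sources[attr_node].add(source)
    return graph, sources


def cross_source_attributes(graph, sources):
    """Return attribute nodes seen in more than one source.

    Results are ordered by degree (number of linked records), highest first,
    as one simple stand-in for a graph-theoretic measure.
    """
    bridges = [n for n in graph if n[0] == "attr" and len(sources[n]) > 1]
    return sorted(bridges, key=lambda n: (-len(graph[n]), n))


# Made-up sample data; the source names and attribute values are placeholders.
collections = {
    "source_a": [("a1", ["H1N1", "hemagglutinin"]),
                 ("a2", ["H3N2", "hemagglutinin"])],
    "source_b": [("b1", ["hemagglutinin", "solubility"]),
                 ("b2", ["solubility"])],
}

graph, sources = build_unified_graph(collections)
bridges = cross_source_attributes(graph, sources)
print(bridges)
```

In this toy data only the attribute shared by both sources is reported, which is the kind of cross-collection connection that query-based search over a single source would not reveal. A fuller treatment would likely use other measures (for example, betweenness) and a dedicated graph library.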


Descriptors :   *DATA MANAGEMENT , *SCIENTIFIC RESEARCH , *INFORMATION SYSTEMS , INTERACTIONS , TOOLS , MOLECULES , PROTEINS , GRAPHS , THEORY , PROTOTYPES , SEQUENCES , INDEXES , VISION , NUMERICAL METHODS AND PROCEDURES , BIOTECHNOLOGY , COLLECTION , MANUALS , INFLUENZA , CHEMICAL LABORATORIES , SCHOOLS , INTEGRATED SYSTEMS , DATA BASES , SOURCES


Subject Categories : Information Science
      Computer Programming and Software


Distribution Statement : APPROVED FOR PUBLIC RELEASE