Accession Number : ADA553613


Title : Exploring the Power of Heterogeneous Information Sources


Descriptive Note : Doctoral thesis


Corporate Author : ILLINOIS UNIV AT URBANA-CHAMPAIGN


Personal Author(s) : Gao, Jing


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a553613.pdf


Report Date : Jan 2011


Pagination or Media Count : 195


Abstract : The big data challenge is a unique opportunity for both data mining and database research and engineering. A vast ocean of data is collected from trillions of connected devices in real time on a daily basis, and useful knowledge is usually buried in data of multiple genres, from different sources, in different formats, and with different types of representation. Many interesting patterns cannot be extracted from a single data collection; they have to be discovered through integrative analysis of all available heterogeneous data sources. Although many algorithms have been developed to analyze multiple information sources, real applications continuously pose new challenges: data can be gigantic, noisy, unreliable, dynamically evolving, highly imbalanced, and heterogeneous. Meanwhile, users provide limited feedback, have growing privacy concerns, and ask for actionable knowledge. In this thesis, we propose to explore the power of multiple heterogeneous information sources in such challenging learning scenarios. There are two interesting perspectives in learning from the correlations among multiple information sources: exploring their similarities (consensus combination) or exploring their differences (inconsistency detection).


Descriptors : *INFORMATION SYSTEMS, ALGORITHMS, DATA ACQUISITION, INFORMATION RETRIEVAL, THESES


Subject Categories : Information Science


Distribution Statement : APPROVED FOR PUBLIC RELEASE