Accession Number:

ADA566334

Title:

Large-scale Heterogeneous Network Data Analysis

Descriptive Note:

Final rept. 1 Jul 2011-30 Jun 2012

Corporate Author:

NATIONAL TAIWAN UNIV TAIPEI

Personal Author(s):

Report Date:

2012-07-31

Pagination or Media Count:

48

Abstract:

A large-scale network is a powerful data structure for depicting relationships between entities. An unsupervised tensor-based mechanism that considers higher-order relational information is proposed to model the complex semantics of nodes. Signature profiles are derived as vector-based representations to support further mining algorithms. Based on this model, solutions to three critical issues in heterogeneous networks are presented. First, different aspects of central individuals are identified through three proposed measures: contribution-based, diversity-based, and similarity-based centrality. Second, a role-based clustering method is proposed to identify nodes playing similar roles in the network. Third, to facilitate further exploration and visualization of complex network data, an egocentric information abstraction is devised, and three abstraction criteria are proposed to distill representative and significant information with respect to any given node. The evaluations are conducted on a real-world movie dataset and an artificial crime dataset. The proposed centralities and role-based clustering yield meaningful results on these datasets. The effectiveness of the egocentric abstraction is shown by more accurate, efficient, and confident crime detection by human subjects.
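
The abstract does not define its centrality measures, so the following is only an illustrative sketch of what a diversity-style score over a heterogeneous network could look like. It assumes, purely for illustration, that a node's profile is the count of each typed relation it participates in, and that diversity is the Shannon entropy of that distribution. The edge list, the function names `relation_profiles` and `diversity_score`, and the entropy formulation are all hypothetical and are not taken from the report, which describes a tensor-based model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy heterogeneous network: (source, relation_type, target) triples.
EDGES = [
    ("alice", "acted_in", "movie_a"),
    ("alice", "directed", "movie_b"),
    ("alice", "produced", "movie_c"),
    ("bob", "acted_in", "movie_a"),
    ("bob", "acted_in", "movie_b"),
]


def relation_profiles(edges):
    """Count, for every node, how often it appears in each typed relation.

    Incoming and outgoing relations are kept apart so that, for example,
    "acted_in" seen from the actor differs from "acted_in" seen from the movie.
    """
    profiles = defaultdict(Counter)
    for source, relation, target in edges:
        profiles[source][("out", relation)] += 1
        profiles[target][("in", relation)] += 1
    return profiles


def diversity_score(profile):
    """Shannon entropy (in bits) of a node's relation-type distribution.

    This is an assumed stand-in for a diversity-based measure, not the
    report's definition.
    """
    total = sum(profile.values())
    if total == 0:
        return 0.0
    return sum(
        (count / total) * math.log2(total / count)
        for count in profile.values()
    )


profiles = relation_profiles(EDGES)
scores = {node: diversity_score(profile) for node, profile in profiles.items()}
```

Under these assumptions, a node involved in several different relation types (here `alice`) scores higher than one that repeats a single relation type (here `bob`), which is the intuition a diversity-oriented centrality is meant to capture.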

Subject Categories:

  • Information Science

Distribution Statement:

APPROVED FOR PUBLIC RELEASE