Accession Number:

AD1009313

Title:

Visualizing Mixed Variable-Type Multidimensional Data Using Tree Distances

Descriptive Note:

Technical Report

Corporate Author:

Naval Postgraduate School Monterey United States

Personal Author(s):

Report Date:

2015-09-01

Pagination or Media Count:

125.0

Abstract:

This research explores the use of the tree distances of Buttrey and Whitaker to visualize multidimensional data of mixed-variable types, having both numerical and categorical data. Tree distances measure dissimilarities among observations in a data set while exploiting desirable properties of classification and regression trees ease of handling of most variable types, indifference to variable scaling, resistance to noise and outliers, accommodations for missing values, and computational ease. In this research, we map the dissimilarities using Classical Multidimensional Scaling to a lower-dimensional Euclidean space in order to provide an analyst with a comfortable framework, which supplies visual cues in order to help find patterns and gain insights about the data. We offer in this thesis several algorithms for coloring observations in the lower-dimensional mappings in order to focus the analysts attention on the most important and interesting relationships in the data set. In addition, through our visualization, we gain a deeper understanding of the properties of tree distances and propose a modification. Our framework can be used on any military data set that involves mixed or non-mixed variables and is valuable for analysts who wish to shed light on data during the exploratory phase of analysis.

Subject Categories:

  • Statistics and Probability

Distribution Statement:

APPROVED FOR PUBLIC RELEASE