Compact Information Representations
Technical Report, 15 Mar 2013, 14 Mar 2016
Cornell University Ithaca United States
Numerous modern applications in network traffic analysis, information retrieval, and databases face datasets that are very large, inherently high-dimensional, or naturally streaming. This proposal aims to develop mathematically rigorous, general-purpose statistical methods based on stable random projections to achieve compact information representations for solving very large-scale engineering problems in data stream computation, real-time network monitoring and anomaly detection (e.g., DDoS attacks), machine learning, databases, and search. Fundamentally, compact data representations are highly beneficial because they can substantially reduce memory or disk storage, facilitate efficient data transmission over networks, support time-critical missions, improve the experience in user-facing applications, and reduce energy consumption. The proposed research topics largely fall into three categories: (i) data stream algorithms for network anomaly detection; (ii) probabilistic quantization for compact information storage, indexing, and search; and (iii) effective sparse recovery from quantized stable random projections. The proposed research is highly interdisciplinary, spanning statistics, theoretical and applied computer science, and applied mathematics. Within the scope of this proposal, the initial focus is on fundamental, theoretical research, which lies within the mission of AFOSR.
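The abstract names stable random projections without describing a specific construction. As a minimal sketch of the general idea, the Python below assumes the alpha = 1 (Cauchy) case and a plain sample-median estimator; neither choice, nor any of the function names, comes from the report. Each high-dimensional vector is reduced to k numbers, and the l1 distance between two vectors is then estimated from their sketches alone.

```python
import math
import random
import statistics


def cauchy_projection_matrix(dim, k, seed=0):
    """Return a dim x k matrix of i.i.d. standard Cauchy samples.

    Hypothetical helper: a standard Cauchy variate is drawn by
    inverse-CDF sampling, tan(pi * (U - 0.5)) with U uniform on [0, 1).
    """
    rng = random.Random(seed)
    return [
        [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(k)]
        for _ in range(dim)
    ]


def project(vec, matrix):
    """Compress vec (length dim) into a sketch of length k: vec @ matrix."""
    k = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(k)]


def estimate_l1_distance(sketch_a, sketch_b):
    """Estimate the l1 distance between the original vectors.

    Each coordinate of sketch_a - sketch_b is Cauchy with scale equal to
    the l1 distance d, and the median of |Cauchy(0, d)| is d, so the
    sample median of the absolute differences estimates d.
    """
    return statistics.median(abs(a - b) for a, b in zip(sketch_a, sketch_b))
```

Because the projection is linear, sketches can be updated incrementally as stream elements arrive, which is what makes this kind of construction attractive for streaming data. The sample median is used here only for simplicity; other estimators are possible.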