Accession Number : AD1048823

Title :   Machine Learning Algorithms for Statistical Patterns in Large Data Sets

Descriptive Note : Technical Report, 01 Jan 2012, 30 Sep 2017

Corporate Author : Carnegie Mellon University Pittsburgh United States

Personal Author(s) : Dubrawski, Arthur

Report Date : 01 Feb 2018

Pagination or Media Count : 21

Abstract : Modern data analysis operations are continuously flooded with streams of noisy, incomplete, and sometimes intentionally misleading data. Traditional analysis methods cannot scale to handle these issues. We developed a battery of new, efficient, parallel, statistical machine learning algorithms to push the boundaries of machine learning capabilities under these circumstances. We have made many of our mature algorithms available as open source tools and published them in peer-reviewed academic journals and conferences. The algorithms cover a wide range of learning applications, but all rest on strong statistical foundations, and in that sense they all speak the same language. We have provided theoretical guarantees and proofs where possible and demonstrated the value of our algorithms on many interesting problems.

Descriptors :   DATA ANALYSIS , machine learning , algorithms , data set , patterns

Subject Categories : Information Science

Distribution Statement : APPROVED FOR PUBLIC RELEASE