Accession Number:



Composable Robust Structured Data Inference

Descriptive Note:

[Technical Report, Final Report]

Corporate Author:

Cornell University

Personal Author(s):

Report Date:


Pagination or Media Count:



Messy data (heterogeneous values, missing entries, and large errors) present a major obstacle to automated, data-driven discovery of models. Data cleaning is the first step in any data processing pipeline and has serious consequences for the results of any subsequent analysis, yet this step is generally performed using ad hoc methods. This effort seeks to clean data sets and to build a structured data interface that reduces noise in data sets, delivers clean data sets, and leverages model selection and automated techniques.

Subject Categories:

  • Information Science

Distribution Statement:

[A, Approved For Public Release]