Accession Number:

AD1038470

Title:

Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program

Descriptive Note:

Technical Report, 01 Oct 2012, 01 Mar 2017

Corporate Author:

Continuum Analytics, Inc. Austin United States

Report Date:

2017-08-01

Pagination or Media Count:

43

Abstract:

Research performed under the XDATA program focused on computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g., tabular, relational, categorical, metadata) and unstructured (e.g., text, documents, message traffic). Several open source projects that have seen community and industry adoption grew out of this effort:
- Blaze: A collection of packages for describing, accessing, and manipulating disparate data sources and types.
- Numba: A just-in-time function compiler for Python, based on the LLVM compiler project, allowing researchers to run their Python code at near-native speeds on CPUs and GPUs.
- Dask: Parallelizes generic Python and extends NumPy, Pandas, and Scikit-learn with parallel variants.
- Bokeh: Creates interactive web applications from Python without requiring knowledge of JavaScript, CSS, or HTML.

Subject Categories:

  • Computer Programming and Software

Distribution Statement:

APPROVED FOR PUBLIC RELEASE