THE BERKELEY DATA ANALYSIS SYSTEM (BDAS): AN OPEN SOURCE PLATFORM FOR BIG DATA ANALYTICS

Active / Technical Report | Accession Number: AD1039023

Abstract:

The goal of this proposal was to deliver a modular open-source software stack that can support a new generation of large-scale analytic tools that provide answers over arbitrarily large datasets. This work was carried out by Berkeley's AMPLab, a research lab consisting of eleven faculty members and over 40 students. In addition to this grant, AMPLab, which ended in December 2016, was supported by industry affiliates and an NSF Expeditions grant. This grant was instrumental in improving our software stack, the Berkeley Data Analytic System (BDAS), so that it can serve as a platform for the broader community. In particular, this grant enabled us to implement significant portions of the code bases, integrate BDAS with commonly used tools, and make BDAS much easier to manage. In addition, it allowed us to extend the functionality of BDAS in several key areas, including streaming and query processing. Thanks to xData, BDAS has enjoyed great success in both academia and industry. Today, Apache Spark is used by thousands of companies in production and counts over 400K meetup members worldwide, while Apache Mesos and Alluxio (formerly known as Tachyon) are used by hundreds of companies around the world.

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:
Approved For Public Release
Distribution Statement:
Approved For Public Release

RECORD

Collection: TR