Accession Number:



Long-Lived Digital Data Collections Enabling Research and Education in the 21st Century

Descriptive Note:

Corporate Author:


Personal Author(s):

Report Date:


Pagination or Media Count:



It is exceedingly rare that fundamentally new approaches to research and education arise. Information technology has ushered in such a fundamental change. Digital data collections are at the heart of this change. They enable analysis at unprecedented levels of accuracy and sophistication and provide novel insights through innovative information integration. Through their very size and complexity, such digital collections provide new phenomena for study. At the same time, such collections are a powerful force for inclusion, removing barriers to participation at all ages and levels of education. The long-lived digital data collections that are the subject of this report are those that meet the following definitions. The term "data" is used in this report to refer to any information that can be stored in digital form, including text, numbers, images, video or movies, audio, software, algorithms, equations, animations, models, and simulations. Such data may be generated by various means, including observation, computation, or experiment. The term "collection" is used here to refer not only to stored data but also to the infrastructure, organizations, and individuals necessary to preserve access to the data. The digital collections that are the focus of this report are limited to those that can be accessed electronically, via the Internet for example. This report adopts the definition of "long-lived" that is provided in the Open Archival Information System (OAIS) standards, namely a period of time long enough for there to be concern about the impacts of changing technology. The digital data collections that fall within these definitions span a wide spectrum of activities, from focused collections for an individual research project at one end to reference collections with global user populations and impact at the other. Along the continuum in between are intermediate-level resource collections, such as those derived from a specific facility or center.

Subject Categories:

  • Information Science

Distribution Statement: