Analysis of Large Born-Digital Media Collections for Long-Term Library Deposit

Project Details

Project Lead
Heidi Dowding 
Project Manager
Heidi Dowding 
Project Members
 
Institution
Indiana University, Library Technology  
Discipline
Social Sciences, n.e.c. (910) 
Subdiscipline
45.99 Social Sciences, Other 

Abstract

This project is a collaboration between the Indiana University-Bloomington Libraries (IUBL) Library Technology group, University Archives, and IU Communications/Kelley School of Business. Using existing tools such as BitCurator and ExifTool, we will analyze the metadata associated with large born-digital media collections. This analysis will lay the groundwork for a sustainable workflow for the ingest and long-term preservation of digital objects within the IUBL Fedora/Hydra repository infrastructure.
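
As an illustration of the kind of metadata analysis described above, the following is a minimal sketch, not the project's actual workflow. It assumes the `exiftool` command-line tool is installed and on the PATH, and uses its `-json` and `-r` options to collect metadata for every file under a directory. The function names `read_metadata` and `summarize_by_type` are hypothetical, chosen only for this example.

```python
import json
import subprocess
from collections import Counter


def read_metadata(root):
    """Run ExifTool recursively over `root` and return one dict per file.

    Assumes the `exiftool` executable is available on the PATH.
    """
    # -json emits a JSON array of per-file records; -r recurses into subdirectories.
    result = subprocess.run(
        ["exiftool", "-json", "-r", root],
        capture_output=True,
        text=True,
        check=False,  # do not raise if ExifTool reports problems with some files
    )
    # An empty result (for example, no readable files) yields an empty list.
    return json.loads(result.stdout) if result.stdout.strip() else []


def summarize_by_type(records):
    """Count files per ExifTool-reported FileType; records without one count as 'unknown'."""
    return Counter(rec.get("FileType", "unknown") for rec in records)
```

Calling `summarize_by_type(read_metadata("/path/to/collection"))` (the path is a placeholder) would give a per-format file count, one simple starting point for profiling a collection before ingest.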

Intellectual Merit

While most institutions have established workflows for collecting physical and digitized heritage, workflows for collecting and preserving born-digital material are still nascent. This is largely because the specific needs of born-digital objects are not yet well understood, an area to which this project will contribute considerably. By analyzing both technical and descriptive metadata at scale, we will gain a much clearer understanding of the long-term needs of born-digital objects.

Broader Impacts

We will document the workflow we develop and share that documentation widely within the library and digital preservation communities. Results will also be shared through publications and conference presentations to help similar institutions establish workflows for preserving digital institutional heritage. Because IUBL participates in the Hydra and Fedora communities, this work will likely contribute to further development of those projects to better support born-digital objects.

Scale of Use

We plan to store 5 TB of content and run analysis tools periodically. The analysis is not expected to be time-intensive and will generally run only once or twice per week.