Shared Infrastructure: A Cooperative Preservation Network for Data

Research and cultural heritage institutions are facing increasing costs to preserve digital objects like scientific data, digital art, and other artifacts. As many institutions move data to cloud services, preservation costs and complexity are quickly becoming concerns. We are announcing a project to prototype shared infrastructure for digital preservation: the Cooperative Preservation Network.

Together with other leading institutions in decentralization and digital preservation, Code for Science & Society, the Internet Archive, and California Digital Library, we aim to demonstrate how decentralized technology can bolster existing institutional infrastructure. Built on a decentralized network using the Dat Protocol, this project aims to enable public organizations that preserve digital cultural heritage to back up and monitor digital assets.

Dat is already being used by researchers, developers, and artists. This project aims to identify institutional limitations to using decentralized technology, whether technical or social. To test our assumptions, we will prototype a network allowing each participating entity to view and download the collections of other participants. If successful, this project will demonstrate how to reduce preservation costs while increasing preservation assurance, as members of a cooperative, decentralized network mutually support each other to ensure adequate copies of data are maintained.
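To make the idea of "adequate copies" concrete, here is a minimal sketch of how a network member might check which collections are held by too few participants. It is an illustration only: the institution names, collection names, the `holdings` inventory, and the `MIN_COPIES` threshold are all hypothetical, and the code does not use the Dat Protocol or any of its libraries.

```python
from collections import Counter

# Hypothetical inventory: which collections each participating institution
# currently holds a full copy of. All names here are illustrative only.
holdings = {
    "institution_a": {"survey-2017", "field-recordings"},
    "institution_b": {"survey-2017", "digital-art"},
    "institution_c": {"survey-2017", "field-recordings"},
}

# Assumed policy threshold, not a figure taken from the project.
MIN_COPIES = 2


def under_replicated(holdings, min_copies):
    """Return {collection: copy_count} for collections held by fewer
    than min_copies participants."""
    # Count how many participants hold each collection.
    counts = Counter(name for held in holdings.values() for name in held)
    return {name: n for name, n in counts.items() if n < min_copies}


if __name__ == "__main__":
    for name, n in sorted(under_replicated(holdings, MIN_COPIES).items()):
        print(f"{name}: copies held = {n}, wanted = {MIN_COPIES}")
```

In a cooperative network, a report like this could tell members which collections most need another copy, so that storage effort goes where preservation assurance is weakest.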

Open infrastructure allows institutions to use digital preservation tooling without locking them into specific paid services or locking data into a patchwork of data silos. By working with the Dat Protocol, we will build this project to maximize flexibility and interoperability. Our goal is not to replace existing institutional infrastructure but to make it more capable by linking institutions at a foundational level. Building on value-driven open infrastructure, this project aims to identify new opportunities for collaboration between institutions and community engagement in data preservation.

Moving Forward

Despite improvements in data preservation and access, today’s digital preservation solutions rely on storing objects on centralized servers. This model is built on traditional web infrastructure, which was designed with the values of commercial organizations. It’s time for scholars to ask whether today’s data preservation technologies align with open scholarship’s values of access, preservation, privacy, and transparency.

This project will be community-driven infrastructure that values openness and bakes access into the code. Want to learn more? Representatives of this project will be at FORCE 2018, the Joint Conference on Digital Libraries, Open Repositories, the DLF Forum, and the Decentralized Web Summit.

More about CSS: Code for Science & Society is a nonprofit organization committed to building public interest technology and low-cost decentralized tools with the Dat Project to help people share and preserve versioned digital information. Read more about CSS’ Dat in the Lab project, our recent Community Call, and other activities. codeforscience.org

More about IA: The Internet Archive is a nonprofit digital library with the mission to provide “universal access to all knowledge.” It works with hundreds of national and international partners providing web, data, and preservation services and maintains an online library comprising millions of freely accessible books, films, audio, television broadcasts, software, and hundreds of billions of archived websites. archive.org

More about CDL and UC3: University of California Curation Center (UC3) at California Digital Library (CDL) provides innovative data curation and digital preservation services to the 10-campus University of California system and the wider scholarly and cultural heritage communities. Learn more about UC3’s collaboration with CSS in our previous Dat in the Lab project. https://www.cdlib.org/