Research Digest: The Sustainable Digitization

Among the many things I’m behind on is responding to a comment made in response to this great little MBDA summary published on EdLab. Here’s the comment; below it (apologies for the delay and hoping to get an Africa pardon) is my response:

While the collection was at risk of physical deterioration, there seems to be no mention of whether there were additional preservation efforts beyond digitization.

I was also kind of surprised that they really didn’t discuss the imaging process at all, especially given that they were using student employees and fragile documents. I feel like they almost left out half of the process in their discussion.

Thanks for taking time to read about MBDA and to comment on the related Code4Lib project sketch! While the Code4Lib piece was intentionally written as a brief overview of our work in-progress, I appreciate your comment about document preservation. For me at least, underlying it are two essential aspects of Digital Humanities project development that demand further discussion: collaboration & the importance of recognizing digital representations as just that, representations.

While Garrick (the programmer) and I have been responsible for designing and developing MBDA’s technical infrastructure, we have worked closely with the Berry College Archivist and Library Director in considering issues of long-term preservation and access. And it’s especially interesting and important to me (a linguist who works primarily with historical manuscripts) to engage in dialogue and research about preservation with others whose expertise complements and extends my own.

MBDA has benefited from the expertise of a programmer, a historian, an archivist, a museum director and a museum curator, another linguist, and a librarian (and we’ve spent time dialoguing and working directly with the IT and networking staff from both Bloomsburg University and Berry College, since the project entails issues like hosting, data storage, and data migration). The diversity of perspectives and aims represented in our inter- and extra-disciplinary conversations and research is rich and, more importantly, is inflected in project-level decision making.

The Martha Berry Digital Archive (MBDA) is a digital archive, but it was born of an urgent need to preserve the Martha Berry Collection (the material original), for which no other backup exists. While discussion of MBDA necessarily centers on our digital methodology (where we have something new to offer), in practice, our work – because it involves interaction with and representation of a material collection – also encompasses aspects of primary source preservation (where I’m not so sure we really have anything new to offer).

Thankfully, the Martha Berry Collection adheres to existing preservation standards for archival materials (e.g. acid-free storage boxes, lignin-free flat boxes, file folders) and is carefully maintained by experienced archive and museum staff. Unfortunately, even within the folders kept in the many file boxes that make up the collection, all of which are stored in a temperature-controlled archive environment, documents continue to deteriorate.

Some of this deterioration results from the care and condition of the documents (many of which exist on tissue-paper-thin leaves) prior to their careful preservation by the archivists at the Berry College Archives: many were simply stacked in boxes; some were housed in attics or basements (in the heat and humidity and bugginess of Georgia); and some even needed to be retrieved from the public after having been (yikes!) mistakenly discarded. Metal or plastic paper clips had been fastened to some documents, resulting in lacunae and tears. And each time an original document is accessed and handled, it is placed at risk of further decline.

During the MBDA imaging process, not only have we had an opportunity to create high quality .tiff scans to serve as a collection backup (in addition to and separate from the digital archive), we’ve had an opportunity to remove clips from documents and to ensure that papers are laid flat within folders. But, because the Martha Berry Collection already complies with international preservation standards, there is little if anything further (within reason) beyond digitization (which reduces the frequency of document handling and thus reduces decline accelerated by handling) that we can do to extend the life of the physical collection.

As to the critical importance of distinguishing between a material (in this instance documentary) artefact and a digital representation of that artefact (a topic I’ve written a bit about here), MBDA is designed to provide a one-to-one correspondence between documents in the digital archive and their material exemplars precisely because we recognize that digitization achieves but a copy of, i.e. a version of (albeit an exceptionally good one) the original. And we want to ensure that even while leveraging a resource such as MBDA to access a document, we can return to the physical object itself any time doing so proves crucial to literary, linguistic, historical, cultural, or any other type of study. In other words, we haven’t fallen prey to the mimetic fallacy ;)

Even still, if those using the collection can access the digital archive in place of the original documents (and in doing so explore connections between documents, search scores of documents quickly and efficiently in ways never achievable via the material collection) and if MBDA can increase and enhance access to the collection [1] while minimizing the kinds of physical handling which result in deterioration, we maintain that we are supporting preservation.

Though the Code4Lib piece isn’t so much written to address the topic of imaging, I’d be glad to answer questions or to share more about our imaging process. We dedicated significant time to researching, discussing, and selecting a scanner, and to identifying the acceptable image type (.tiff) for document backup and preservation (a preservation method we adopted in addition to and separate from the digital archive) as well as the derivative type (.jpeg) for web-based digital archive use. Designing the imaging workflow was demanding, since it entailed modeling the transfer of image files from Berry (where the documents are held) to Bloomsburg (where the digital archive is being developed), as well as composing the step-by-step imaging guide for students, testing the guide with students, and training students. But I think we mentioned those general points already, so I’ll add one more.

We also seriously considered concerns related to the unintentional, unwitting introduction of new artefacts during the imaging process (e.g. hair, shadows, creases), and this too is a topic worth discussing further.
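For readers curious what the Berry-to-Bloomsburg transfer step mentioned above might look like in practice, here is a minimal sketch of a checksum (fixity) check on master scans. To be clear, this is not MBDA’s actual tooling: the function names, the folder layout, and the choice of SHA-256 are all assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path, pattern: str = "*.tif*") -> dict[str, str]:
    """Map each master scan's file name to its checksum (run before transfer).

    The "*.tif*" pattern is an assumption; it matches both .tif and .tiff.
    """
    return {p.name: sha256_of(p) for p in sorted(folder.glob(pattern)) if p.is_file()}


def verify_transfer(manifest: dict[str, str], received: Path) -> list[str]:
    """Return the names of files that are missing or altered after transfer."""
    problems = []
    for name, expected in manifest.items():
        target = received / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

The idea is simply to record a checksum for each .tiff at the sending end and recompute it at the receiving end; an empty list from `verify_transfer` suggests the files arrived intact, while any name it returns would be worth re-sending.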

For what it’s worth (I’m not sure whether or not this point is implied in your comment, but I find it worth noting): I think the digital editing community would benefit from much more discussion of imaging. It’s foundational stuff, and if we get it wrong or even just a little off, we may just be building a house of cards…

[1] MBDA design and implementation is consistent with international preservation and metadata standards as well as those maintained by the Digital Library of Georgia (DLG), as collaborative project planning early on identified DLG inclusion as a deliverable.