Organizing Botany Hall

In the Fall of 2017, I had the opportunity to conduct research about digital preservation and data management during a term-long assistantship with Dr. Nora Mattern. I was able to apply some of this research directly to one of my own research projects, Botany Hall: Dioramas in Context, as Colleen O’Reilly (my project co-manager) and I were in the midst of contemplating options for the next phase of our interdisciplinary, cross-institutional venture.

As we were attempting to be responsible digital scholars, Colleen and I were concerned about (a) how we created, assembled, and shared our research data and (b) how it would survive in the long run. Perhaps we were getting a bit ahead of ourselves, but we were, and still are, very aware of impending changes in our lives as we both move forward with our dissertation work.

A. Sites for Data Creation, Assembly, and Sharing 

As I’ve written previously in the context of the Visual Media Workshop, knowledge production and dissemination, particularly in digital scholarship, necessitate the coordination of a complex communication network. In a collaborative, interdisciplinary space especially, data is gathered and research is conducted in a variety of environments, but meaningful information is ultimately presented in a shared location.

In the case of Botany Hall, the project team has consisted of two primary managers but has also incorporated undergraduate student interns, Master’s student researchers, and staff and faculty from the University of Pittsburgh and the Carnegie Museums.

From the beginning, Colleen and I have utilized both Google Drive and Box (specifically, a campus-wide implementation of Box that requires University of Pittsburgh login credentials). Both Drive and Box continue to house a variety of documents and have, at times, become unwieldy. Documentation is also embedded in blog posts on the Constellations website, emails, texts, pieces of paper, and our personal laptops.

Project documentation is an essential part of our work, and I want to share some of our learning experiences. We experienced the usual challenge of juggling our data and having duplicates in various locations. This became more problematic as time passed and the project approached its current state. 
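As an illustration only, and not a description of the workflow used on Botany Hall, duplicates of this kind can be located by comparing file contents rather than file names. The sketch below assumes that the Box and Drive folders have been synced or downloaded to a local machine; the function names `file_digest` and `find_duplicates` and the folder names in the usage comment are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large image scans do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(*folders):
    """Group files under the given folders by content.

    Returns a list of groups; each group is a list of paths whose
    contents are byte-for-byte identical (two or more files per group).
    """
    by_digest = defaultdict(list)
    for folder in folders:
        for path in sorted(Path(folder).rglob("*")):
            if path.is_file():
                by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


# Hypothetical usage, assuming two locally synced folders:
# for group in find_duplicates("box_sync", "drive_sync"):
#     print([str(path) for path in group])
```

Comparing by content means that a scan saved under two different file names still counts as a duplicate, while two different files that happen to share a name do not.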

Below, I attempt to describe the advantages and disadvantages of our various data repositories. 

pitt.box.com
advantages
– easy to use
– accessible online
– secure sharing options (file and folder permissions, etc.)

disadvantages
– does not facilitate smooth, simultaneous file-editing
– requires online access
– not ideal for uploading and storing large image files (inefficient with an ad-hoc scanning station set up in a library or archive without high-speed Internet, for example!)
– cannot share with non-Pitt collaborators

Google Drive
advantages
– easy to use
– accessible online
– facilitates simultaneous file-editing
– fairly secure sharing options
– easy to sync photo uploads from mobile devices (at the archive, etc.)
– can share with non-Pitt collaborators

disadvantages
– less secure than Box
– less functional in offline mode
– seems somehow less authoritative or legitimate than the institutionally run Box implementation?

WordPress Site
advantages
– easy to use
– accessible online
– fairly secure (password protected back-end)
– publicly visible

disadvantages
– publicly visible!
– not a good place for data storage, but a good place for presentation

Quite organically, we came to utilize Google Drive primarily during the writing and presenting phases of our research, while Box became our research data repository. We are now deciding whether all of this data is worth saving, and how it should be saved. The public-facing site is our primary concern, but it is also only the tip of our research iceberg.

B. Digital Preservation Options

In working with Dr. Mattern, I investigated potential digital preservation options for digital projects such as Botany Hall. The first option, of course, is to use Extensible Markup Language (XML), as this language is interoperable and archivable (learn more about this in Kathleen Fitzpatrick’s Planned Obsolescence: Publishing, Technology, and the Future of the Academy, published in 2011). However, the prospect of learning a markup language may be too daunting for some, in which case they (like us) may want to use a platform such as Squarespace, Drupal, or WordPress to produce a website.
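To give a rough sense of what the XML route involves, here is a minimal sketch that uses Python's standard library to write a small descriptive record as XML and read it back. It is an assumption-laden illustration, not a schema used by Botany Hall: the element names (`record`, `title`, `creator`, `date`), the function names, and the placeholder values are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_record(title, creator, date):
    """Build a small XML record; the element names are hypothetical."""
    record = ET.Element("record")
    ET.SubElement(record, "title").text = title
    ET.SubElement(record, "creator").text = creator
    ET.SubElement(record, "date").text = date
    return record


def to_xml(record):
    """Serialize the record to a plain-text XML string."""
    return ET.tostring(record, encoding="unicode")


def from_xml(text):
    """Parse an XML string back into a dictionary of field names and values."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


# Placeholder values for illustration only.
xml_text = to_xml(build_record("Example item", "Example maker", "Example date"))
fields = from_xml(xml_text)
```

Because the result is plain, labeled text, any standards-compliant XML parser can read it back without the software that produced it.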

For collaborative, cross-disciplinary projects in particular, it may be difficult to find researchers from across different departments who are fluent in the same programming or markup language. Thus, the ready-to-use platforms listed above are likely more accessible and better suited to this type of work.

Options for digital projects run through WordPress (like ours) include the following, among others:

The Digital Preservation Network (DPN Core) (dpn.org)

  • Is Pitt a member? No. There are currently 60+ members. Membership includes 5 TB of content preserved for the year.
  • Costs $20,000/year
  • Consists of five nodes:
    • Academic Preservation Trust (University of Virginia)
    • DuraCloud Vault (University of California at San Diego & DuraSpace)
    • Stanford Digital Repository
    • Texas Preservation Node
    • HathiTrust (University of Michigan)

Academic Preservation Trust (APTrust), operated by the University of Virginia (aptrust.org)

  • Is Pitt a member? No. There are currently only 16 members. Membership requires participation in consortial activities and includes participation in governance activities, two in-person consortial meetings per year with no registration fee, and 10 TB of content preserved for the year.
  • Costs $20,000/year

Archive-It (archive-it.org)

  • Requires organizational membership
  • Allows scheduling of screen captures, etc.