I am interested in the impact of digital information on the archival profession. In particular, I am curious about what archivists need to know to thrive in the digital era, to preserve records in digital formats. I believe that archivists need technical skills to be able to work with digital materials at a variety of levels. At the same time, they need a strong grasp of archival principles to be able to translate and adapt those concepts into new ways of working.
Radcliffe Workshop on Technology & Archival Processing
Recently I had the privilege of participating in a workshop
on technology and archival processing sponsored by the Radcliffe Institute for
Advanced Study and the Association of Research Libraries. The
workshop sought to explore ways to apply technology to tangible collections,
with only secondary consideration of born-digital materials. In particular, how can technology facilitate
arranging and describing archival collections?
A second, inherent question focused on how finding aids might change or
be improved through technology.
Aaron Trehub of Auburn and I were asked to offer closing
comments. I offer my observations here,
after taking time to reflect on the many excellent insights and ideas.
§ § §
Throughout the workshop, I wondered (as I often do) if
archivists were confronting evolution or revolution. Are we seeing the transformation of the
profession? Or, have things changed so
much that we’re really witnessing the demise of archives (and archivists) as we know them?
I believe that archivists must persevere in their noble
profession because they serve a distinct role in society. They are focused on the long-tail value of
records, the usefulness of records long after they’ve been created. William Maher observed that archivists “must
stand fast and hold true to [their] role as custodians and guardians of the
authentic record of the past,” “to provide an authentic, comprehensive record that
ensures accountability for our institutions and preservation of cultural heritage
for our publics.”
For years, I've said that what archivists do (at the abstract level) remains the same,
but how we do it must change.
Among other things, archivists
- Select and acquire records that capture a
complete (representative, if not exhaustive), accurate, and authentic story of
the past. If not, cultural memory will be
lost, the future will not have the records it needs to understand its past,
and individuals and organizations will not have the evidence necessary to
protect their rights and interests.
- Organize and describe those records to
provide physical and intellectual control.
Archivists must help people find their way in what is, for many, a
strange land of primary sources, where meaning often lies in the contextual
relationship between records, relationships that reflect their provenance and
original order – rather than in the document itself. Archives aren’t your grandfather’s Dewey
Decimal library, and can be alien and confusing for many.
- Provide access and reference
services. Where some see archivists as
gatekeepers and barriers to the records, the reality is that the archivists are
advocates for researchers. Not only do archivists
help researchers find relevant records, they often help researchers hone their
Getting into the weeds a bit, how we do it must change. In the past, when we transferred records
from the file cabinets where they were stored during use, across the archival
threshold, and into our custody, we carefully placed them in boxes to
preserve their original order. That doesn’t
work for electronic records. The records
are not on paper, but in databases, and may need to be extracted from fielded
data and templates to a document-like report.
Not to mention, placing
electronic records in a box doesn’t make sense.
But that change is trivial, as we have a number of readily available
possibilities. Files can be placed in zip
or tar files, then transferred via a network connection, thumb drive, tape, or
disk. The workshop suggested many more
interesting possibilities, changes in how we do our jobs that envision new,
more effective ways to work.
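The packaging step mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any repository's actual transfer workflow: the function names package_records and list_package are hypothetical, and the sketch assumes the records are ordinary files in a folder tree. Paths are stored relative to the top folder so that the hierarchy, a rough stand-in for original order, survives the transfer.

```python
import tarfile
import tempfile
from pathlib import Path


def package_records(source_dir: Path, archive_path: Path) -> list[str]:
    """Bundle a folder tree into a gzip-compressed tar file.

    Hypothetical helper: paths inside the archive are stored relative to
    source_dir, so the folder hierarchy is preserved for the recipient.
    Returns the sorted list of file paths written to the archive.
    """
    members = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in source_dir.rglob("*"):
            if path.is_file():
                arcname = path.relative_to(source_dir).as_posix()
                tar.add(path, arcname=arcname)
                members.append(arcname)
    return sorted(members)


def list_package(archive_path: Path) -> list[str]:
    """Read the archive back and list its files, to confirm what was sent."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


if __name__ == "__main__":
    # Build a tiny sample folder, package it, and list the result.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "series"
        (root / "correspondence").mkdir(parents=True)
        (root / "correspondence" / "letter1.txt").write_text("Dear colleague")
        archive = Path(tmp) / "transfer.tar.gz"  # kept outside the source folder
        package_records(root, archive)
        print(list_package(archive))
```

The archive file is written outside the folder being packaged so that it does not end up inside itself. A real transfer would likely also record checksums, which this sketch leaves out.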
Moving finding aids from typewritten paper to DACS/EAD files
on the web was just a start. To a large
extent, digital finding aids are protodigital forms, a replication of the
existing structure and functionality without taking advantage of the virtual
medium. Not that I’m discounting DACS or EAD. We must continue to describe our collections,
but technology offers us much more than markup.
We need to take advantage of technology to go well beyond the
protodigital and find new ways to connect researchers with relevant records
they might formerly have overlooked.
Many would immediately think of the scale of information as the
most significant change facing archivists.
While the size of backlogs and digital information is a problem, it’s
hardly new. Archivists have struggled
with information explosions for years.
After World War I, Jenkinson specifically addressed the issue in his Manual of Archive Administration: Including
the Problem of War Archives and Archive Making. The volume of records that resulted from the
growth of the federal government during the Depression and following World War
II drove Schellenberg and others at the National Archives to come up with new
ways to manage both active records and archives. And the phrase “information explosion” took
off in the 1960s, and was largely replaced in the 1980s by “paperless office.”
At the workshop, I heard three themes of how technology can change
how we do our job. (Other themes were
mentioned, of course. And, there are
other areas of the archival enterprise where technology will have impact, but the
workshop focused on processing and providing access to collections.)
First, researchers asserted that finding aids remain
valuable. Hierarchical description based on provenance
and original order is largely derived from European tradition. In many ways, the model is as much pragmatic
as theoretical. Archives have never had the resources for item-level description. (In the early 20th century, the Library of
Congress’ manuscripts processing manual bemoaned backlogs, even as it
prescribed item-level calendaring.) The structure remains useful as a
framework. The finding aid is an
important means to document the original order of the collection, to preserve
the contextual relationship between records.
New tools that can search repositories and assemble collections based on
geotagging, name extraction, and more, described by Dan Cohen of the Digital
Public Library of America, are invaluable. But those assemblages are artificial and do
not have the authority of the order established by the creators, an order that
reflects the primary value of the records.
Bill Landis observed that recent archival practice has trended
away from item-level description, to higher and higher levels of abstraction. I’ll argue that technology allows us to
reverse that trend. It gives us the
tools to provide much more detailed access.
In the past, we didn’t have the staff or time to provide item-level
access. Now, we have access to computing
power that can provide that access at an even more sophisticated level that
goes beyond item-level access to data mining.
Many researchers don’t have ready access to the software or know how to use
those tools. That’s a service archivists
can – and I think should – provide. Trevor
Owen noted that the 4chan records were put online as a zip file with a
collection-level description. But why
not pipe the collection through a full-text indexing tool and let people have
at it? People may find what they’re
looking for in the text, but not in the collection-level description.
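To make the idea of piping a zipped collection through a full-text index concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than a recommendation of any particular tool: build_index and search are hypothetical names, the files in the zip are assumed to be plain text, and a production service would more likely rely on an established search engine.

```python
import io
import re
import zipfile
from collections import defaultdict


def build_index(zip_bytes: bytes) -> dict[str, set[str]]:
    """Map each lowercased word to the set of files in the zip that contain it.

    Hypothetical helper: assumes every member is plain text; bytes that are
    not valid UTF-8 are replaced rather than raising an error.
    """
    index = defaultdict(set)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            for word in re.findall(r"\w+", text.lower()):
                index[word].add(name)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the files that contain every word in the query (a simple AND)."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

A researcher-facing service could build the index once when the collection is accessioned and then answer queries such as search(index, "board meeting") without anyone having to unpack the zip file by hand.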
Second, archivists need to be better at what they do. Which raises the question, what is
better? Ironically, better may be
sloppier. Lambert Schomaker, who
presented on automated recognition of handwriting, noted that Google provides
reasonable results. At one point, he observed
that archivists sought perfect results, an exact hit. In archivists’ defense, I think there’s a
profound difference between searching the web and searching records. More often than not, the web has a range of
documents that contain overlapping information, where archives hold unique
documents that may be the only authoritative, authentic source of a very
specific piece of data. You might find
someone’s birthday scattered across the web, but their birth certificate is likely
in one place. Even so, Schomaker’s point
is well-taken. It’s better to have a
mess of reasonably relevant documents than nothing at all. Google can get you in the neighborhood and
give you clues where to look.
Luis Francisco‐Revilla noted that there was no consistency
in how a group of archivists – working separately – arranged a small collection
of personal papers. In response, one participant expressed
her concern that there were no normative practices for arrangement and for much of
archival practice. (I expressed some
skepticism about the test. Original
order is a normative principle, but personal collections are notorious for being
chaotic with no meaningful order to preserve. Moreover, I argued – to tweets in
agreement – that such a small collection didn’t merit any arrangement; to the
extent arrangement facilitates rapid access, it would take very little time for
a researcher to peruse so few records.
Again, providing access without arrangement may be an example where
sloppy may be better.)
Better also means that we need to think about what the
finding aids say about the collections.
Do they answer users’ questions, help them find relevant collections
and records? One researcher wanted more
back story on how the collections were acquired, something usually missing from
finding aids. One researcher’s comment
that scope notes were of little value might have pained the archivists in
attendance (it broke my heart), but I don’t find the observation
surprising. Recently, I asked my
students to do a survey of mission statements and collecting policies on
university archives’ websites. What they
found were often little more than a few bullet points of questionable value
because they had little substance that would help users (or archivists) know
what was in or out of scope. A recurring
theme at the workshop was that finding aids needed to do more than report the
structure of the collection. I’ve always
admired Cutter’s Rules, although more
than a hundred years old, because he begins with a strategy that focuses on the
user. His last object for the catalog is
“to assist in the choice of a book as to its edition [and] as to its character.” I believe that spirit needs to be at the
heart of finding aids, to be way-finders, to help researchers make sense of the
collection. The quality of description
must be measured by the degree to which it communicates the information
researchers need, not the degree to which they comply with formal rules.
Finally, and possibly most important, are archivists so wed
to the tradition of how we do things that we can’t (or won’t) innovate? When working on a project to explore automated
workflows to process digital collections, a participant whose job was processing
collections, and who was proud of her craft, fumed at her supervisor, “You can’t automate
what I do!” He responded, “You’re exactly
right! We don’t want to automate what
you do. We need to do something different.”
That is a revolutionary statement that could portend the demise
of archivists. I am concerned that if
archivists don’t step up to the plate, if they don’t adapt and take advantage
of technology, they may become extinct and others may take our place. I’ve already seen examples of this. When heads of companies and government
agencies get questions about email, they call the head of IT, not the records
manager or archivist. I suspect most
archives are struggling with limited resources to manage an overwhelming
number of tangible records. But to
ignore these tools, to be tied to historical approaches can paint records
managers and archivists into a corner.
Investing at least some time experimenting with and touting innovative
uses of technology may be an essential part of outreach that demonstrates we
remain relevant and current.
At the closing reception, a participant questioned my
observation, asking if the archival function would persist, even if others took
our place. I don’t know that the
fundamental value of archives – the function of cultural memory that sees the
long-tail value of some records – will persist.
Technologists, like the record creators, are appropriately focused on
the job at hand, the here and now. They
aren’t focused on “paperwork” or how the records that result from the work might
be needed in ten, fifty, or a hundred years.
Archivists, I believe, should view the present from a future
perspective. What will the future need
to remember about its past (our present)?
We need to be creative, and we need to put aside practical worries long
enough to think big, think outside the proverbial box (records center or
virtual). We can’t let the desire for
the perfect finding aid be the enemy of the possible. After all, our patrons are accustomed to
Google search results.
See Corydon Ireland, “Books meet Bytes,” Harvard
Gazette (4 April 2014) for a description of the first day of the
conference. http://news.harvard.edu/gazette/story/2014/04/books-meet-bytes/. See also the Twitter feed by searching
#radtech14. Shane Landrum was actively
tweeting and captured a summary at https://github.com/cliotropic/radtech14.
“Lost in a Disneyfied World: Archivists and Society in Late-Twentieth-Century
America,” American Archivist 61 (Fall
1998), p. 261, 263.
“Janus in Cyberspace: Archives on the Threshold of the Digital Era,” American Archivist 70 (Spring/Summer
2007), p. 13-22. Available online at http://archivists.metapress.com/content/n7121165223j6t83/fulltext.pdf.
I would like to acknowledge that Catherine Stollar and Thomas Kiehne challenged
my formulation, proposing instead “What we do as archivists will change
(practice), but why we do it will not (theory).” See Richard Pearce-Moses and Susan E. Davis, New Skills for a Digital Era (Society of
American Archivists, 2008), p. 64.
Available online at http://www.archivists.org/publications/proceedings/NewSkillsForADigitalEra.pdf.
Kudos to Ken Withers of the Sedona Conference for coining the term
(Clarendon Press, 1922). Available
through Google Books.
Suzanne Kahn and Rhae Lynn Barnes, two historians actively involved in
research, discussed their perspectives on finding aids as part of the
program. Both noted that finding aids,
even if imperfect, were valuable for a variety of reasons. Other speakers on the panel, moderated by
Ellen Shea, included Trevor Owen and Maureen Callahan. Callahan’s presentation is on her blog at http://icantiemyownshoes.wordpress.com/2014/04/04/the-value-of-archival-description-considered/
J. C. Fitzpatrick. Notes on the Care, Cataloguing,
Calendaring and Arranging of Manuscripts (Library of Congress, 1913).
Available from the Hathi Trust at http://hdl.handle.net/2027/uc2.ark:/13960/t7br8zr3b.
Cohen gave a brilliant opening plenary that did a great job setting the stage
for the discussion.
In defense, Rome was not built in a day, and the archives deserves credit for
what it did, not criticism for not doing even more. I ask the question to illustrate how these
approaches must become so commonplace that they’re routine.
In the spirit of the Chatham House Rules, I omitted names of people making
comments unless they were part of the published program or unless they tweeted
their comments publicly. Anyone who
wishes to be acknowledged may contact me to have this piece edited, or they may
identify themselves in the comments.
Charles A. Cutter, Rules for a Printed
Dictionary Catalogue (Department of the Interior, Bureau of Education, 1876). Accessible through Google Books.
8 April 2014 : 1:48 p.m. EDT. Corrected Dan Cohen's name. Fred Cohen is a participant in the InterPARES Trust project. Apologies! <g>
29 October 2015 : 12:15 p.m. EDT. Grammatical edit, and I remembered who Fred Cohen is.