John. "New Methods for Humanities Research." The 2005 Lyman
Award Lecture. National Humanities Center. Research Triangle Park, NC.
11 Nov. 2005. <http://www3.isrl.uiuc.edu/~unsworth/lyman.htm>.
John Unsworth’s 2005 Lyman Award address at the National Humanities Center
lingers as a significant moment in efforts to expand research in the humanities
to include text-mining, data-mining, visualization, modeling, and pattern
recognition. Unsworth establishes an analogy between research methods in the
sciences, which are commonly classified as basic and applied, and research
methods in the humanities, which are better described as "scholarship" and
"criticism." He explains the gold standard for humanities research as
activity reducible to modes of "reading, writing, reflection, and rustication" (para.
9). Unsworth then asks: why new methods?
After establishing a correspondence between research in the sciences and
research in the humanities, Unsworth argues that both applied and basic research
are manifest in the humanities. Particularly where humanities scholars
give readings (i.e., humanities as a hermeneutic enterprise), knowledge is
subject to revision from any number of forces: shifting theoretical stances, new
or unraveled evidence, and so on. The same is true for the sciences: "all
you can do is offer a hypothesis that withstands being disproven, for some
period of time, until contradictory evidence or a better account of the evidence
comes along" (para. 14). A proof/disproof opposition, however, runs the
risk of favoring rationalism and positivism rather than speculation. This
isn’t Unsworth’s problem, of course, but it continues to be important to
emphasize hybrids and complexity rather than singular models that win out
because of any selective treatment of evidence.
The new methods Unsworth proposes are best demonstrated by the NORA Project.
What do these methods do? "Data-mining delivers a new kind of evidence
into the scene of reading, writing, and reflection, and although it is not easy
to figure out sensible ways of applying this new research method (new, at least,
to the humanities), doing so allows us to check our sense of the gestalt against
the myriad details of the text, and sometimes in that process we will find our
assumptions checked and altered, almost in the way that evidence sometimes
alters assumptions in science" (para. 29). So we have new evidence and
(potentially) unprecedented forms of knowledge made possible by differential
treatments of texts (and related metadata). We have an electrate
complement, an expanded, though not inherently contentious, pluriverse of
evidence: patterns, clusters, maps, concentrations, and networks of association.
Importantly, Unsworth, while advocating for new methods, returns to pragmatic
questions: how can these processes contribute to the things literary scholars
already do? Take the tracing of terms, for example. Interested in how
particular words and phrases rise, fall, transform, evolve? Term-tracing is a
fairly common activity in studies of texts; new, computational methods
make a tremendous contribution in this regard.
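
As a rough illustration of what computational term-tracing might look like (this is not Unsworth's or the nora project's code, only a minimal sketch), the Python below counts how often a single word appears per 1,000 words in texts grouped by year. The function name `trace_term` and the toy corpus are hypothetical, invented here for the example.

```python
import re
from collections import Counter


def trace_term(term, texts_by_year):
    """Return {year: occurrences of `term` per 1,000 words}.

    Assumes `term` is a single word and `texts_by_year` maps a year
    to one plain-text string. Both names are illustrative only.
    """
    term = term.lower()
    rates = {}
    for year, text in sorted(texts_by_year.items()):
        # Crude tokenizer: lowercase runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        # Normalize by text length so long and short texts are comparable.
        rates[year] = 1000 * counts[term] / len(words) if words else 0.0
    return rates


# Hypothetical toy corpus, made up for illustration; not real data.
corpus = {
    1850: "the method of reading is a method of patience",
    1900: "reading and reflection without method",
    1950: "reflection and writing and reading",
}

rates = trace_term("method", corpus)
for year, rate in rates.items():
    print(year, round(rate, 1))
```

A sketch like this only tallies one exact word form; tracing how phrases "transform" or "evolve" would need further steps (stemming, phrase matching, a much larger corpus) that are left out here.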
Unsworth draws an important distinction between search-and-retrieval
processes (i.e., building a better search engine) and new methods which "produce
new knowledge by exposing similarities or differences, clustering or dispersal,
co-occurrence and trends" (para. 17). This doesn’t mean that
search-and-retrieval is unimportant, but it does effectively suggest that
there’s more to the new methods than devising a search scheme more likely to
summon efficient returns. I need to return to this point when I work
through the differences between text-mining in relation to tagging (del.icio.us)
and text-mining in relation to tagclouds (Mehta’s tagline slider). The first is, in large part,
motivated by search and association; the second is motivated by visual
epistemology and layered listing (a distinctly different arrangement and
presentation). Both have bearing on circulation, on keeping step with
expanding circulatory means, but we must avoid reducing text-mining to improved
Phrases: doing research (para. 2), theory/method (5), systematic (methodical)
thinking (6), recurring conventional units (11), text-mining tool development
(18), lack of explicit awareness (23), cyberinfrastructure (34).
"The other word in my title, "method," raises some issues of its own. A
method is a procedure, or sometimes more specifically (as in French) a
"system of classification, [a] disposition of materials according to a plan or
design" (OED). In the 1980s, in graduate school (and in job interviews), one
sometimes faced the daunting question "what’s your methodology?" Usually, what
that meant was "what’s your theoretical bent: what theoretical flag do you fly?"
There was an older sense of methodology still in force, though: dissertations
still sometimes had chapters on methodology, and graduate programs in English
were wrestling with whether or not to discard requirements for coursework in
research methods (which essentially meant bibliography, sometimes with library
research methods included). Most departments eventually did do away with this
requirement, and by the 1990s, "research" seemed to happen mostly without
attention to method." (5)
"The goal of the nora project is to produce text-mining software for
discovering, visualizing, and exploring significant patterns across large
collections of full-text humanities resources from existing digital
libraries and scholarly projects." (16)
"In search-and-retrieval, we bring specific queries to collections of text and
get back (more or less useful) answers to those queries; by contrast, the goal
of data-mining (including text-mining) is to produce new knowledge by exposing
similarities or differences, clustering or dispersal, co-occurrence and trends" (17).
"There are many more challenges than I’ll mention tonight, but perhaps
the greatest challenge, at the outset and still today, has been in figuring out
exactly what data-mining really has to offer literary research, at a
level more specific than the cleverly non-specific generalities I offered in my
opening description of nora ("software for discovering, visualizing, and
exploring significant patterns across large collections of full-text humanities
resources"). What patterns would be of interest to literary scholars?" (21).
"[O]ne could (in principle) do this kind of modeling and even the quantitative
analysis without computers: you could model the crystal palace with
toothpicks and plastic wrap; you could do the painstaking word-counting and
frequency comparison by hand. But you wouldn’t, because there are other
interesting things you could do in far less time" (30).
"[W]e hope that when it is complete the report will help to foster the
development of the tools and the institutions that we require in order to
reintegrate the human record in digital form, and make it not only
practically available but also intellectually accessible to all those who
might be interested in it" (34).
Against Method. 1975. 3rd Ed. New York: Verso, 1993.