Saturday, March 10, 2007
Unsworth, "New Methods for Humanities Research"
Unsworth, John. "New Methods for Humanities Research." The 2005 Lyman Award Lecture. National Humanities Center. Research Triangle Park, NC. 11 Nov. 2005. <http://www3.isrl.uiuc.edu/~unsworth/lyman.htm>.
John Unsworth's 2005 Lyman Award address at the National Humanities Center lingers as a significant moment in efforts to expand research in the humanities to include text-mining, data-mining, visualization, modeling, and pattern recognition. Unsworth establishes an analogy between research methods in the sciences, which are commonly classified as basic and applied, and research methods in the humanities, which are better described as "scholarship" and "criticism." He describes the gold standard for humanities research as activity reducible to modes of "reading, writing, reflection, and rustication" (para. 9). Why new methods? Unsworth asks.
After establishing a correspondence between research in the sciences and research in the humanities, Unsworth argues that both applied and basic research are manifest in the humanities. Particularly where humanities scholars give readings (i.e., humanities as a hermeneutic enterprise), knowledge is subject to revision from any number of forces: shifting theoretical stances, new or unraveled evidence, and so on. The same is true for the sciences: "all you can do is offer a hypothesis that withstands being disproven, for some period of time, until contradictory evidence or a better account of the evidence comes along" (para. 14). A proof/disproof opposition, however, runs the risk of favoring rationalism and positivism rather than speculation. This isn't Unsworth's problem, of course, but it continues to be important to emphasize hybrids and complexity rather than singular models that win out because of a selective treatment of evidence.
The new methods Unsworth proposes are best demonstrated by the NORA Project. What do these methods do? "Data-mining delivers a new kind of evidence into the scene of reading, writing, and reflection, and although it is not easy to figure out sensible ways of applying this new research method (new, at least, to the humanities), doing so allows us to check our sense of the gestalt against the myriad details of the text, and sometimes in that process we will find our assumptions checked and altered, almost in the way that evidence sometimes alters assumptions in science" (para. 29). So we have new evidence and (potentially) unprecedented forms of knowledge made possible by differential treatments of texts (and related metadata). We have an electrate complement, an expanded, though not inherently contentious, pluriverse of evidence: patterns, clusters, maps, concentrations, and networks of association. Importantly, Unsworth, while advocating for new methods, returns to pragmatic questions: how can these processes contribute to the things literary scholars already do? Take the tracing of terms, for example. Interested in how particular words and phrases rise, fall, transform, evolve? Term-tracing is a fairly common activity in studies of texts; new, computational methods make a tremendous contribution in this regard.
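As a rough illustration of what computational term-tracing might look like, here is a minimal Python sketch. It is not drawn from Unsworth's lecture or from the NORA software; the function names (`tokenize`, `trace_term`) and the tiny `corpus` keyed by year are hypothetical, invented only to show the idea of following one term's relative frequency across dated texts.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase the text and keep alphabetic words, allowing internal
    # hyphens so that "data-mining" counts as a single token.
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def trace_term(texts_by_year, term):
    """Return {year: occurrences of `term` per 1,000 tokens}."""
    trace = {}
    for year, text in sorted(texts_by_year.items()):
        tokens = tokenize(text)
        count = Counter(tokens)[term.lower()]
        # Normalize by text length so long and short texts are comparable.
        trace[year] = 1000 * count / len(tokens) if tokens else 0.0
    return trace

# Hypothetical toy corpus: one short "text" per year (an assumption,
# standing in for a real dated collection).
corpus = {
    1985: "method method theory reading",
    1995: "theory theory reading writing",
    2005: "method data-mining method method",
}

print(trace_term(corpus, "method"))
```

With a real collection the same shape of result (a frequency per period) is what one would chart to see a term rise, fall, or disappear; anything beyond raw counts, such as tracking shifts in meaning, would need more than this sketch offers.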
Unsworth draws an important distinction between search-and-retrieval processes (i.e., building a better search engine) and new methods which "produce new knowledge by exposing similarities or differences, clustering or dispersal, co-occurrence and trends" (para. 17). This doesn't mean that search-and-retrieval is unimportant, but it does effectively suggest that there's more to the new methods than devising a search scheme more likely to summon efficient returns. I need to return to this point when I work through the differences between text-mining for tagging (del.icio.us) and text-mining for tagclouds (Mehta's tagline slider). The first is, in large part, motivated by search and association; the second is motivated by visual epistemology and layered listing (a distinctly different arrangement and presentation). Both have bearing on circulation, on keeping step with expanding circulatory means, but we must avoid reducing text-mining to improved search-and-retrieval.
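The distinction can be made concrete with a small Python sketch, offered only as an illustration under stated assumptions: the `search` and `cooccurrence` functions and the three-item `docs` list are hypothetical and do not represent NORA's actual methods. The first function answers a query the reader already has in mind; the second starts with no query and surfaces which words tend to appear together.

```python
from collections import Counter
from itertools import combinations

def search(docs, query):
    """Search-and-retrieval: return the documents containing the query word."""
    return [d for d in docs if query.lower() in d.lower().split()]

def cooccurrence(docs):
    """Mining: count how often each pair of distinct words shares a document."""
    pairs = Counter()
    for d in docs:
        # Sort the unique words so each pair is always stored in one order.
        words = sorted(set(d.lower().split()))
        pairs.update(combinations(words, 2))
    return pairs

# Hypothetical toy collection (an assumption for illustration only).
docs = [
    "heart tears home",
    "heart tears mother",
    "ship whale sea",
]

print(search(docs, "whale"))              # documents matching a known query
print(cooccurrence(docs).most_common(1))  # the most frequent word pairing
```

Search returns only what was asked for. The co-occurrence count reports a pattern (here, "heart" paired with "tears") that no one queried, which is closer to the "new kind of evidence" Unsworth describes.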
Phrases: doing research (para. 2), theory/method (5), systematic (methodical) thinking (6), recurring conventional units (11), text-mining tool development (18), lack of explicit awareness (23), cyberinfrastructure (34).
"The other word in my title, "method," raises some issues of its own. A
method is a procedure, or sometimes more specifically (as in French) a
"system of classification, [a] disposition of materials according to a plan or
design" (OED). In the 1980s, in graduate school (and in job interviews), one
sometimes faced the daunting question "what's your methodology?" Usually, what
that meant was "what's your theoretical bent: what theoretical flag do you fly?"
There was an older sense of methodology still in force, though: dissertations
still sometimes had chapters on methodology, and graduate programs in English
were wrestling with whether or not to discard requirements for coursework in
research methods (which essentially meant bibliography, sometimes with library
research methods included). Most departments eventually did do away with this
requirement, and by the 1990s, "research" seemed to happen mostly without
attention to method." (5)
"The goal of the nora project is to produce text-mining software for
discovering, visualizing, and exploring significant patterns across large
collections of full-text humanities resources from existing digital
libraries and scholarly projects." (16)
"In search-and-retrieval, we bring specific queries to collections of text and
get back (more or less useful) answers to those queries; by contrast, the goal
of data-mining (including text-mining) is to produce new knowledge by exposing
similarities or differences, clustering or dispersal, co-occurrence and trends"
(17).
"There are many more challenges than I'll mention tonight, but perhaps
the greatest challenge, at the outset and still today, has been in figuring out
exactly what data-mining really has to offer literary research, at a
level more specific than the cleverly non-specific generalities I offered in my
opening description of nora ("software for discovering, visualizing, and
exploring significant patterns across large collections of full-text humanities
resources"). What patterns would be of interest to literary scholars?" (21).
"[O]ne could (in principle) do this kind of modeling and even the quantitative
analysis without computers: you could model the crystal palace with
toothpicks and plastic wrap; you could do the painstaking word-counting and
frequency comparison by hand. But you wouldn't, because there are other
interesting things you could do in far less time" (30).
"[W]e hope that when it is complete the report will help to foster the
development of the tools and the institutions that we require in order to
reintegrate the human record in digital form, and make it not only
practically available but also intellectually accessible to all those who
might be interested in it" (34).
Related sources:
Feyerabend, Paul. Against Method. 1975. 3rd ed. New York: Verso, 1993.
Posted by Derek Mueller at March 10, 2007 2:16 PM to Dissertation