Saturday, March 10, 2007

Unsworth, "New Methods for Humanities Research"

Unsworth, John. "New Methods for Humanities Research." The 2005 Lyman Award Lecture. National Humanities Center. Research Triangle Park, NC. 11 Nov. 2005. < ~unsworth/lyman.htm>.

John Unsworth's 2005 Lyman Award address at the National Humanities Center lingers as a significant moment in efforts to expand research in the humanities to include text-mining, data-mining, visualization, modeling, and pattern recognition. Unsworth establishes an analogy between research methods in the sciences, which are commonly classified as basic and applied, and research methods in the humanities, which are better described as "scholarship" and "criticism." He explains the gold standard for humanities research as activity reducible to modes of "reading, writing, reflection, and rustication" (para. 9). Why, Unsworth asks, do we need new methods?

After establishing a correspondence between research in the sciences and research in the humanities, Unsworth argues that both applied and basic research are manifest in the humanities. Particularly where humanities scholars give readings (i.e., humanities as a hermeneutic enterprise), knowledge is subject to revision from any number of forces: shifting theoretical stances, new or unraveled evidence, and so on. The same is true for the sciences: "all you can do is offer a hypothesis that withstands being disproven, for some period of time, until contradictory evidence or a better account of the evidence comes along" (para. 14). A proof/disproof opposition, however, runs the risk of favoring rationalism and positivism rather than speculation. This isn't Unsworth's problem, of course, but it continues to be important to emphasize hybrids and complexity rather than singular models that win out because of any selective treatment of evidence.

The new methods Unsworth proposes are best demonstrated by the NORA Project. What do these methods do? "Data-mining delivers a new kind of evidence into the scene of reading, writing, and reflection, and although it is not easy to figure out sensible ways of applying this new research method (new, at least, to the humanities), doing so allows us to check our sense of the gestalt against the myriad details of the text, and sometimes in that process we will find our assumptions checked and altered, almost in the way that evidence sometimes alters assumptions in science" (para. 29). So we have new evidence and (potentially) unprecedented forms of knowledge made possible by differential treatments of texts (and related metadata). We have an electrate complement, an expanded, though not inherently contentious, pluriverse of evidence: patterns, clusters, maps, concentrations, and networks of association. Importantly, Unsworth, while advocating for new methods, returns to pragmatic questions: how can these processes contribute to the things literary scholars already do? Take the tracing of terms, for example. Interested in how particular words and phrases rise, fall, transform, evolve? Term-tracing is a fairly common activity in studies of texts; new, computational methods make a tremendous contribution in this regard.
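To make term-tracing concrete for myself, here is a minimal sketch of how a term's relative frequency might be followed across dated texts. Nothing here comes from Unsworth or the nora project: the function names (`tokenize`, `trace_term`), the toy corpus, and the per-1,000-tokens normalization are all my own assumptions, and real work would need a far more careful tokenizer and a real collection.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into rough word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def trace_term(term, dated_texts):
    """Return {year: occurrences of `term` per 1,000 tokens} for each dated text."""
    trace = {}
    for year, text in sorted(dated_texts.items()):
        tokens = tokenize(text)
        counts = Counter(tokens)
        # Normalize by text length so long and short texts can be compared.
        trace[year] = 1000 * counts[term.lower()] / len(tokens) if tokens else 0.0
    return trace


# Hypothetical toy corpus, invented for illustration only.
corpus = {
    1850: "The method of the critic is a method of patient reading.",
    1900: "Reading and reflection, not method, occupy the scholar.",
    1950: "Theory displaced method; method became a question of theory.",
}

trace = trace_term("method", corpus)
for year, rate in trace.items():
    print(year, round(rate, 1))
```

The normalization matters more than the counting: raw counts mostly track how long a text is, whereas a rate per 1,000 tokens lets the rise and fall of a term show up across texts of different sizes.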

Unsworth draws an important distinction between search-and-retrieval processes (i.e., building a better search engine) and new methods which "produce new knowledge by exposing similarities or differences, clustering or dispersal, co-occurrence and trends" (para. 17). This doesn't mean that search-and-retrieval is unimportant, but it does effectively suggest that there's more to the new methods than devising a search scheme more likely to summon efficient returns. I need to return to this point when I work through the differences between the text-mining to tagging ( and text-mining to tagclouds (Mehta's tagline slider). The first is, in large part, motivated by search and association; the second is motivated by visual epistemology and layered listing (a distinctly different arrangement and presentation). Both have bearing on circulation, on keeping step with expanding circulatory means, but we must avoid reducing text-mining to improved search-and-retrieval.
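The distinction between search-and-retrieval and mining can be sketched in a few lines. This is only my own illustration under stated assumptions, not nora's software or Unsworth's procedure: `search` answers a query I already have in mind, while `cooccurrences` surfaces word pairs I did not ask about. The function names and the three toy documents are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into rough word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def search(query, documents):
    """Search-and-retrieval: return the ids of documents that contain the query term."""
    q = query.lower()
    return [doc_id for doc_id, text in documents.items() if q in tokenize(text)]


def cooccurrences(documents, stopwords=frozenset()):
    """Mining: count, for every unordered word pair, how many documents contain both words."""
    pairs = Counter()
    for text in documents.values():
        vocab = sorted(set(tokenize(text)) - stopwords)
        pairs.update(combinations(vocab, 2))
    return pairs


# Hypothetical toy documents, invented for illustration only.
documents = {
    "a": "heart tears mother home",
    "b": "tears mother grave",
    "c": "ship sea storm",
}

hits = search("tears", documents)      # answers a question I brought to the texts
pairs = cooccurrences(documents)       # exposes a pattern I did not ask for
print(hits)
print(pairs.most_common(1))
```

The search returns only what the query names. The pair counts, by contrast, are produced without any query at all, which is closer to what Unsworth means by exposing co-occurrence and clustering.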

Phrases: doing research (para. 2), theory/method (5), systematic (methodical) thinking (6), recurring conventional units (11), text-mining tool development (18), lack of explicit awareness (23), cyberinfrastructure (34).

"The other word in my title, "method," raises some issues of its own. A method is a procedure, or sometimes more specifically (as in French) a "system of classification, [a] disposition of materials according to a plan or design" (OED). In the 1980s, in graduate school (and in job interviews), one sometimes faced the daunting question "what's your methodology?" Usually, what that meant was "what's your theoretical bent: what theoretical flag do you fly?" There was an older sense of methodology still in force, though: dissertations still sometimes had chapters on methodology, and graduate programs in English were wrestling with whether or not to discard requirements for coursework in research methods (which essentially meant bibliography, sometimes with library research methods included). Most departments eventually did do away with this requirement, and by the 1990s, "research" seemed to happen mostly without attention to method." (5)

"The goal of the nora project is to produce text-mining software for discovering, visualizing, and exploring significant patterns across large collections of full-text humanities resources from existing digital libraries and scholarly projects." (16)

"In search-and-retrieval, we bring specific queries to collections of text and get back (more or less useful) answers to those queries; by contrast, the goal of data-mining (including text-mining) is to produce new knowledge by exposing similarities or differences, clustering or dispersal, co-occurrence and trends" (17).

"There are many more challenges than I'll mention tonight, but perhaps the greatest challenge, at the outset and still today, has been in figuring out exactly what data-mining really has to offer literary research, at a level more specific than the cleverly non-specific generalities I offered in my opening description of nora ("software for discovering, visualizing, and exploring significant patterns across large collections of full-text humanities resources"). What patterns would be of interest to literary scholars?" (21).

"[O]ne could (in principle) do this kind of modeling and even the quantitative analysis without computers: you could model the crystal palace with toothpicks and plastic wrap; you could do the painstaking word-counting and frequency comparison by hand. But you wouldn't, because there are other interesting things you could do in far less time" (30).

"[W]e hope that when it is complete the report will help to foster the development of the tools and the institutions that we require in order to reintegrate the human record in digital form, and make it not only practically available but also intellectually accessible to all those who might be interested in it" (34).

Related sources:

Feyerabend, Paul. Against Method. 1975. 3rd Ed. New York: Verso, 1993.

Posted at March 10, 2007 2:16 PM to Dissertation