Wednesday, December 12, 2007

Address Keywords

How best to arrive at keywords (before they are tags)? One humorless punchline is that I will not soon have a degree in computational linguistics. I have dealt superficially with the question this week, first by thinking about the relationship of the terms assigned by various methods--where we have keywords at all, that is. The most prominent journals in composition studies do very little with keywords, much less with tags (here I am thinking of tags as the digital iteration of keywords that includes latent, descriptive, and procedural labeling). Why is that?

The table below grew first from parallel questions about the overlaps between Mehta's chronological approach to tag clouds (with hues that explain persistence) and Marlow's process, which remains important because it can return multi-term noun phrases rather than only one-word keywords (also because Marlow's is the one we use for CCCOA). As yet, and because I am short on space, I do very little to account for TagCrowd and ManyEyes: TagCrowd because I hit the memory ceiling too quickly with the files I am working from; ManyEyes because there are copyright concerns with uploading full texts of articles that properly belong to NCTE. Anyway, I will return to ManyEyes in chapter four.

Below I have boldfaced common terms across the three keywording methods. The second and third columns apply repeatable computational methods of great relevance to the diss. Still, their results do not match perfectly. Is this a flaw? I think of it instead as a sign of life--a slight rattle in the imperfectly fitting (and therefore thought-provoking) works.
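For readers who want to see what a frequency-based "top 10" pass involves, here is a minimal Python sketch. It is not Mehta's PHP script: the `EXCLUDE` set stands in for an exclude file, the function name `top_terms` and the sample sentence are made up for illustration, and stemming is left out entirely.

```python
import re
from collections import Counter

# Hypothetical stand-in for an exclude file; a real one would be far longer.
EXCLUDE = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
           "that", "for", "with", "as", "on"}


def top_terms(text, n=10, exclude=EXCLUDE):
    """Return the n most frequent words in text as (word, count) pairs.

    Words are lowercased and anything in `exclude` is skipped.
    No stemming is applied, so "college" and "colleges" count separately.
    """
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in exclude)
    return counts.most_common(n)


if __name__ == "__main__":
    # Invented sample text, not drawn from any of the addresses.
    sample = ("Writing and the teaching of writing: students, writing, "
              "and teaching in the college.")
    for word, count in top_terms(sample, n=2):
        print(f"{word} ({count})")
```

Because nothing is stemmed here, singular and plural forms are tallied as different terms, which is one reason a stemmer changes what reaches the top of the list.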

Address/Script

CompPile: Determined upon data input (it is not clear whether these are assigned by one person or whether, if they are handled by different people, there is any shared effort at reconciling them)
Mehta's PHP Script, Top 10: Uses exclude file and PHP Stemmer
Marlow's Perl Process, Top 10: Uses Lingua::EN::Tagger; nouns and noun phrases only

1999, Villanueva
CompPile: racism, profession, Latin-Am, history, pre-conquest, Aztec
Mehta: American, colonial, color, ethnicity, Europe, group, latinos, numbers, people, racism
Marlow: color (30), racism (23), people (20), america (11), latinos (11), peru (11), ethnicity (10), france (10), gods (10), numbers (10)

2000, Gilyard
CompPile: cross-cultural, literacy, identity, critical-pedagogy, social justice, learning-theory, language, teacher-student, imagination, flight
Mehta: dance, Gilyard, identity, mean, play, social, students, tao, time, work
Marlow: tao (18), time (15), gilyard (13), king (13), students (10), brown (9), cannon (9), money (9), discourse (8), dunbar (8)

2001, Bishop
CompPile: profession, 'Chair's Address', fatigue, renewal
Mehta: composition, convention, field, poem, space, teachers, teaching, time, work, years
Marlow: convention (19), poem (16), composition (14), teaching (11), time (11), members (10), my (10), teachers (10), field (9), rhetoric (9)

2002, Lovas
CompPile: professional, faculty-status, CCCC, Conference on College Composition and Communication, professional identity, literacy autobiography, equity, assignment, curriculum, community college
Mehta: college, community, faculty, program, students, teaching, university, work, writing, years
Marlow: writing (33), college (31), students (25), colleges (24), faculty (20), community (18), work (15), teaching (14), university (14), composition (12)

2003, Logan
CompPile: practice, classroom, language-rights, African-Am, women, mission, Chair's Address
Mehta: composition, difference, English, language, learning, rights, statement, students, teaching, writing
Marlow: Students (28), composition (19), language (19), writing (18), statement (17), CCCC (16), teaching (16), teachers (11), position (10), conditions (9)

2004, Yancey
CompPile: Chair's Address, literacy, change, profession, faculty status, practice, pedagogy, history, curriculum, media, technology, circulation, production, academic-public, academic-nonacademic
Mehta: composition, literacy, public, reading, school, students, technology, text, words, writing
Marlow: students (60), composition (57), writing (55), literacy (32), text (31), school (29), circulation (25), words (25), moment (23), technology (22)
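The overlap among the three columns can also be checked mechanically. The Python sketch below uses the 1999 Villanueva row, assuming its terms divide among the columns as six CompPile keywords, ten terms from Mehta's script, and ten counted terms from Marlow's process. The function names are illustrative, and matching is exact after lowercasing, with no stemming.

```python
def normalize(term):
    """Lowercase and trim a term so 'Students' and 'students' compare equal."""
    return term.strip().lower()


def common_terms(*columns):
    """Return the sorted terms that appear in every column given."""
    sets = [{normalize(t) for t in col} for col in columns]
    return sorted(set.intersection(*sets))


# 1999, Villanueva (column boundaries are an assumption about the table).
comppile = ["racism", "profession", "Latin-Am", "history",
            "pre-conquest", "Aztec"]
mehta = ["American", "colonial", "color", "ethnicity", "Europe",
         "group", "latinos", "numbers", "people", "racism"]
marlow = ["color", "racism", "people", "america", "latinos",
          "peru", "ethnicity", "france", "gods", "numbers"]

if __name__ == "__main__":
    print("All three methods:", common_terms(comppile, mehta, marlow))
    print("Mehta and Marlow:", common_terms(mehta, marlow))
```

With exact matching, "American" and "america" count as different terms, so a comparison like this understates the agreement a stemmer or a human reader would find.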
Posted at December 12, 2007 11:20 AM to Dissertation, Methods