
Group and Method: Collaboration in the Digital Humanities

Yesterday I gave a talk called “Group and Method: Collaboration in the Digital Humanities” at Case Western Reserve University’s Freedman Center Colloquium on “Exploring Collaboration in Digital Scholarship.” Drawing on my research for “Computing and Communicating Knowledge” and for a series of blog posts, I discussed why collaboration is so common in digital humanities (although of course not all DH work is necessarily collaborative); explored the significance of collaboration in projects to build digital resources, devise new research methods, and promote participatory humanities; and explored challenges to collaboration. I also described how my experiences as a grad student in English convinced me of the value of collaboration–particularly my membership in a dissertation group (I was thrilled that my fellow diss group member Amanda French also gave a talk at the colloquium) and my work at Virginia’s Etext Center.

Here is the pdf of the slides.

Opening the Humanities Part 2: Contexts

In 1813, Thomas Jefferson declared in a letter to Isaac McPherson:

“He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature….”

“Sharing,” by Josh Harper

Unlike, say, a diamond bracelet, an idea can be freely given to others without diminishing its value for the person who “owns” it–indeed, its value only increases as it spreads. While Jefferson believed that the creators of inventions could not claim permanent, natural rights over them, he acknowledged that society could grant the right to profit from them in order to foster innovation (which, as Chris Kelty notes, Jefferson termed “the embarrassment of an exclusive patent,” suggesting his discomfort). He cautioned that intellectual property rights may actually endanger innovation by granting monopolies, and he argued that they should exist only long enough to spawn innovation, should be governed by rules limiting their application, and should be differentiated according to what benefit they convey to the public (Boyle, The Public Domain).

Jefferson’s letter raises fundamental questions: what social functions do intellectual property rights play? How can we best encourage the sharing of ideas and the progress of knowledge? In this post, the second in my series on the open humanities, I will explore legal and cultural contexts, focusing on the US.

The view that intellectual property rights are granted to encourage innovation is reflected in Article 1, Section 8 of the US Constitution: “To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” Note that the Constitution both describes the purpose of copyright (“To promote the Progress of Science and useful Arts”) and places limits upon it. Copyright aims to provide an incentive (a limited monopoly) for creators to share their work so that others may make use of it and build upon it. This incentive is balanced by limits, so that after a period of time the work falls into the public domain. The 1790 Copyright Act set the copyright term at 14 years, with the right to renew for another 14 years. Now, after the passage of the Sonny Bono Copyright Term Extension Act, the copyright term has exploded to 70 years after the death of the author. The original intention to encourage the progress of public knowledge seems to have given way to the protection of commercial interests such as Disney’s monopoly over Mickey Mouse.

Expansion of U.S. copyright law (assuming authors create their works 35 years prior to their death) (Wikipedia)


With most academic work, the ability to secure a monopoly over one’s ideas is not the primary incentive for sharing. Rather, most academics publish scholarly works in order to make a visible contribution to the scholarly conversation, build their scholarly reputation, and ultimately secure tenure or promotion. Typically researchers do not receive monetary compensation for publishing journal articles; the reward comes in disseminating their research. As Peter Suber suggests, one factor that makes open access more complicated in the humanities is that authors of monographs often expect to receive royalties. However, as Paul Courant points out, the monetary rewards tend to be small; the author of a moderately successful manuscript selling 1000 copies might expect to make less than $4000, and “for many monographs, lifetime royalties are zero or close to it.” As Courant suggests, “The big financial payoff to the author of the great majority of scholarly books is not the royalties but the visibility (and hence the salary and working conditions) of the author in the academic labor market.” If authors aim to contribute to the scholarly conversation and heighten their visibility, it makes sense for them to remove barriers to their work (although they also have an incentive to publish with the top journals or publishers).

Open access facilitates the sharing of scholarly knowledge. Peter Suber, a philosopher and respected advocate for open access, offers a simple definition: “Open-access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions.” Because such literature is digital and available online, distributing it costs almost nothing, and it can be accessed by anyone with an Internet connection. The lack of most restrictions means that the literature can be accessed and mined, which could open up new insights. But creators can put into place some restrictions over open works. For example, they can adopt a Creative Commons license and specify whether the work can be modified and/or used commercially, as well as whether the work must be attributed (CC-BY) and/or whether new versions of the work must be licensed under the same terms (ShareAlike). CC-BY upholds the scholarly practice of acknowledging sources (see Bethany Nowviskie’s “why, oh why, CC-BY?” for a smart discussion of the rationale for adopting this license). There are two principal means of disseminating open access scholarly work: green, through depositing works in disciplinary repositories (like arXiv) or institutional repositories (like DSpace@MIT), and gold, through publishing open journals and monographs. Note that many publishers allow scholars to self-archive work in repositories; visit SHERPA RoMEO to access publisher policies.

Unfortunately, the humanities seem to be behind the sciences in practicing openness. As Wikipedia explains, the open science movement aims to enlarge access to research, data, and publications, speed up scholarly communication, facilitate collaboration, and improve the sharing and building of knowledge, whether through open lab notebooks, open data, or open access to scholarly literature. There isn’t even a Wikipedia page for open humanities (let’s get to work!). The Directory of Open Access and Hybrid Journals lists nearly 3000 journals in the sciences as opposed to a little over 1300 in the arts & humanities. Much of the rhetoric around openness focuses on science; as a rough measure, there are approximately 973,000 Google results for “open science” versus around 38,000 for “open humanities”.

In a 2004 essay, Peter Suber pointed to a number of reasons why the humanities have been more reluctant to embrace openness than the sciences, including the greater availability of public funding for scientific research (and publishing fees), a deeper sense of a cost crisis with science journals, the significance of pre-print repositories in the sciences, the importance of monographs in the humanities, and the greater public pressure for open access to science. Updating Suber’s analysis eight years later, Gary Daught suggests that the time may be ripe for efforts to promote openness in the humanities. He notes that the price inflation of humanities journals has become a greater concern and that open source tools such as Open Journal Systems have brought down publishing costs. Perhaps most importantly, as scholars become more accustomed to the speed, convenience and openness of online communication, they may increasingly expect research to be easily accessible.

Indeed, I’ve identified a number of open humanities projects, mainly in the digital humanities. Openness in the humanities can take many forms, including:

While these different ways of categorizing openness are helpful, I agree with Clint Lalonde (riffing on Gardner Campbell) that “open is an attitude”– not only being willing to share resources, but also to work in such a way that others can observe, learn and offer to help. In my next post, I’ll provide a number of examples of open humanities projects and initiatives.

Of course, open humanities projects aren’t necessarily focused on digital humanities; note, for instance, publishing initiatives such as Open Humanities Press. With digital humanities, we often see the intersection of humanistic values and what I’ll call Web values. Driven by a desire to make it easier for scientists to share their data and collaborate, Tim Berners-Lee created the foundations of the Web. Rather than being a proprietary system, the Web is built upon open protocols, standards and design principles. The success of the Web comes from the way that it connects people to each other, information, and experiences, enabling them to share ideas, converse with each other, and explore and interact with information. Hence Berners-Lee’s message (appropriately delivered via Twitter) at the 2012 Summer Olympics: “this is for everyone.” What would it take to say the same about humanities scholarship and educational resources?

[Note: This post expands on a presentation I gave at WPI’s Digital Humanities Symposium in November.]

Examples of Collaborative Digital Humanities Projects

It comes as no surprise that humanities scholars rarely co-author articles, as I observed in my last post. As Blaise Cronin writes, “Collaboration–for which co-authorship is the most visible and compelling indicator–is established practice in both the life and physical sciences, reflecting the industrial scale, capital-intensiveness and complexity of much contemporary scientific research. But the ‘standard model of scholarly publishing,’ one that ‘assumes a work written by an author,’ continues to hold sway in the humanities” (24). Just as I found that only about 2% of the articles published in American Literary History between 2004 and 2008 were co-authored, so Cronin et al. discovered that just 2% of the articles that appeared in the philosophy journal Mind between 1900 and 2000 were written by more than one person, although between 1990 and 2000 that number increased slightly to 4% (Cronin, Shaw, & La Barre). Whereas the scale of scientific research often requires scientists to collaborate with each other, humanities scholars typically need only something to write with and about. But as William Brockman et al. suggest, humanities scholars do have their own traditions of collaboration, or at least of cooperation: “Circulation of drafts, presentation of papers at conferences, and sharing of citations and ideas, however, are collaborative enterprises that give a social and collegial dimension to the solitary activity of writing. At times, the dependence of humanities scholars upon their colleagues can approach joint authorship of a publication” (11).

Information technology can speed and extend the exchange of ideas, as researchers place their drafts online and solicit comments through technologies such as CommentPress, make conference papers available via institutional repositories, and share citations and notes using tools such as Zotero. Over ten years ago, John Unsworth described an ongoing shift from cooperation to collaboration, indicating perhaps both his prescience and the slow pace of change in academia.

In the cooperative model, the individual produces scholarship that refers to and draws on the work of other individuals. In the collaborative model, one works in conjunction with others, jointly producing scholarship that cannot be attributed to a single author. This will happen, and is already happening, because of computers and computer networks. Many of us already cooperate, on networked discussion groups and in private email, in the research of others: we answer questions, provide references for citations, engage in discussion. From here, it’s a small step to collaboration, using those same channels as a way to overcome geographical dispersion, the difference in time zones, and the limitations of our own knowledge.

The limitations of our own knowledge.  As Unsworth also observes, collaboration, despite the challenges it poses, can open up new approaches to inquiry: “instead of establishing a single text, editors can present the whole layered history of composition and dissemination; instead of opening for the reader a single path through a thicket of text, the critic can provide her with a map and a machete. This is not an abdication of the responsibility to educate or illuminate: on the contrary, it engages the reader, the user, as a third kind of collaborator, a collaborator in the construction of meaning.”  With the interactivity of networked digital environments, Unsworth imagines the reader becoming an active co-creator of knowledge.  Through online collaboration, scholars can divide labor (whether in making a translation, developing software, or building a digital collection), exchange and refine ideas (via blogs, wikis, listservs, virtual worlds, etc.), engage multiple perspectives, and work together to solve complex problems.  Indeed, “[e]mpowering enhanced collaboration over distance and across disciplines” is central to the vision of cyberinfrastructure or e-research (Atkins).  Likewise, Web 2.0 focuses on sharing, community and collaboration.

Work in many areas of the digital humanities seems to both depend upon collaboration and aim to support it. Out of the 116 abstracts for posters, presentations, and panels given at the Digital Humanities 2008 (DH2008) conference, 41 (35%) include a form of the word “collaboration,” whether they are describing collaborative technologies (“Online Collaborative Research with REKn and PReE”) or collaborative teams (“a collaborative group of librarians, scholars and technologists”). Likewise, 67 out of 104 (64%) papers and posters presented at DH2008 have more than one author. (Both the Digital Humanities conference and the journal LLC (Literary and Linguistic Computing) tend to focus on the computational side of the digital humanities, so I’d also like to see if the pattern of collaboration holds in what Tara McPherson calls the “multimodal humanities,” e.g. journals such as Vectors. Given that works in Vectors typically are produced through collaborations between scholars and designers, I’d expect to see a somewhat similar pattern.)
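
As a rough illustration of how a tally like this could be made, here is a minimal sketch in Python. The sample abstracts, the function names, and the pattern for what counts as “a form of the word” are all assumptions for illustration; only the totals (41 of 116 and 67 of 104) come from the counts reported above.

```python
import re

# Matches "collaboration", "collaborative", "collaborate", "collaborators", etc.
# The pattern is an assumption about what counts as "a form of the word".
COLLAB_PATTERN = re.compile(r"\bcollaborat\w*", re.IGNORECASE)


def count_matching(abstracts):
    """Return how many abstracts mention some form of 'collaboration'."""
    return sum(1 for text in abstracts if COLLAB_PATTERN.search(text))


def percent(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


# Hypothetical stand-ins for conference abstracts (not the DH2008 texts).
sample_abstracts = [
    "Online Collaborative Research with REKn and PReE",
    "a collaborative group of librarians, scholars and technologists",
    "A stylometric study of authorship attribution",
]

matches = count_matching(sample_abstracts)
print(f"{matches} of {len(sample_abstracts)} sample abstracts match")

# The totals reported in the post.
print(percent(41, 116))  # abstracts mentioning collaboration
print(percent(67, 104))  # papers and posters with more than one author
```

A keyword match like this is only a crude proxy: it counts an abstract that mentions collaboration in passing the same as one that is centrally about it.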

I was having trouble articulating precisely how collaboration plays a role in humanities research until I began looking for concrete examples—and I found plenty.   As computer networks connect researchers to content, tools and each other, we are seeing humanities projects that facilitate people working together to produce, explore and disseminate knowledge.  I interpret the word “collaboration” broadly; it’s a squishy term with synonyms such as teamwork, cooperation, partnership, and working together, and it also calls to mind co-authorship, communication, community, citizen humanities, and social networks.  In Here Comes Everybody, Clay Shirky puts forward a handy hierarchy of collaboration: 1) sharing; 2) cooperation; 3) collaboration; 4) collectivism (Kelly).  In this post, I’ll list different types of computer-supported collaboration in the humanities, note antecedents in “traditional” scholarship, briefly describe example projects, and point to some supporting technologies.  This is an initial attempt to classify a wide range of activity; some of these categories overlap.



  • Historical antecedents: conferences, colloquia, letters
  • Supporting technologies: listservs, online forums, blogs, social networking platforms, virtual worlds, microblogging (e.g. Twitter), video conferencing
  • Key functions: fostering communication and collaboration across a distance
  • Examples:
    • Listservs: Perhaps the best-known online community in the humanities is H-Net, which was founded in 1992 and thus predates Web 2.0 or even Web 1.0. According to Mark Kornbluh, H-Net provides an “electronic version of an academic conference, a way for people to come together and to talk about their research and their teaching, to announce what was going on in the field, and to review and critique things that are going on in the field.” Currently H-Net supports over 100 humanities email lists and serves over 100,000 subscribers in more than 90 countries. Although H-Net has been criticized for relying on an old technology, the listserv, and is facing economic difficulties, it remains valued for supporting information sharing and discussion. For digital humanities folks, the Humanist list, launched in 1987, serves as “an international online seminar on humanities computing and the digital humanities” and has played a vital part in the intellectual life of the community.
    • Online forums: HASTAC, “a virtual network, a network of networks” that supports collaboration across disciplines and institutions, sponsors lively forums about technology and the humanities, often moderated by graduate students.  HASTAC also organizes conferences, administers a grant competition, and advocates for “new forms of collaboration across communities and disciplines fostered by creative uses of technology.” In my experience, online communities often break down the hierarchies separating graduate students from senior scholars and bring recognition to good ideas, no matter what the source.
    • Online communities: Since 1996, Romantic Circles (RC) has built an online community focused on Romanticism, not only fostering communication among researchers but also collaboratively developing content. Romantic Circles includes a blog for sharing information about news and events of interest to the community; a searchable archive of electronic editions; collections of critical essays; chronologies, indices, bibliographies and other scholarly tools; reviews; pedagogical resources; and a MOO (gaming environment). Over 30 people have served as editors, while over 300 people have contributed reviews and essays. Alan Liu aptly summarizes RC’s significance: “Romantic Circles, which helped pioneer collaborative scholarship on the Web, has become the leading paradigm for what such scholarship could be. One can point variously to the excellence of its refereed editions of primary texts, its panoply of critical and pedagogical resources, its inventive Praxis series, its state-of-the-art use of technology or its stirring commitment (nearly unprecedented on the Web) to spanning the gap between high-school and research-level tiers of education. But ultimately, no one excellence is as important as the overall, holistic impact of the site. We witness here a broad community of scholars using the new media vigorously, inventively, and rigorously to inhabit a period of historical literature together.”
      In building a community that supports digital scholarship, NINES focuses on three main goals: providing peer review for digital scholarship in 19th century American and British studies (thus helping to legitimize and recognize emerging scholarly forms), helping scholars create digital scholarship by providing training and content, and developing software such as Collex and Juxta to support inquiry and collaboration.
    • Advanced videoconferencing: With budgets tight, time scarce, and concern about the environmental costs  of travel increasing, collaborators often need to meet without having to travel.  AccessGrid supports communication among multiple groups by providing high quality video and audio and enabling researchers to share data and scientific instruments seamlessly.  AccessGrid, which was developed by Argonne National Laboratory and uses open source software, employs large displays and multiple projectors to create an immersive environment.   In the arts and humanities, AccessGrid has been used to support “telematic” performances, the study of high resolution images, seminars, and classes.
CollabRoom by Modbob



  • Historical antecedents: laboratories, research centers
  • Supporting technologies: grid technologies/ advanced networking, large displays, remote instrumentation, simulation software, collaboration platforms such as HubZero, databases, digital libraries
  • Key functions: fostering communication, collaboration, resource sharing, and research regardless of physical distance
  • Examples:

William Wulf coined the term collaboratory in 1989 to describe a “center without walls, in which the nation’s researchers can perform their research without regard to physical location, interacting with colleagues, accessing instrumentation, sharing data and computational resources, [and] accessing information in digital libraries.” Most of the collaboratories listed on the (now somewhat-out-of-date) Science of Collaboratories web site focus on the sciences.  For example, scientific collaboratories such as NanoHub, Space Physics and Astronomy Research Collaboratory (SPARC) and Biomedical Informatics Research Network (BIRN) have supported online data sharing, analysis, and communication.

What would a collaboratory in the humanities do? The term has been used in the humanities to refer to:

“Collaboratory” has thus taken on additional meanings, referring to “a new networked organizational form that also includes social processes; collaboration techniques; formal and informal communication; and agreement on norms, principles, values, and rules” (Cogburn, 2003, via Wikipedia).

“Virtual research environment” seems to be replacing “collaboratory” as the term for online collaborative spaces that provide access to tools and content (e.g. Early Modern Texts VRE, powered by Sakai). Through its funding program focused on Virtual Research Environments, JISC has sponsored the Virtual Research Environment for Archaeology, a VRE for the Study of Documents and Manuscripts, Collaborative Research Events on the Web, and myExperiment for sharing scientific workflows.



  • Historical antecedents: museums, archives, personal collections
  • Supporting technologies: Web publishing platforms (e.g. Omeka, Drupal), databases
  • Key functions: “collecting & exhibiting” content (to borrow from CHNM)
  • Examples:
    When the Valley of the Shadow project was launched in the 1990s, project team members went into communities in Pennsylvania and Virginia to digitize 19th century documents held by families in personal collections, thus building a virtual archive.  As scanners and digital cameras have become ubiquitous and user-contributed content sites such as Flickr and YouTube have taken off, people can contribute their own digital artifacts to online collections.  For example, The Hurricane Digital Memory Bank collects over 25,000 stories, images, and other multimedia files about Hurricanes Katrina and Rita.  Using a simple interface, people can upload items and describe the title, keywords, geographic location, and contributor.  The archive thus becomes a dynamic, living repository of current history, a space where researchers and citizens come together—or, in the terminology of the Center for History and New Media (CHNM), a memory bank that “promote[s] popular participation in presenting and preserving the past.”  As the editors of Vectors write in their introduction to “Hurricane Digital Memory Bank: Preserving the Stories of Katrina, Rita, and Wilma,” “Their work troubles a number of binaries long reified by history scholars (and humanities scholars more generally), including one/many, closed/open, expert/amateur, scholarship/journalism, and research/pedagogy.”  CHNM also sponsors digital memory banks focused on Mozilla, September 11, and the Virginia Tech tragedy.  Likewise, the Great War Archive, sponsored by the University of Oxford, contains over 6,500 items about World War I contributed by the public.


  • Historical antecedents: museums, archives
  • Supporting technologies: databases, open standards
  • Key functions: making it easier to discover, share and use information
  • Examples:
    Too often digital resources reside in silos, as each library or archive puts up its own digital collection.  As a result, researchers must spend more time identifying, searching, and figuring out how to use relevant digital collections.  However, some projects are shifting away from a siloed approach and bringing together collaborators to build digital collections focused on a particular topic or to develop interoperable, federated digital collections.  For instance, the Alliance for American Quilts, MATRIX: Center for Humane Arts, Letters and Social Sciences Online, and Michigan State University Museum have created the Quilt Index, which makes available images and descriptions of quilts provided by 14 contributors, including The Library of Congress American Folklife Center and the Illinois State Museum.  As Mark Kornbluh argues, interoperable content enables new kinds of inquiry: “In the natural sciences, large new datasets, powerful computers, and a rich array of computational tools are rapidly transforming knowledge generation. For the same to occur in the humanities, we need to understand the principle that ‘more is better.’ Part of what the computer revolution is doing is that it is letting us bring huge volumes of material under control. Cultural artifacts have always been held by separate institutions and separated by distance. Large–scale interoperable digital repositories, like the Quilt Index, open dramatically new possibilities to look at the totality of cultural content in ways never before possible.” Other examples of content aggregation and integration projects include the Walt Whitman Archive’s Finding Aids for Poetry Manuscripts and NINES.


  • Historical antecedents: informal exchange of data
  • Supporting technologies: databases (MySQL, etc.), web services tools
  • Key functions: support research by enabling discovery and reuse of data sets
  • Example projects:
    By sharing data, researchers can enable others to build on their work and provide transparency.  As Christine Borgman writes, “If related data and documents can be linked together in a scholarly information infrastructure, creative new forms of data- and information-intensive, distributed, collaborative, multidisciplinary research and learning become possible.  Data are outputs of research, inputs to scholarly publications, and inputs to subsequent research and learning.  Thus they are the foundation of scholarship” (Borgman 115).  Of course, there are a number of problems bound up in data sharing—how to ensure participation, make data discoverable through reliable metadata, balance flexibility in accepting a range of formats and the need for standardization, preserve data for the long term, etc.  Several projects focused on humanities and social science data are beginning to confront at least some of these challenges:

    • Open Context “hopes to make archaeological and related datasets far more accessible and usable through common web-based tools.”  Embracing open access and collaboration, Open Context makes it easy for researchers to upload, search, tag and analyze archaeological datasets.
    • Through OpenStreetMap, people freely and openly share and use geographic data in a wiki-like fashion. Contributors employ GPS devices to record details about places such as the names of roads, then upload this information to a collaborative database. The data is used to create detailed maps that have no copyright restrictions (unlike most geographical data).
    • Through the Reading Experience Database researchers can contribute records of British readers engaging with texts.



  • Historical antecedents: genealogical research(?)
  • Supporting technologies: wikis
  • Key functions: share the labor required for transcribing manuscripts
  • Examples:
    Much of the historical record is not yet accessible online because it exists as handwritten documents: letters, diaries, account books, legal documents, etc. Although work is underway on Optical Character Recognition software for handwritten materials, making these variable documents searchable and easy to read usually still requires a person to manually transcribe the document. Why not enable people to collaborate to make family documents and other manuscripts available through commons-based peer production? At THATCamp last year, I learned about Ben Brumfield’s FromThePage software, which enables volunteers to transcribe handwritten documents through a web-based interface. The right side of the interface shows a zoomable image of the page, while on the left volunteers enter the transcription through a wiki-like interface. Likewise, the FamilySearch Indexing Project, sponsored by the LDS, recruits volunteers to transcribe family information from historical documents. (See Jeanne Kramer-Smyth’s great account of the THATCamp session on crowdsourcing transcription and annotation.) Not only can collaborative transcription be more efficient, but it can also reduce error.
    Martha Nell Smith recounts how she, working solo at the Houghton, transcribed a line of Susan Dickinson’s poetry as “I’m waiting but the cow’s not back.” When her collaborators at the Dickinson Electronic Archives, Lara Vetter and Laura Lauth, later compared the transcriptions to digital images of Dickinson’s manuscripts, they discovered that the line actually says “I’m waiting but she comes not back.” As Smith suggests, “Had we not been working in concert with one another, and had we not had the high quality reproductions of Susan Dickinson’s manuscripts to revisit and thereby perpetually reevaluate our keys to her alphabet, my misreading might have been congealed in the technology of a critical print translation and what is very probably a poetic homage to Emily Dickinson would have lain lost in the annals of literary history” (Smith 849).

    Efforts to crowdsource transcription seem similar to the distributed proofreading that powers Project Gutenberg, which has enlisted volunteers to proofread over 15,000 books since 2000.  Likewise, Project Madurai is using distributed proofreading to build a digital library of Tamil texts.
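
The kind of cross-check Smith describes, where a second reader's transcription exposes a misreading in the first, can be partly supported by software. Below is a minimal sketch that uses Python's standard difflib module to flag the words on which two independent transcriptions disagree, so that a person can return to the manuscript image. The function name and the word-level comparison are assumptions for illustration; this is not how the Dickinson Electronic Archives or FromThePage actually work.

```python
import difflib


def differing_words(reading_a, reading_b):
    """Return (words_a, words_b) pairs where two transcriptions disagree."""
    a, b = reading_a.split(), reading_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# The two readings of the line, as quoted in Smith's account.
first_pass = "I'm waiting but the cow's not back"
second_pass = "I'm waiting but she comes not back"

for ours, theirs in differing_words(first_pass, second_pass):
    print("disputed:", " ".join(ours), "<->", " ".join(theirs))
```

The comparison only locates the disagreement; deciding which reading is right still depends on people looking at the manuscript together.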


  • Historical antecedents: translation teams, e.g. Pevear and Volokhonsky
  • Supporting technologies: wikis, blogs, machine translation supplemented by human intervention
  • Examples:
    Rather than requiring an individual to undertake the time-intensive work of translating a complex classical text solo, the Suda Online (SOL)  brings together classicists to collaborate in translating into English the Suda, a tenth century encyclopedia of ancient learning written by a committee of Byzantine scholars (and thus itself a collaboration).  In addition to providing translations, SOL also offers commentaries and references, so it serves as a sort of encyclopedic predecessor to Wikipedia.  As Anne Mahoney reports in a recent article from Digital Humanities Quarterly, an email exchange in 1998 sparked the Suda Online; one scholar wondered whether there was an English translation of the Suda (there wasn’t) and others recognized that a translation could be produced through web-based collaboration.  Student programmers at the University of Kentucky quickly developed the technological infrastructure for SOL (a wiki might have been used today, but the custom application has apparently served its purpose well).  Now a self-organizing team of 61 editors and 95 translators from 12 countries has already translated over 21,000 entries, about 2/3 of the total.  Translators make the initial translations, which are then reviewed and augmented by editors (typically classics faculty) and given a quality rating of “draft,” “low,” or “high.”   All who worked on the translation are credited through a sort of open peer review process.  Whereas collaborative projects such as Wikipedia are open to anyone, SOL translators must register with the project.  Mahoney suggests that the collaboration has succeeded in part because it was focused and bounded, so that collaborators could feel the satisfaction of working toward a common goal and meeting milestones, such as 100 entries translated.  According to Mahoney, SOL has made this important text more accessible by offering an English version, making it searchable, and providing commentaries and references.  
Moreover, “[a]s a collaboration SOL demonstrates the feasibility of open peer review and the value of incremental progress.” Other collaborative translation projects include The Encyclopédie of Diderot and d’Alembert; Traduwiki, which aims to “eliminate the last barrier of the Internet, the language”; the WorldWide lexicon project; and Babels.


  • Historical antecedents: creating critical editions
  • Supporting technologies: grid computing, XML editors, text analysis tools, annotation tools
  • Example Projects:

As Peter Robinson observed at this year’s MLA, the traditional model for creating a critical edition centralizes authority in an editor, who oversees work by graduate assistants and others.  However, the Internet enables distributed, de-centralized editing.  To create “community-made editions,” a library would digitize texts and produce high quality images, researchers would transcribe those images, others would collate the transcriptions, others would analyze the collations and add commentaries, and so forth.

Explaining the need for collaborative approaches to textual editing, Marc Wilhelm Küster, Christoph Ludwig and Andreas Aschenbrenner of TextGrid describe how three different editors attempted to create a critical edition of the massive “so-called pseudo-capitulars supposedly written by a Benedictus Levita,” each dying before the work could be completed.  Now a team of scholars is collaborating to create the edition, increasing their chances of completion by sharing the labor.  The TextGrid project is building a virtual workbench for collaborative editing, annotation, analysis and publication of texts.  Leveraging the grid infrastructure, TextGrid provides a platform for “software agents with well-defined interfaces that can be harnessed together through a user defined workflow to mine or analyze existing textual data or to structure new data both manually and automatically.” TextGrid recently released a beta version of its client application that includes an XML editor, search tool, dictionary search tool, metadata annotator, and workflow modules. As Küster, Ludwig and Aschenbrenner point out, enabling collaboration requires not only developing a technical platform that supports real-time collaboration and automation of routine tasks, but also facilitating a cultural shift toward collaboration among philologists, linguists, historians, librarians, and technical experts.


  • Historical antecedents: shared references, bibliographies
  • Key functions: share citations, notes, and scholarly resources; build collective knowledge
  • Supporting technologies: social bookmarking, bibliographic tools
  • Projects:
    With the release of Zotero 2.0, Zotero is taking a huge step toward the vision articulated by Dan Cohen of providing access to “the combined wisdom of hundreds of thousands of scholars” (Cohen).  Researchers can set up groups to share collections with a class and/or collaborators on a research project.   I’ve already used Zotero groups to support my research and to collaborate with others; I discovered several useful citations in the collaboration folder for the digital history group, and with Sterling Fluharty I’ve set up a group to study collaboration in the digital humanities (feel free to join).  Ultimately Zotero will provide Amazon-like recommendation services to help scholars identify relevant resources.  As Stan Katz wrote in hailing Zotero’s collaboration with the Internet Archive to create a “Zotero commons” for sharing research documents, “For secretive individualists, which is to say old-fashioned humanists, this will sound like an invasion of privacy and an invitation to plagiarism. But to scholars who value accessibility, collaboration, and the early exchange of information and insight -– the future is available. And free on the Internet.”

    Similarly, the eComma project suggests that collaborative annotation can facilitate collaborative interpretation, as readers catalog poetic devices (personification, enjambment, etc.) and offer their own interpretations of literary works.  You can see eComma at work in the Collaborative Rubáiyát, which enables users to compare different versions of the text, annotate the text, tag it, and access sections through a tag cloud.   Likewise, Philospace will allow scholars to describe philosophical resources, filter them, find resources tagged by others, and submit resulting research for peer review. Other projects and technologies supporting collaborative annotation include Flickr Commons, Aus-e-Lit: Collaborative Integration and Annotation Services for Australian Literature Communities, NINES’ Collex, and STEVE.


  • Historical antecedents: Encyclopedias
  • Supporting technologies: Wikis
  • Key functions: sharing knowledge, synthesizing multiple perspectives
  • Examples:
    With the rise of Wikipedia, academics have been debating whether collaborative writing spaces such as wikis undermine authority, expertise, and trustworthiness.  In “Literary Sleuths Online,” Ralph Schroeder and Matthijs Den Besten examine the Pynchon Wiki, a collaborative space where Pynchon enthusiasts annotate and discuss his works.  Schroeder and Den Besten compare the wiki’s section on Pynchon’s Against the Day with a print equivalent, Weisenburger’s “A Gravity’s Rainbow Companion.”  While the annotations in Weisenburger’s book are more concise and consistent, the wiki is more comprehensive, more accurate (because many people are checking the information), and more speedily produced (it only took 3 months for the wiki to cover every page of Pynchon’s novel).   Moreover, the book is fixed, while the wiki is open-ended and expansive. Schroeder and Den Besten suggest that competition, community and curiosity drive participation, since contributors raced to add annotations as they made their way through the novel and “sleuthed” together.

GAMING: “Collaborative Play”/ Games as Research

  • Historical antecedents: role playing games, board games, etc.
  • Key functions: problem solving, team work, knowledge sharing
  • Supporting technologies: gaming engines, wikis, networks
  • Example Projects:
    Perhaps some of the most intense collaboration comes in massively multiplayer online games, as teams of players consult each other for assistance navigating virtual worlds, team up to defeat monsters, join guilds to collaborate on quests, and share their knowledge through wikis such as the WOWWiki, which has almost 74,000 articles.  Focusing on World of Warcraft, Nardi and Harris explore collaborative play as a form of learning.  They also point to potential applications of gaming in research communities: “Mixed collaboration spaces, whether MMOGs or another format, may be useful in domains such as interdisciplinary scientific work where a key challenge is finding the right collaborators.”

    Sometimes those collaborators can be people without specialized training.  Recently Wired featured a fascinating article about FoldIt, a game to come up with different models of proteins that is attracting devoted teams of participants (Bohannon).  The game was devised by the University of Washington Departments of Computer Science & Engineering and Biochemistry to crowdsource solutions to the Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP), a scientific contest to predict protein structures.   Previously biochemist David Baker had used Rosetta@home to harness the spare computing cycles of 86,000 PCs that had been volunteered to help determine the shapes of proteins, but he was convinced that human intelligence as well as computing power needed to be tapped to solve spatial puzzles.  Thus he and his colleagues developed a game in which players fold proteins into their optimal shapes, a sort of “global online molecular speed origami.” Over 100,000 people have downloaded the game, and a 13-year-old is one of the game’s best players. Using the game’s chat function, players formed teams, “and collective efforts proved far more successful than any solo folder.”  At the CASP competition, 7 of the 15 solutions contributed through FoldIt worked, and one finished in first place, so “[a] band of gamer nonscientists had beaten the best biochemists.”

    How might gaming be used to motivate and support humanities research?  As we see in the example of FoldIt, games provide motivation and a structure for collaboration; teamwork enables puzzles to be solved more rapidly.  I could imagine, for example, a game in which players would transcribe pieces of a diary to unravel the mystery it recounts, describe the features of a series of images (similar to Google’s Image Labeler game), or offer up their own interpretations of abstruse philosophical or literary passages.  In “Games of Inquiry for Collaborative Concept Structuring,” Mary A. Keeler and Heather D. Pfeiffer envision a “Manuscript Reconstruction Game (MRG)” where Peirce scholars would collaborate to figure out where a manuscript page belongs. “The scholars rely on the mechanism of the game, as a logical editor or ‘logical lens,’ to help them focus on and clarify the complexities of inference and conceptual content in their collaborative view of the manuscript evidence” (407).  There are already some compelling models for humanities game play.  Dan Cohen recently used Twitter to crowdsource solving an historical puzzle. Ian Bogost and collaborators are investigating the intersections between journalism and gaming.  Jerome McGann describes Ivanhoe as an  “online playspace… for organizing collaborative interpretive investigations of traditional humanities materials of any kind,” as two or more players come together to re-imagine and transform a literary work (McGann).


  • Historical antecedents: exchange of drafts, letters, critical dialogs in journals
  • Supporting technologies and protocols: CommentPress, blogs, wikis, Creative Commons licenses, etc.
  • Projects:
    Bob Stein defines the book as “a place where readers (and sometimes authors) congregate.” Recent projects enable readers to participate in all phases of the publishing process, from peer-to-peer review to remixing a work to produce something new.  For instance, LiquidPub aims to transform the dissemination and evaluation of scientific knowledge by enabling “Liquid Publication that can take multiple forms, that evolves continuously, and is enriched by multiple sources.”  Using CommentPress, Noah Wardrip-Fruin  experimented with peer-to-peer review of his new book Expressive Processing alongside traditional peer review, posting a section of the book each week day to the Grand Text Auto blog.  Although it was difficult for many reviewers to get a sense of the book’s overall arguments when they were reading only fragments, Wardrip-Fruin found many benefits to this open approach to peer review: he could engage in conversation with his reviewers and determine how to act on their comments, and he received detailed comments from both academics and non-academics with expertise in the topics being discussed, such as game designers.  Similarly, O’Reilly recently developed the Open Publishing Feedback System to gather comments from the community.  Its first experiment, Programming Scala, yielded over 7000 comments from nearly 750 people. New publishing companies such as WeBook and Vook are exploring collaborative authorship and multimedia.


  • Historical antecedents: Students as research assistants?
  • Supporting technologies: blogs, wikis, social bookmarking, social bibliographies
  • Motto: “We participate, therefore we are.” (via John Seely Brown)
  • Example:
    As John Seely Brown explains, “social learning is based on the premise that our understanding of content is socially constructed through conversations about that content and through grounded interactions, especially with others, around problems or actions.”  Social learning involves “learning to be” an expert through apprenticeship, as well as learning the content and language of a domain.  Brown points to open source communities as exemplifying social learning.  I would guess that many, if not most, collaborative digital humanities projects have depended on contributions from undergraduate and graduate students, whether they digitized materials, did programming, authored metadata, contributed to the project wiki, designed the web site, or even managed the project.

    Why not create a network of research projects, so that students studying a similar topic could jointly contribute to a common resource?  Such is the vision of “Looking for Whitman: The Poetry of Place in the Life and Work of Walt Whitman,” led by Matthew Gold.   Working together to build a common web site on Whitman, students will document their research using Web 2.0 technologies such as CommentPress, BuddyPress (WordPress + social networking), blogs, wikis, YouTube, Flickr, Google Maps, etc.  Students at City Tech, CUNY’s New York City College of Technology and New York University will focus on Whitman in New York;  those at Rutgers University at Camden will look at Whitman as “sage of Camden”; and those at the University of Mary Washington will examine Whitman and the Civil War.   Similarly, Michael Wesch, the 2008 CASE/Carnegie U.S. Professor of the Year for Doctoral and Research Universities, asks his students to become “co-creators” of knowledge, whether in simulating world history and cultures, creating an ethnography of YouTube, or examining anonymity and new media.

While collaboration in the humanities is certainly not new, these projects suggest how researchers (both professional and amateur) can work together regardless of physical location to share ideas and citations, produce translations or transcriptions, and create common scholarly resources.  Long as this list is, I know I’m omitting many other relevant projects (some of which I’ve bookmarked) and overlooking (for now) the challenges that collaborative scholarship faces.  I’ll be working with several collaborators to explore these issues, but I of course welcome comments….

Works Cited

Atkins, Dan. Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure. NSF. January 2003.
Bohannon, John. “Gamers Unravel the Secret Life of Protein.” Wired 20 Apr 2009. 26 May 2009.
Borgman, Christine L. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, Mass.: The MIT Press, 2007.
Brockman, William et al. Scholarly Work in the Humanities and the Evolving Information Environment. CLIR/DLF, 2001. 24 Jul 2007.
Cohen, Daniel J. “Zotero: Social and Semantic Computing for Historical Scholarship.” Perspectives (2007). 27 May 2009.
Cronin, Blaise, Debora Shaw, and Kathryn La Barre. “A cast of thousands: Coauthorship and subauthorship collaboration in the 20th century as manifested in the scholarly journal literature of psychology and philosophy.” Journal of the American Society for Information Science and Technology 54.9 (2003): 855-871.
Cronin, Blaise. The hand of science. Scarecrow Press, 2005.
Kelly, Kevin. “The New Socialism: Global Collectivist Society Is Coming Online.” Wired 22 May 2009. 26 May 2009.
Kornbluh, Mark. “From Digital Repositories to Information Habitats: H-Net, the Quilt Index, Cyber Infrastructure, and Digital Humanities.” First Monday 13.8: August 4, 2008.
Kuster, M.W., C. Ludwig, and A. Aschenbrenner. “TextGrid as a Digital Ecosystem.” Digital EcoSystems and Technologies Conference, 2007. DEST ’07. Inaugural IEEE-IES. 2007. 506-511.
Mahoney, Anne. “Tachypaedia Byzantina: The Suda On Line as Collaborative Encyclopedia.”  Digital Humanities Quarterly. 3.1 (2009). 22 Mar 2009.
McGann, Jerome J. “Culture and Technology: The Way We Live Now, What Is to Be Done?” New Literary History 36.1 (2005): 71-82.
Nardi, Bonnie, and Justin Harris. “Strangers and friends: collaborative play in world of warcraft.” Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work. Banff, Alberta, Canada: ACM, 2006. 149-158. 18 May 2009.
O’Donnell, Daniel Paul. “Disciplinary Impact and Technological Obsolescence in Digital Medieval Studies.” A Companion To Digital Humanities. 2 May 2009.
Schroeder, Ralph, and Matthijs Den Besten. “Literary Sleuths On-line: e-Research collaboration on the Pynchon Wiki.” Information, Communication & Society 11.2 (2008): 167-187.
Smith, Martha Nell. “Computing: What Has American Literary Study To Do with It.” American Literature 74.4 (2002): 833-857.
Unsworth, John M. “Creating Digital Resources: the Work of Many Hands.” 14 Sep 1997. 10 Mar 2009.

Revisions: Fixed From the Page link, 6/1/09; Tanya ] Tara, 6/2/09; fixed typos (6/14/09)

Collaborative Authorship in the Humanities

Recently I heard the editors of a history journal and a literature journal say that they rarely published articles written by more than one author—perhaps a couple every few years.   Around the same time, I was looking over a recent issue of Literary and Linguistic Computing and noticed that it included several jointly-authored articles.  This got me wondering:  is collaborative authorship more common in digital humanities than in “traditional” humanities?

“Collaboration” is often associated with “digital humanities.”  Building digital collections, creating software, devising new analytical methods, and authoring multimodal scholarship typically cannot be accomplished by a solo scholar; rather, digital humanities projects require contributions from people with content knowledge, technical skills, design skills, project management experience, metadata expertise, etc.  Our Cultural Commonwealth identifies enabling collaboration as a key feature of the humanities cyberinfrastructure, funders encourage multi-institutional and even international teams, and proponents of increased collaboration in the humanities like Cathy Davidson and Lisa Ede and Andrea A. Lunsford cite digital humanities projects such as Orlando as exemplifying collaborative possibilities.

As a preliminary investigation, I compared the number of collaboratively-written articles published between 2004 and 2008 in two well-respected quarterly journals, American Literary History (ALH) and Literary and Linguistic Computing (LLC).  Both journals are published by Oxford University Press as part of its humanities catalog. I selected ALH because it is a leading journal on American literature and culture that encourages critical exchanges and interdisciplinary work—and because I thought it would be fun to see what the journal has published since 2004. (The hardest part of my research: resisting the urge to stop and read the articles.)  LLC, the official publication of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities, includes contributions on digital humanities from around the world—the UK, the US, Germany, Australia, Greece, Italy, Norway, etc.—and from many disciplines, such as literature, linguistics, computer and information science, statistics, librarianship, and biochemistry.  To determine the level of collaborative authorship in each issue, I tallied articles that had more than one author, excluding editors’ introductions, notes on contributors, etc.  For LLC, I counted everything that had an abstract as an article.  While I didn’t count LLC’s reviews, which typically are brief and focus on a single work, I did include the review essays published by ALH, since they are longer and synthesize critical opinion about several works.

So what did I find? Whereas only 5 of 259 (1.93%) articles published in ALH, about one a year, feature two authors (none had more than two), 70 of 145 (48.28%) articles published in LLC were written by two or more authors.  Most (4 of 5, or 80%) of the co-authored ALH articles were written by scholars from multiple institutions, whereas 49% (34 of 70) of the co-authored LLC articles were.  About 16% (11 of 70) of the co-authored LLC articles featured contributors from two or more countries, while none of the ALH articles did.  Two of the five ALH articles are review essays, while three focus on hemispheric or transatlantic American studies.  Although this study should be carried out more systematically across a wider range of journals, the initial results do suggest that collaborative authorship is more common in digital humanities. [See the Zotero reports for ALH and LLC for more information.]
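For anyone who wants to re-check or extend the tally, here is a minimal Python sketch that recomputes the percentages from the raw counts reported above. The dictionary layout, the `share` helper, and the variable names are illustrative assumptions, not part of the original study, which was tallied by hand with Zotero reports.

```python
# Raw tallies as reported in the post (2004-2008 issues).
# The dictionary layout and key names are illustrative assumptions.
counts = {
    "ALH": {"coauthored": 5, "total": 259},
    "LLC": {"coauthored": 70, "total": 145},
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of all articles in each journal with two or more authors.
rates = {
    journal: share(c["coauthored"], c["total"])
    for journal, c in counts.items()
}
for journal, rate in rates.items():
    print(f"{journal}: {rate}% of articles have two or more authors")

# Breakdown of the 70 co-authored LLC articles.
llc_multi_institution = share(34, 70)
llc_multi_country = share(11, 70)
print(f"LLC multi-institution: {llc_multi_institution}%")
print(f"LLC multi-country: {llc_multi_country}%")
```

Adding another journal would only require a new entry in `counts`, which is one way the comparison could be widened beyond these two titles.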

Why does LLC feature more collaboratively written articles than ALH? I suspect that it is because, as I’ve already suggested, digital humanities projects often require collaboration, whereas most literary criticism can be produced by an individual scholar who needs only texts to read, a place to write, and a computer running a word processing application (as well as a library to provide access to texts, colleagues to consult and to review the resulting research, a university and/or funding agency to support the research, a publisher to disseminate the work, etc.).   Moreover, LLC represents a sort of meeting point for a range of disciplines, including several (such as computer science) that have a tradition of collaborative authorship.  Whereas collaborative authorship is common (even expected) in the sciences, in the humanities many tenure and promotion committees have not yet developed mechanisms for evaluating and crediting collaborative work. In a recent blog post, for example, Cathy Davidson tells a troubling story about being told (in a public and humiliating way) by a member of a search committee that her collaborative work and other “non-traditional” research didn’t “count.”  Literary study values individual interpretation, or what Davidson calls “the humanistic ethic of individuality.”

While individual scholarship remains valid and important, shouldn’t humanities scholarship expand to embrace collaborative work as well?  Indeed, in 2000 the MLA launched an initiative to consider “alternatives to the adversarial academy” and encourage collaborative scholarship.  (By the way, I’m not criticizing ALH; I doubt that it receives many collaboratively-authored submissions, and it has encouraged critical exchange and interdisciplinary research.)  Of course, collaboration poses some significant challenges, such as divvying up and managing work, negotiating conflicts, finding funding for complex projects, assigning credit, etc.    But as Lisa Ede and Andrea A. Lunsford point out, collaborative authorship can lead to a “widening of scholarly possibilities.”  In talking to humanities scholars (particularly those in global humanities), I’ve noticed genuine enthusiasm about collaborative work that allows scholars to engage in community, consider alternative perspectives, and undertake ambitious projects that require diverse skills and/or knowledge.

What kind of collaborations do the jointly-written articles in LLC and ALH represent? Since LLC often lists only the authors’ institutional affiliations, not their departments, tracing the degree of interdisciplinary collaboration would require further research.  However, I did find examples of several types of collaboration (which may overlap):

  • Faculty/student collaboration: In the sciences, faculty frequently publish with their postdocs and students, a practice that seems to be rare in the humanities.  I noted at least one example of a similar collaboration in LLC—involving, I should note, computer science rather than humanities grad students.
    • Urbina, Eduardo et al. “Visual Knowledge: Textual Iconography of the Quixote, a Hypertextual Archive.” Lit Linguist Computing 21.2 (2006): 247-258. 5 Apr 2009.
      This article includes contributions by a professor of Hispanic studies, a professor of computer science, a librarian/archivist/adjunct English professor, and three graduate students in computer science.
  • Project teams: In digital humanities, collaborators often work together on projects to build digital collections, develop software, etc.  In LLC, I found a number of articles written by project teams, such as:
    • Barney, Brett et al. “Ordering Chaos: An Integrated Guide and Online Archive of Walt Whitman’s Poetry Manuscripts.” Lit Linguist Computing 20.2 (2005): 205-217. 5 Apr 2009.
      Members of the project team included an archivist, programmer, digital initiatives librarian, English professor, and two English Ph.Ds who serve as library faculty and focus on digital humanities.
  • Interdisciplinary collaborations: In LLC, I noted several instances of teams that included humanities scholars and scientists working together to apply particular methods (text mining, stemmatic analysis) in the humanities.  For example:
    • Windram, Heather F. et al. “Dante’s Monarchia as a test case for the use of phylogenetic methods in stemmatic analysis.” Lit Linguist Computing 23.4 (2008): 443-463. 5 Apr 2009.
      The authors include two biochemists, a textual scholar, and a scholar of Italian literature.
    • Sculley, D., and Bradley M. Pasanek. “Meaning and mining: the impact of implicit assumptions in data mining for the humanities.” Lit Linguist Computing 23.4 (2008): 409-424. 5 Apr 2009.
      Authored by a computer scientist and a literature professor.
  • Shared interests: Researchers may publish together because they share an intellectual kinship and can accomplish more by working together.  For instance:
    • Auerbach, Jonathan, and Lisa Gitelman. “Microfilm, Containment, and the Cold War.” American Literary History 19.3 (2007).
      I noticed that Jonathan Auerbach and Lisa Gitelman thank each other in works that each had previously published as an individual.

Observing that LLC publishes a number of collaboratively-written articles opens up several questions, which I hope to pursue through interviews with the authors of at least some of these articles (if you are one of these authors, you may see an email from me soon….):

1)    What characterizes the LLC articles that have only one author?
Based on a quick look at the tables of contents from past issues, I suspect that these articles are more likely to be theoretical or to focus on particular problems rather than projects.  Here, for example, are the titles of some singly-authored articles:  “The Inhibition of Geographical Information in Digital Humanities Scholarship,” “Monkey Business—or What is an Edition?,” “What Characterizes Pictures and Text?” and “Original, Authentic, Copy: Conceptual Issues in Digital Texts.”

2)    Why was the article written collaboratively?

What led to the collaboration?  Did team members offer complementary skill sets, such as knowledge of statistical methods and understanding of the content? How did the collaborators come together—do they work for the same institution? Did they meet at a conference? Do they cite each other?

3)    What were the outcomes of the collaboration?

What was accomplished through collaboration that would have been difficult to do otherwise?  Would the scale of the project be smaller if it were pursued by a single scholar? Did the project require contributions from people with different types of expertise?

4)    How was the collaboration managed and sustained?

Was one person in charge, or was authority distributed? What tools were used to facilitate communication, track progress on the project, and support collaborative writing? To what degree was face-to-face interaction important?

5)    What was difficult about the collaboration?

What was hard about collaborating: Communicating? Identifying who does what? Agreeing on methods? Coming to a common understanding of results? Finding funding?

We can find answers to some of these questions in Lynne Siemens’ recent article “‘It’s a team if you use “reply all”’: An exploration of research teams in digital humanities environments.”  Siemens describes factors contributing to the success of collaborative teams in digital humanities, such as clear milestones and benchmarks, strong leadership, equal contributions by members of the team, and a balance between communication through digital tools and in-person meetings.  I particularly liked the description of “a successful team as a ‘round thing’ with equitable contribution by individual members.”

In doing this research, I realized how much it would benefit from collaborators.  For instance, someone with expertise in citation analysis could help enlarge the study and detect patterns in collaborative authorship, while someone with expertise in qualitative research methods could help to interview collaborative research teams and analyze the resulting data.  However, I think anyone with an interest in the topic could make valuable contributions.  This is by way of leading up to my pitch: I’m working on a piece about collaborative research methods in digital humanities for an essay collection and would welcome collaborators.  If you’re interested in teaming up, contact me at

Works Cited

Davidson, Cathy N. “What If Scholars in the Humanities Worked Together, in a Lab?” The Chronicle of Higher Education 28 May 1999. 18 Apr 2009.

Ede, Lisa, and Andrea A. Lunsford. “Collaboration and Concepts of Authorship.” PMLA 116.2 (2001): 354-369. 18 Apr 2009.

Siemens, Lynne. “‘It’s a team if you use “reply all”’: An exploration of research teams in digital humanities environments.” Lit Linguist Computing (2009): fqp009. 14 Apr 2009.

Is Wikipedia Becoming a Respectable Academic Source?

Last year a colleague in the English department described a conversation in which a friend revealed a dirty little secret: “I use Wikipedia all the time for my research—but I certainly wouldn’t cite it.”  This got me wondering: How many humanities and social sciences researchers are discussing, using, and citing Wikipedia?  To find out, I searched Project Muse and JSTOR, leading electronic journal collections for the humanities and social sciences, for the term “wikipedia,” which picked up both references to Wikipedia and citations of the wikipedia URL.  I retrieved 167 results from between 2002 and 2008, all but 8 of which came from Project Muse.  (JSTOR covers more journals and a wider range of disciplines but does not provide access to issues published in the last 3-5 years.)  In contrast, Project Muse lists 149 results in a search for “Encyclopedia Britannica” between 2002 and 2008, and JSTOR lists 3.  I found that citations of Wikipedia have been increasing steadily: from 1 in 2002 (not surprisingly, by Yochai Benkler) to 17 in 2005 to 56 in 2007. So far Wikipedia has been cited 52 times in 2008, and it’s only August.

Along with the increasing number of citations, another indicator that Wikipedia may be gaining respectability is its citation by well-known scholars.  Indeed, several scholars both cite Wikipedia and are themselves subjects of Wikipedia entries, including Gayatri Spivak, Yochai Benkler, Hal Varian, Henry Jenkins, Jerome McGann, Lawrence Buell, and Donna Haraway.

111 of the sources (66.5%) are what I call “straight citations,” that is, citations of Wikipedia without commentary about it, while 56 (33.5%) comment on Wikipedia as a source, either positively or negatively.  14.5% of the total citations come from literary studies, 14% from cultural studies, 11.4% from history, and 6.6% from law. Researchers cite Wikipedia on a diversity of topics, ranging from the military-industrial complex to horror films to Bush’s second state of the union speech.  8 use Wikipedia simply as a source for images (such as an advertisement for Yummy Mummy cereal or a diagram of the architecture of the Internet).  Many employ Wikipedia either as a source for information about contemporary culture or as a reflection of contemporary cultural opinion.  For instance, to illustrate how novels such as The Scarlet Letter and Uncle Tom’s Cabin have been sanctified as “Great American Novels,” Lawrence Buell cites the Wikipedia entry on “Great American Novel” (Buell).
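As a quick consistency check on this breakdown, the following Python sketch confirms that the two groups account for all 167 retrieved articles and recomputes each group's share. The labels and the `percent` helper are hypothetical names chosen for illustration; only the counts come from the search described above.

```python
TOTAL_ARTICLES = 167  # results retrieved from Project Muse and JSTOR

# Counts reported in the post; the labels are illustrative.
treatment = {
    "straight citation": 111,
    "commentary on Wikipedia": 56,
}

# The two groups should account for every retrieved article.
assert sum(treatment.values()) == TOTAL_ARTICLES


def percent(count: int, total: int = TOTAL_ARTICLES) -> float:
    """Return count/total as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


treatment_share = {label: percent(n) for label, n in treatment.items()}
for label, pct in treatment_share.items():
    print(f"{label}: {pct}%")
```

Because the shares are derived from counts that sum to the total, they necessarily add up to 100% (allowing for rounding), which makes a mistyped percentage easy to spot.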

About a third of the articles I looked at discuss the significance of Wikipedia itself.  14 (8%) criticize using it in research.  For instance, a reviewer of a biography about Robert E. Lee tsk-tsks:

The only curiosities are several references to Wikipedia for information that could (and should) have been easily obtained elsewhere (battle casualties, for example). Hopefully this does not portend a trend toward normalizing this unreliable source, the very thing Pryor decries in others’ work. (Margolies).

In contrast, 11 (6.6%) cite Wikipedia as a model for participatory culture.  For example:

The rise of the net offers a solution to the major impediment in the growth and complexification of the gift economy, that network of relationships where people come together to pursue public values. Wikipedia is one example. (DiZerega)

A few (1.8%) cite Wikipedia self-consciously, aware of its limitations but asserting its relevance for their particular project:

Citing Wikipedia is always dicey, but it is possible to cite a specific version of an entry. Start with the link here, because cybervandals have deleted the list on at least one occasion. For a reputable “permanent version” of “Alternative press (U.S. political right)” see: (Berlet).

Of course, just because more researchers—including some prominent ones—are citing Wikipedia does not mean it’s necessarily a valid source for academic papers.  However, you can begin to see academic norms shifting as more scholars find useful information in Wikipedia and begin to cite it.  As Christine Borgman notes, “Scholarly documents achieve trustworthiness through a social process to assure readers that the document satisfies the quality norms of the field” (Borgman 84).  As a possible sign of academic norms changing in some disciplines, several journals, particularly those focused on contemporary culture, include 3 or more articles that reference Wikipedia: Advertising and Society Review (7 citations), American Quarterly (3 citations), College Literature (3 citations), Computer Music Journal (5 citations), Indiana Journal of Global Legal Studies (3 citations), Leonardo (8 citations), Library Trends (5 citations), Mediterranean Quarterly (3 citations), and Technology and Culture (3 citations).

So can Wikipedia be a reputable scholarly resource?  I typically see four main criticisms of Wikipedia:

1) Research projects shouldn’t rely upon encyclopedias. Even Jimmy Wales, (co?-)founder of Wikipedia, acknowledges “I still would say that an encyclopedia is just not the kind of thing you would reference as a source in an academic paper. Particularly not an encyclopedia that could change instantly and not have a final vetting process” (Young).  But an encyclopedia can be a valid starting point for research.  Indeed, The Craft of Research, a classic guide to research, advises that researchers consult reference works such as encyclopedias to gain general knowledge about a topic and discover related works (80).  Wikipedia covers topics often left out of traditional reference works, such as contemporary culture and technology.  Most if not all of the works I looked at used Wikipedia to offer a particular piece of background information, not as a foundation for their argument.

2) Since Wikipedia is constantly undergoing revisions, it is too unstable to cite; what you read and verified today might be gone tomorrow–or even in an hour.  True, but Wikipedia is developing the ability for a particular version of an entry to be vetted by experts and then frozen, so researchers could cite an authoritative, unchanging version (Young).  As the above citation from Berlet indicates, you can already provide a link to a specific version of an article.

3) You can’t trust Wikipedia because anyone, including folks with no expertise, strong biases, or malicious (or silly) intent, can contribute to it anonymously.  Yes, but through the back and forth between “passionate amateurs,” experts, and Wikipedia guardians protecting against vandals, good stuff often emerges. As Nicholson Baker, who has himself edited Wikipedia articles on topics such as Brooklyn Heights and the painter Emma Fordyce MacRae, notes in a delightful essay about Wikipedia, “Wikipedia was the point of convergence for the self-taught and the expensively educated. The cranks had to consort with the mainstreamers and hash it all out” (Baker).  As Roy Rosenzweig found in a detailed analysis of Wikipedia’s appropriateness for historical research, the quality of the collaboratively produced Wikipedia entries can be uneven: certain topics are covered in greater detail than others, and the writing can have the choppy, flat quality of something composed by committee.  But Rosenzweig also concluded that Wikipedia compares favorably with Encarta and Encyclopedia Britannica for accuracy and coverage.

4) Wikipedia entries lack authority because there’s no peer review. Well, depends on how you define “peer review.”  Granted, Wikipedia articles aren’t reviewed by two or three (typically anonymous) experts in the field, so they may lack the scholarly authority of an article published in an academic journal.  However, articles in Wikipedia can be reviewed and corrected by the entire community, including experts, knowledgeable amateurs, and others devoted to Wikipedia’s mission to develop, collect and disseminate educational content (as well as by vandals and fools, I’ll acknowledge).  Wikipedia entries aim to achieve what Wikipedians call “verifiability”; the article about Barack Obama, for instance, has as many footnotes as a law review article–171 at last count (August 31), including several from this week.
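For anyone who wants to cite a fixed version of an entry, as discussed under criticism 2, here is a minimal Python sketch of how such a link could be built. It assumes MediaWiki's permanent-link URL form (`index.php?title=...&oldid=...`); the function name `wikipedia_permalink` and any revision ID you pass in are illustrative assumptions, not data from my study.

```python
from urllib.parse import urlencode


def wikipedia_permalink(title: str, revision_id: int, lang: str = "en") -> str:
    """Build a link to one fixed revision of a Wikipedia article.

    `revision_id` is the number shown as `oldid` in the URL of an
    article's "Permanent link" or of any entry on its history page.
    """
    if revision_id <= 0:
        raise ValueError("revision_id must be a positive integer")
    # MediaWiki uses underscores in place of spaces in page titles.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": revision_id})
    return f"https://{lang}.wikipedia.org/w/index.php?{query}"
```

Calling `wikipedia_permalink("Great American Novel", 123456)` with a made-up revision ID yields a URL that always points to that one revision, however the live article changes afterward.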

Now I’m certainly not saying that Wikipedia is always a good source for an academic work–there is some dreck in it, as in other sources.  Ultimately, I think Wikipedia’s appropriateness as an academic source depends on what is being cited and for what purpose.   Alan Liu offers students a sensible set of guidelines for the appropriate use of Wikipedia, noting that it, like other encyclopedias, can be a good starting point, but that it is “currently an uneven resource” and always in flux.  Instead of condemning Wikipedia outright, professors should help students develop what Henry Jenkins calls “new media literacies.”  By examining the history and discussion pages associated with each article, for instance, students can gain insight into how knowledge is created and how to evaluate a source.  As John Seely Brown and Richard Adler write:

The openness of Wikipedia is instructive in another way: by clicking on tabs that appear on every page, a user can easily review the history of any article as well as contributors’ ongoing discussion of and sometimes fierce debates around its content, which offer useful insights into the practices and standards of the community that is responsible for creating that entry in Wikipedia. (In some cases, Wikipedia articles start with initial contributions by passionate amateurs, followed by contributions from professional scholars/researchers who weigh in on the “final” versions. Here is where the contested part of the material becomes most usefully evident.) In this open environment, both the content and the process by which it is created are equally visible, thereby enabling a new kind of critical reading—almost a new form of literacy—that invites the reader to join in the consideration of what information is reliable and/or important.(Brown & Adler)

OK, maybe Wikipedia can be a legitimate source for student research papers–and furnish a way to teach research skills.  But should it be cited in scholarly publications?  In “A Note on Wikipedia as a Scholarly Source of Record,” part of the preface to Mechanisms, Matt Kirschenbaum offers a compelling explanation of why he cited Wikipedia, particularly when discussing technical documentation:

Information technology is among the most reliable content domains on Wikipedia, given the high interest of such topics [among] Wikipedia’s readership and the consequent scrutiny they tend to attract.   Moreover, the ability to examine page histories on Wikipedia allows a user to recover the editorial record of a particular entry… Attention to these editorial histories can help users exercise sound judgment as to whether or not the information before them at any given moment is controversial, and I have availed myself of that functionality when deciding whether or not to rely on Wikipedia. (Kirschenbaum xvii)

With Wikipedia, as with other sources, scholars should use critical judgment in analyzing its reliability and appropriateness for citation.  If scholars carefully evaluate a Wikipedia article’s accuracy, I don’t think there should be any shame in citing it.

For more information, review the Zotero report detailing all of the works citing Wikipedia, or take a look at a spreadsheet of basic bibliographic information. I’d be happy to share my bibliographic data with anyone who is interested.

Works Cited

Baker, Nicholson. “The Charms of Wikipedia.” The New York Review of Books 55.4 (2008). 30 Aug 2008.

Berlet, Chip. “The Write Stuff: U. S. Serial Print Culture from Conservatives out to Neonazis.” Library Trends 56.3 (2008): 570-600. 24 Aug 2008.

Booth, Wayne C., Gregory G. Colomb, and Joseph M. Williams. The Craft of Research. Chicago: U of Chicago P, 2003.

Borgman, Christine L. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, Mass.: MIT Press, 2007.

Brown, John Seely, and Richard P. Adler. “Minds on Fire: Open Education, the Long Tail, and Learning 2.0.” EDUCAUSE Review 43.1 (2008): 16-32. 29 Aug 2008.

Buell, Lawrence. “The Unkillable Dream of the Great American Novel: Moby-Dick as Test Case.” American Literary History 20.1 (2008): 132-155. 24 Aug 2008.

Dee, Jonathan. “All the News That’s Fit to Print Out.” The New York Times 1 Jul 2007. 30 Aug 2008.

DiZerega, Gus. “Civil Society, Philanthropy, and Institutions of Care.” The Good Society 15.1 (2006): 43-50. 24 Aug 2008.

Jenkins, Henry. “What Wikipedia Can Teach Us About the New Media Literacies (Part One).” Confessions of an Aca/Fan 26 Jun 2007. 30 Aug 2008.

Kirschenbaum, Matthew G. Mechanisms: New Media and the Forensic Imagination. Cambridge, Mass.: MIT Press, 2008.

Liu, Alan. “Student Wikipedia Use Policy.” 1 Apr 2007. 30 Aug 2008.

Margolies, Daniel S. “Robert E. Lee: Heroic, But Not the Polio Vaccine.” Reviews in American History 35.3 (2007): 385-392. 25 Aug 2008.

Rosenzweig, Roy. “Can History be Open Source? Wikipedia and the Future of the Past.” The Journal of American History 93.1 (June 2006): 117-46.

Young, Jeffrey. “Wikipedia’s Co-Founder Wants to Make It More Useful to Academe.” Chronicle of Higher Education 13 Jun 2008. 28 Aug 2008.

Doing Digital Scholarship: Presentation at Digital Humanities 2008

Note:  Here is roughly what I said during my presentation at Digital Humanities 2008 in Oulu, Finland (or at least meant to say—I was so sleep deprived thanks to the unceasing sunshine that I’m not sure what I actually did say).  My session, which explored the meaning and significance of “digital humanities,” also featured rich, engaging presentations by Edward Vanhoutte on the history of humanities computing and John Walsh on comparing alchemy and digital humanities.  My presentation reports on my project to remix my dissertation as a work of digital scholarship and synthesizes many of my earlier blog posts to offer a sort of Reader’s Digest condensed version of my blog for the past 7 months. By the way, sorry that I’ve been away from the blog for so long.  I’ve spent the last month and a half researching and writing a 100 page report on archival management software,  reviewing essays, performing various other professional duties, and going on both a family vacation to San Antonio and a grown-up vacation to Portland, OR (vegan meals followed up by Cap’n Crunch donuts.  It took me a week to recover from the donut hangover).  In the meantime, lots of ideas have been brewing, so expect many new blog entries soon.


When I began working on my dissertation in the mid 1990s, I used a computer primarily to do word processing—and goof off with Tetris.  Although I used digital collections such as Early American Fiction and Making of America for my dissertation project on bachelorhood in 19th C American literature, I did much of my research the old fashioned way: flipping through the yellowing pages of 19th century periodicals on the hunt for references to bachelors, taking notes using my lucky leaky fountain pen.  I relied on books for my research and, in the end, produced a book.

At the same time that I was dissertating, I was also becoming enthralled by the potential of digital scholarship through my work at the University of Virginia’s (late lamented) Electronic Text Center.  I produced an electronic edition of the first section from Donald Grant Mitchell’s bestseller Reveries of a Bachelor that allowed readers to toggle between variants.   I even convinced my department to count Perl as a second language, citing the Matt Kirschenbaum precedent (“come on, you let Matt do it, and look how well that turned out”) and the value of computer languages to my profession as a budding digital humanist.  However, I decided not to create an electronic version of my dissertation (beyond a carefully backed-up Word file) or to use computational methods in doing my research, since I wanted to finish the darn thing before I reached retirement age.

Last year, five years after I received my PhD and seven years after I had become the director of Rice University’s Digital Media Center, I was pondering the potential of digital humanities, especially given mass digitization projects and the emergence of tools such as TAPOR and Zotero.  I wondered: What is digital scholarship, anyway?  What does it take to produce digital scholarship? What kind of digital resources and tools are available to support it? To what extent do these resources and tools enable us to do research more productively and creatively? What new questions do these tools and resources enable us to ask? What’s challenging about producing digital scholarship? What happens when scholars share research openly through blogs, institutional repositories, & other means?

I decided to investigate these questions by remixing my 2002 dissertation as a work of digital scholarship.  Now I’ll acknowledge that my study is not exactly scientific—there is a rather subjective sample of one.  However, I figured, somewhat pragmatically, that the best way for me to understand what digital scholars face was to do the work myself.  I set some loose guidelines: I would rely on digital collections as much as possible and would experiment with tools for analyzing, annotating, organizing, comparing and visualizing digital information.  I would also explore different ways of representing my ideas, such as hypertextual essays and videos.  Embracing social scholarship, I would do my best to share my work openly and make my research process transparent.  So that the project would be fun and evolve organically, I decided to follow my curiosity wherever it led me, imagining that I would end up with a series of essays on bachelorhood in 19th century American culture and, as sort of an exoskeleton, meta-reflections on the process of producing digital scholarship.

My first challenge was defining digital scholarship.  The ACLS Commission on Cyberinfrastructure’s report points to five manifestations of digital scholarship: collection building, tools to support collection building, tools to support analysis, using tools and collections to produce “new intellectual products,” and authoring tools.   Some might argue we shouldn’t really count tool and collection building as scholarship.  I’ll engage with this question in more detail in a future post, but for now let me say that most consider critical editions, bibliographies, dictionaries and collations, arguably the collections and tools of the pre-digital era, to be scholarship.  In many cases, building academic tools and collections requires significant research and expertise and results in the creation of knowledge—so, scholarship.   Still, my primary focus is on the fourth aspect, leveraging digital resources and tools to produce new arguments.  I’m realizing along the way, though, that I may need to build my own personal collections and develop my own analytical tools to do the kind of scholarship I want to do.

In a recent presentation at CNI, Tara McPherson, the editor of Vectors, offered her own “Typology of Digital Humanities”:
•    The Computing Humanities: focused on building tools, infrastructure, standards and collections, e.g. The Blake Archive
•    The Blogging Humanities: networked, peer-to-peer, e.g. Crooked Timber
•    The Multimodal Humanities: “bring together databases, scholarly tools, networked writing, and peer-to-peer commentary while also leveraging the potential of the visual and aural media that so dominate contemporary life,” e.g. Vectors

Mashing up these two frameworks, my own typology would look something like this:

•    Tools, e.g. TAPOR, Zotero
•    Collections, e.g. The Blake Archive
•    Theories, e.g. McGann’s Radiant Textuality
•    Interpretations and arguments that leverage digital collections and tools, e.g. Ayers and Thomas’ The Difference Slavery Made
•    Networked Scholarship: a term that I borrow from the Institute for the Future of the Book’s Bob Stein and that I prefer to “blogging humanities,” since it encompasses many modes of communication, such as wikis, social bookmarking, institutional repositories, etc. Examples include Savage Minds (a group blog in anthropology), etc.
•    Multimodal scholarship: scholarly hypertexts and videos, e.g. what you might find in Vectors
•    Digital cultural studies, e.g. game studies, Lev Manovich’s work, etc. (this category overlaps with theories)

Initially I assumed that tools, theories and collections would feed into arguments that would be expressed as networked and/or multimodal scholarship and be informed by digital cultural studies.  But I think that describing digital scholarship as a sort of assembly line in which scholars use tools, collections and theories to produce arguments oversimplifies the process.  My initial diagram of digital scholarship pictured single-headed arrows linking different approaches to digital scholarship; my revised diagram looks more like spaghetti, with arrows going all over the place.  Theories inform collection building; the process of blogging helps to shape an argument; how a scholar wants to communicate an idea influences what tools are selected and how they are used.

After coming up with a preliminary definition of what I wanted to do, I needed to figure out how to structure my work.  I thought of John Unsworth’s notion of scholarly primitives, a compelling description of core research practices.  Depending on how you count them, Unsworth identifies 7 scholarly primitives:
•    Discovering
•    Annotating
•    Comparing
•    Referring
•    Sampling
•    Illustrating
•    Representing

As useful as this list is in crystallizing what scholars do, I think the list is missing at least one more crucial scholarly primitive, perhaps the fundamental one: collaboration. Although humanists are stereotyped as solitary scholars isolated in the library, they often work together, whether through co-editing journals or books, sharing citations, or reviewing one another’s work.  In the digital humanities, of course, developing tools, standards, and collections demands collaboration among scholars, librarians, programmers, etc.  I would also define networked scholarship—blogging, contributing to wikis, etc—as collaborative, since it requires openly sharing ideas and supports conversation. It’s only appropriate for me to note that this idea was worked out collaboratively, with colleagues at THAT Camp.

I want to make my research process as visible as possible, not only for idealistic reasons, but also because my work only gets better the more feedback I receive.  So I started up a blog—actually, several of them. At the somewhat grandly-named Digital Scholarship in the Humanities, I reflect on trends in the digital humanities and on broader lessons learned in the process of doing my research project.  In “Lisa Spiro’s Research Notes,”  I typically address stuff that seems too specialized, half-baked, or even raw for me to put on my main blog, such as my navel gazing on where to take my project next, or my experiments with Open Wound, a language re-mixing tool.   At my PageFlakes research portal, I provide a single portal to the various parts of my research project, offering RSS feeds for both of my blogs as well as for a Google News search of the term “digital humanities,” my delicious bookmarks for “digital scholarship,” links to my various digital humanities projects, and more.

I’ll admit that when I started my experiments with social scholarship I worried that no one would care, or that I would embarrass myself by writing something really stupid, but so far I’ve loved the experience.  Through comments and emails from readers, I’m able to see other perspectives and improve my own thinking.  I’ve heard from biologists and anthropologists as well as literary scholars and historians, and I’ve communicated with researchers from several countries.  As a result, I feel more engaged in the research community and more motivated to keep working.   Although I know blogging hasn’t caught on in every corner of academia, I think it has been good for my career as a digital humanist.  I am more visible and thus have more opportunities to participate in the community, such as by reviewing book proposals, articles, and grant applications.

I don’t have space to discuss the relevance of each scholarly primitive to my project, but I did want to mention a few of them: discovering, comparing, and representing.


In order to use text analysis and other tools, I needed my research materials to be in an electronic format.  In the age of mass digitization projects such as Google Books and the Open Content Alliance, I wondered how many of my 296 original research sources are digitized & available in full text.  So I diligently searched Google Books and several other sources to find out.  I looked at 5 categories: archival resources as well as primary and secondary books and journals.   I found that with the exception of archival materials, over 90% of the materials I cited in my bibliography are in a digital format.  However, only about 83% of primary resources and 37% of the secondary materials are available as full text.  If you want to use text analysis tools on 19th century American novels or 20th century articles from major humanities journals, you’re in luck, but the other stuff is trickier because of copyright constraints.  (I’ll throw in another scholarly primitive, annotation, and say that I use Zotero to manage and annotate my research collections, which has made me much more efficient and allowed me to see patterns in my research collections.)

Of course, scholars need to be able to trust the authority of electronic resources.  To evaluate quality, I focused on four collections that have a lot of content in my field, 19th century American literature: Google Books, Open Content Alliance, Early American Fiction (a commercial database developed by UVA’s Electronic Text Center), and Making of America.  I found that there were some scanning errors with Google Books, but not as many as I expected. I wished that Google Books provided full text rather than PDF files of its public domain content, as do Open Content Alliance and Making of America (and EAF, if you just download the HTML).  I had to convert Google’s PDF files to Adobe Tagged Text XML and got disappointing results.  The OCR quality for Open Content Alliance was better, but words were not joined across line breaks, reducing accuracy.  With multi-volume works, neither Open Content Alliance nor Google Books provided very good metadata.  Still, I’m enough of a pragmatist to think that having access to this kind of data will enable us to conduct research across a much wider range of materials and use sophisticated tools to discern patterns – we just need to be aware of the limitations.

To evaluate the power of text analysis tools for my project, I did some experiments using TAPOR tools, including a comparison of two of my key bachelor texts: Mitchell’s Reveries of a Bachelor, a series of a bachelor’s sentimental dreams (sometimes nightmares) about what it would be like to be married, and Melville’s Pierre, which mixes together elements of sentimental fiction, Gothic literature, and spiritualist tracts to produce a bitter satire.   I wondered if there was a family resemblance between these texts.  First I used the Wordle word cloud generator to reveal the most frequently appearing words.  I noted some significant overlap, including words associated with family such as mother and father, those linked with the body such as hand and eye, and those associated with temporality, such as morning, night, and time.  To develop a more precise understanding of how frequently terms appeared in the two texts and their relation to each other, I used TAPOR’s Comparator tool.  This tool also revealed words unique to each work, such as “flirt” and “sensibility” in the case of Reveries, “ambiguities” and “miserable” in the case of Pierre.  Finally, I used TAPOR’s concordance tool to view key terms in context.  I found, for instance, that in Mitchell “mother” is often associated with hands or heart, while in Melville it appears with terms indicating anxiety or deceit.  By abstracting out frequently occurring and unique words, I can see how Melville, in a sense, remixes elements of sentimental fiction, putting terms in a darker context.  The text analysis tools provide a powerful stimulus to interpretation.
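For readers curious about what this kind of comparison involves under the hood, here is a rough Python sketch of the three steps: counting frequent words, separating shared from unique vocabulary, and viewing a keyword in context. This is not how TAPOR or Wordle are implemented; the function names, the tiny stopword list, and the tokenizing rule are all my own simplifying assumptions.

```python
import re
from collections import Counter

# A deliberately tiny, illustrative stopword list (an assumption, not TAPOR's).
STOPWORDS = {"the", "and", "a", "of", "his", "her", "was", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(text, stopwords=STOPWORDS):
    """Count how often each non-stopword appears."""
    return Counter(tok for tok in tokenize(text) if tok not in stopwords)


def compare_vocabulary(text_a, text_b):
    """Return (shared words, words only in A, words only in B), each sorted."""
    a, b = set(word_counts(text_a)), set(word_counts(text_b))
    return sorted(a & b), sorted(a - b), sorted(b - a)


def concordance(text, keyword, width=3):
    """List each occurrence of `keyword` with `width` words of context per side."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [tok.upper()] + right))
    return lines
```

Given the full text of each book as a string, `word_counts(reveries).most_common(20)` would approximate the word cloud, `compare_vocabulary(reveries, pierre)` the shared and unique terms, and `concordance(pierre, "mother")` the keyword-in-context view (the variable names `reveries` and `pierre` are placeholders for text you would load yourself).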

Not only am I using the computer to analyze information, but also to represent my ideas in a more media-rich, interactive way than the typical print article.  I plan to experiment with Sophie as a tool for authoring multimodal scholarship, and I’m also experimenting with video as a means for representing visual information. Right now I’m reworking an article on the publication history of Reveries of a Bachelor as a video so that I show significant visual information such as bindings, illustrations, and advertisements.    I’ve condensed a 20+ page article into a 7 minute narrative, which for a prolix person like me is rough.  I also have been challenged to think visually and cinematically, considering how the movement of the camera and the style of transitions shape the argument.  Getting the right imagery—high quality, copyright free—has been tricky as well.  I’m not sure how to bring scholarly practices such as citation into videos.  Even though my draft video is, frankly, a little amateurish, putting it together has been lots of fun, and I see real potential for video to allow us to go beyond text and bring the human voice, music, movement and rich imagery into scholarly communication.

On Tools
In the course of my experiments in digital scholarship, I often found myself searching for the right tool to perform a certain task.  Likewise, in my conversations with researchers who aren’t necessarily interested in doing digital scholarship, just in doing their research better, I learned that they weren’t aware of digital tools and didn’t know where to find out about them.  To make it easier for researchers to discover relevant tools, I teamed up with 5 other librarians to launch the Digital Research Tools, or DiRT, wiki at the end of May.   DiRT provides a directory of digital research tools, primarily free but also commercial, categorized by their functions, such as “manage citations.”  We are also writing reviews of tools geared toward researchers and trying to provide examples of how these tools are used by the research community.  Indeed, DiRT focuses on the needs of the community; the wiki evolves thanks to its contributors.   Currently 14 people in fields such as anthropology, communications, and educational technology have signed on to be contributors.  Everything is under a Creative Commons attribution license.  We would love to see spin-offs, such as DiRT in languages besides English; DiRT for developers; and Old DiRT (dust?), the hall of obsolete but still compelling tools.  My experiences with DiRT have demonstrated again the beauty of collaboration and sharing.  Both Dan Cohen of CHNM & Alan Liu of UC Santa Barbara generously offered to let us grab content from their own tools directories.  Busy folks have freely given their time to add tools to DiRT.  Through my work on DiRT, I’ve learned about tools outside of my field, such as qualitative data analysis software.

So I’ll end with an invitation: Please contribute to DiRT.  You can sign up to be an editor or reviewer, recommend tools to be added, or provide feedback via our survey.  Through efforts like DiRT, we hope to enable new digital scholarship, raise the profile of inventive digital tools, and build community.

THAT Camp Takeaways

My work has been so all-consuming lately that it feels like THAT Camp was months rather than a couple of weeks ago, but I wanted to offer a few observations about THAT Camp before they go completely stale. Like many others, I found THAT Camp much more satisfying than the typical academic conference, since it promoted a strong sense of community (in part by using technologies such as pre-conference blogging and Twitter), was organized around the interests of participants, and encouraged the open exchange of ideas. Academic conferences typically have three functions: 1) to disseminate new ideas; 2) to bring people together to explore those ideas (and share a few beers in the process); and 3) to provide a line on the CV certifying that a scholar is actually making contributions to the research community. THAT Camp excelled at fulfilling the first two functions, and I’m hopeful that search committees and tenure committees (at least in certain communities) will see THAT Camp on a CV and think, “Wow, this person is an innovator!” Besides, the ideas generated and collaborations formed at THAT Camp will likely lead to more lines (academic merit badges?) on CVs.

I don’t have the time—and the reader probably doesn’t have the patience—to describe everything I learned at THAT Camp, but I wanted to highlight a few of the most intriguing projects or compelling ideas.

1) It’s the people, stupid.

I helped to organize a session on emerging research methods and expected that we would focus on how technologies such as visualization and text mining are opening up new approaches to scholarly inquiry. Instead, we spent most of our time engaged in a fruitful discussion about the importance—and difficulty—of collaboration, positing it as the “scholarly primitive” missing from John Unsworth’s list of core research activities. Perhaps the defining statement of the session was one person’s observation that “the cyberinfrastructure is people.” As THAT Camp itself demonstrated, collaboration enables people to develop better ideas, share the workload, sustain projects, and ultimately have a greater impact in the field, but encouraging people to share requires changes in culture and incentive systems.

2) New tools are enabling people to share annotations, resources, and work.

If collaboration is a key research process, there are some really cool tools under development that will support it. For instance, Ben Brumfield demonstrated FromThePage, a tool that allows people (historians, genealogists, history buffs) to transcribe documents, zoom in on manuscript pages, collaborate with others to identify tasks and check their work, view subjects, and more. Travis Brown is working on eComma, which “will enable groups of students, scholars, or general readers to build collaborative commentaries on a text and to search, display, and share those commentaries online.” And then there’s Zotero 2.0, which will let researchers share their collections with others.

3) Through visualization tools, researchers can make sense of a vast amount of information.

For instance, Jeanne Kramer-Smyth demonstrated ArchivesZ, which enables users of archives to visualize how much material (e.g., how many linear feet) is available in an archive related to a particular topic.

4) GIS technologies offer real analytical power, showing changes across time and space, land ownership patterns, and much more.

In a rich session on GIS tools, Josh Greenberg demonstrated how an historical map of New York could be overlaid on a contemporary Google Map, enabling one to view the development of the city. Mikel Maron discussed Open Street Map, a free and open map of the world to which people regularly contribute data. And I was delighted to learn from Shekhar Krishnan that Zotero will be releasing a mapping plug-in that will allow you to view the publication location of works in a collection on a Google Map. I had planned to create my own Google Map showing where bachelor literature was published by extracting the necessary data from Zotero, but, hooray, now I don’t have to go through the extra work. (See for more cool GIS projects).

Ways that digital resources can transform teaching and research, grand and small

While trying to determine how many articles in JSTOR and Project Muse cite Making of America (MOA), I stumbled across several articles that describe how databases such as MOA are beginning to transform humanities research. (Funny–when I look for this kind of evidence, I don’t find it, but when I’m not looking, there it is.) Most of the essays focus on how online collections enrich research by making available works that would otherwise be difficult to locate, but in one a social historian imagines large, collaborative projects in which information technology plays a crucial role.

According to Sandra Roff, researchers are discovering sources that they otherwise would not have found because they can run full-text searches on databases such as Making of America and American Periodical Series Online, 1740–1900. Describing her research into the history of the Free Academy, the precursor to the City University of New York, Roff writes:

The standard histories published before the development of the internet now prove to be incomplete since new information is easily retrieved from periodical literature using the new technology. These periodicals can provide a picture of all aspects of life during a particular time period of history, which adds a new dimension to previously static historical facts. Since there are a limited number of indexes available for the greater part of the nineteenth century, research has usually been restricted to periodical sources close to the subject locale or else to periodicals in a particular subject area. Going beyond these parameters often would yield few results and would be considerably time consuming. However, by using these databases, we discovered that news of the Free Academy was not local but had indeed spread around the country. Without the limitations of subject, author and title searching, which were the only way that historical indexes such as Poole’s or any of the indexed New York City newspapers could be searched prior to online databases, articles can now be retrieved using keyword searches. These Boolean searches can reveal mentions of subjects embedded in articles that might earlier had proven elusive even if the periodicals were searched.

Similarly, Charles LaPorte argues that databases such as MOA are making it possible to study “obscure” ideas buried in Victorian periodicals:

What is new and exciting is our increasing access to formerly obscure Victorian ideas through online databases. The study of Victorian periodicals is flourishing today in part because Victorian print culture has never been more accessible, given indices like the Nineteenth Century Masterfile, and sites that reproduce Victorian journals like Chadwyck-Healey’s “Periodicals Contents Index” (PCI), the jointly-produced (and free) “Internet Library of Early Journals” (ILEJ) of the Universities of Birmingham, Leeds, Manchester, and Oxford, and the “Making of America” (MOA) database of Cornell and the University of Michigan. The growth of these and similar resources provides us not only more access to obscure poetry, but also to the print environment of known works, and to Victorian discussions of them.

Cynthia Patterson describes online access as a “bane and boon”: she used the web extensively to locate materials for her study of Philadelphia pictorial magazines, but worried that digitization would make her own research less unique and innovative, since everyone would now have access to the same materials she had so diligently pursued:

Like most scholars, I was finding the World Wide Web an unbelievably rich source for access to networking and research. About that time, I discovered the Research Society for American Periodicals, the Making of America collection at Cornell and Michigan, and the few issues of Godey’s available online. I also discovered Periodyssey, the rare book dealer in New York City, and quietly began buying up bound volumes, first of the Union, then of Graham’s, Godey’s and Peterson’s. I also took coursework through George Mason University’s Center for History and New Media. While I was fascinated with the work they were doing, digital access became a source of dread: I lived in fear that someone else would suddenly digitize the magazines in my study before I could finish my project!

To encourage students to conduct original research, teachers are promoting MOA and other databases that provide access to primary source materials. Christopher Hanlon laments the difficulty of getting students to do serious literary scholarship and explains how requiring them to use online databases such as Making of America for their research led them to produce more interesting, original work. For instance, one of his students drew on magazine articles from MOA to show how the Swede in Crane’s “Blue Hotel” reflects late 19th C anxiety about Swedish immigration to the US.

By urging my students to use OCR databases to do historical research on literary texts, I was asking them to view the texts on our syllabus in Hayden White’s (1978: 81) sense of a ‘literary artifact,’ but more than that, I was urging them to take charge of their own experience of literature and hence the experience they were asking their readers to share in. Although students still don’t possess a deep sense of history, using online archives can empower students to do something we always ask of them but hardly ever equip them to accomplish: devise their own way into a text, and a way in about which we are, finally, interested.

As these comments suggest, it seems that researchers currently most value digital collections for providing enhanced access to a broader range of materials; my colleague Jane Segal and I reached a similar conclusion in our survey of humanities scholars last year. Through enhanced access, both the depth and breadth of research can be improved, as researchers uncover sources that would be otherwise difficult to discover and can quickly search a wide range of materials. Perhaps in the next five or ten years, researchers will also be saying that how they fundamentally do research and what kinds of questions they can pose have also changed, as projects such as MONK, NINES, etc. provide sophisticated tools for working with digital information and online environments for collaboration, publication, etc. (Or maybe they’re saying this already and I haven’t stumbled across those sources yet.)

In developing digital tools and methods, we should consider how they can help scholars tackle particular research challenges. Calling for historians to undertake “big,” collaborative social science research projects, Richard Steckel suggests that “large-scale archives” and “systematic information collection” can enable researchers to pursue ambitious projects, such as studying climate history, creating an international catalog of films and photographs, digitizing the notes of prominent historians, and creating a database of crime reports from 1800 to the present. He also proposes that historians digitize large collections of diaries and letters, citing MOA, Valley of the Shadow, and the Evans Early American Imprint Collection as examples of successful digitization projects. Although Steckel doesn’t use the term “digital scholarship,” he makes the case for research that requires collaboration, draws on large databases, uses computer-based tools such as GIS and statistical applications, and engages historians in producing documentaries and databases–which sure sounds like digital scholarship to me.

What qualifies as a “grand challenge” in the humanities? Such a question seems to drive initiatives to develop digital scholarship in the humanities. According to the report of the ACLS Commission on Cyberinfrastructure for the Humanities and Social Sciences, building the cyberinfrastructure is itself the humanities’ grand challenge. The AHRC e-Science Scoping Study acknowledges the difficulty of describing specific grand challenges, but points to a few possibilities: developing tools for researchers that facilitate “annotating, collating, visualising and simulating the digital content created and used within their research,” as well as “new collaborative tools and virtual collaborative environments.” Steckel’s climate history idea particularly resonates with me, freaked out as I am about climate change, but other ambitious collaborative projects spring to mind: initiatives that aim to make the humanities more global and interdisciplinary (such as Mappamundi), major GIS projects (such as Africa Map), open access data archives (such as OpenContext), etc. Given the NEH’s recently-announced high-performance computing initiative, I also wonder about the possibilities of using supercomputers to conduct complex queries across massive collections of texts, construct 3D models of cultural heritage sites, run simulations of both historical and literary events, etc.

While I’m on the subject of grand challenges and big projects, in a compelling article in the most recent Literary & Linguistic Computing, Patrick Juola argues for “Killer Applications in Digital Humanities,” which he defines as “a solution sufficiently interesting to, by itself, retrospectively justify looking [at?] the problem it solves—a Great Problem that can both empower and inspire.” Juola suggests that to make digital humanities more relevant to the broader humanities community, it should develop tools that serve “the needs of mainstream humanities scholars.” As examples of potential “killer apps,” Juola describes tools that would enable humanities scholars to automatically create back-of-the-book indices, annotate works, and discover and explore resources.

Amen. I am excited by the potential of big projects and killer apps to open up new discoveries and methods, build knowledge, serve the social good, etc. However, I hope we don’t lose sight of the contributions that small, focused projects can make as well. As an example of the mismatch between scholars’ needs and the tools developed by digital humanities folks, Juola points to an electronic scholarly edition of Clotel, which allows readers to compare passages and track changes. According to Juola, “it is not clear who among Clotel scholars will be interested in using this capacity or this edition,” and the annotation capabilities cannot be applied to other texts. But I think such a comment may reflect an all-too-common underappreciation of textual scholarship. Since Clotel exists in 4 versions, being able to compare passages is of real benefit to researchers. It’s not as if this project was created without consulting scholars; indeed, the editor is a distinguished scholar of African-American literature. Although I certainly agree that digital humanities projects should focus on researchers’ needs (hence the significance of projects such as Bamboo, which are trying to discern those needs), I also believe that innovative methods of exploring and representing knowledge can come out of experiments such as the Clotel edition. (I should acknowledge that I’m pals with some of the folks involved in developing this electronic edition.) Of course, ideally experimental tools and interfaces would be developed in as open a fashion as possible so that other projects can build on the work. As the examples I cited at the beginning of this post illustrate, big projects–text collections, databases, annotation tools, GIS maps, etc.–can facilitate research into more focused topics, which in turn can contribute to our understanding of the big picture or lead us to a small but nonetheless dazzling insight.

Works Cited:

Hanlon, Christopher. “History on the Cheap: Using the Online Archive to Make Historicists out of Undergrads.” Pedagogy 5.1 (2005): 97-101.

Juola, Patrick. “Killer Applications in Digital Humanities.” Literary & Linguistic Computing 23.1 (2008): 73-83. 15 May 2008.

LaPorte, Charles. “Post-Romantic Ideologies and Victorian Poetic Practice, or, the Future of Criticism at the Present Time.” Victorian Poetry 41.4 (2004): 519-525. 7 May 2008.

Patterson, Cynthia. “Access: Bane and Boon.” American Periodicals: A Journal of History, Criticism, and Bibliography 17.1 (2007): 117-118. 7 May 2008.

Roff, Sandra Shoiock. “From the Field: A Case Study in Using Historical Periodical Databases to Revise Previous Research.” American Periodicals: A Journal of History, Criticism, and Bibliography 18.1 (2008): 96-100.

Steckel, Richard H. “Big Social Science History.” Social Science History 31.1 (2007): 1-34.

Strategies for Promoting Social Scholarship

As I noted in my last post, the development of collaborative, online, open access scholarship (which I’ll call “social scholarship”) faces some significant obstacles, including cultural barriers, concerns about intellectual property, and the need for sound economic models for open access publications. But I think social scholarship can and will grow. Here are some strategies to promote it:

1) Develop tools that enable researchers to do what they already do, but better.

Why have some disciplines, such as physics, embraced online delivery of research? As Stephen Pinfield notes in “How Do Physicists Use an E-Print Archive?,” the physics e-print archive arXiv succeeded in part because it “automated” physicists’ existing practices of exchanging pre-prints. Rather than having to go through the hassles of mailing or emailing pre-prints to multiple colleagues, physicists could easily post them online and, as a side benefit, make them more visible. Once researchers are convinced that a tool can help them do what they already do, only better, then they can also begin to see how it may help them to do new stuff, too. For instance, when I talk to researchers about Zotero, they first recognize its value in downloading bibliographic citations and creating bibliographies, but then begin to get excited about the possibilities of tagging and searching their collections.

2) Make social scholarship cool.
A primary lesson I learned in high school: if the cool people are doing it, pretty much everyone else will want to as well. I typically try something new (whether food, books, music, or technology) because someone I respect has recommended it. In a more scholarly context, I often evaluate the quality of a journal by checking out its editorial board. As researchers see how their colleagues are having a significant impact on research by making their work available as open access, they may be more willing to release their own research as open access. Likewise, as leading scholars come to be associated with open access journals (witness, for example, the Open Humanities Press, which has a top-notch editorial board), these publications will likely gain more legitimacy.

3) Assuage concerns about intellectual property.
Certainly not every researcher will want to blog or post pre-prints about ongoing work—someone pursuing a patent wouldn’t want to give away the goods prematurely, and if a researcher hopes to publish in a journal that doesn’t allow self-archiving, then he or she may not want to test that policy (although plenty of folks do). But researchers’ fears of being scooped or plagiarized if they post material online seem exaggerated. Indeed, posting a pre-print or a blog entry about a research breakthrough may enable a researcher to register that idea without having to wait through the long publication cycle. Sure, the Web enables plagiarizers to easily find information and copy and paste it into a document, but it also makes it easy to search for a unique phrase and catch the plagiarizers. (Witness today’s Chronicle of Higher Education article on journals experimenting with plagiarism detection tools similar to TurnItIn.) By using a Creative Commons license, researchers can make clear the terms under which their work can be used.

4) Experiment with new models for open access publication.
Even as the web makes the distribution of content easier, most academics aren’t ready to dispense with the peer review, copy editing, and in some cases the marketing functions provided by publishers, all of which cost money. So how will we pay for open access publishing? Various economic models are emerging: author fees, university or library support for publishing, etc. SCOAP3 pursues an intriguing collaborative model that has emerged from the high energy physics community, whereby a consortium supported by libraries, research societies and other groups would contract with publishers to provide their services and publish high energy physics journals as open access. To cover the United States’ approximately $4.5 million share of the total costs of publishing these journals, libraries, research societies, government agencies, etc. would re-direct funds to the SCOAP3 consortium. Rather than shifting the costs of open access publication to authors (through publication charges) or individual institutions (by moving the publication function to libraries, for instance), SCOAP3 hopes to control costs by pooling funds and to give authors and libraries (the producers, purchasers and consumers of journal content) a stronger voice in the publication process. The SCOAP3 consortium would contract with publishers to provide peer review and editorial quality control, but the publications would be open access. The publishing industry wouldn’t be closed out of this process; indeed, several publishers and scholarly societies are participating in the conversations about SCOAP3. Final publications would be deposited in open access repositories, enabling data mining and scholarly re-use.

5) Make the case that social scholarship is good and good for you.
Making research openly accessible can appeal to researchers’ altruistic impulses to share their work with independent scholars and researchers whose libraries cannot afford expensive journal subscriptions, as well as to make work paid for by the public available as a public good. Yet open access also makes sense purely out of self-interest. As universities increasingly measure the “impact factor” of publications, articles that other researchers can easily find, comment upon, and link to will likely carry more weight. As Michael Jensen points out, the more accessible a work is, the more visible it is and the more likely it is to be cited. (Of course, if tenure committees don’t view electronic publications as being as scholarly as more traditional publications, then self-interest may be undermined–but scholarly organizations such as the MLA and universities such as the members of the University of California system are beginning to recognize the importance of giving proper credit to electronic publications.)

Obstacles to social scholarship

As I noted in an earlier post, humanities scholars are beginning to experiment with social scholarship, embracing open access, creating and using social networking sites and collaborative tools, and undertaking joint research projects. But I must acknowledge that social scholarship (which I’m using as a catch-all term to include open access, web 2.0, and a culture of collaboration) is in its early stages and faces significant obstacles—economic, cultural, and technological. These challenges include:

  1. Lack of awareness of social scholarship: According to a recent article in the Chronicle of Higher Education (“Researchers Develop Online Tools for Science Collaborations”), few scientists are aware of collaborative resources such as blogs and social networking sites. I’ve noticed this lack of awareness among faculty members from pretty much every discipline at my university. As the article points out, many people don’t use new technologies or communication methods unless they have specific needs to meet: why invest the effort in changing how you do work unless there are concrete payoffs?
  2. Intellectual property concerns: Some researchers worry that if they make their work available online before publishing it with a traditional publisher they will lose control of it. For instance, a competitor may read their blog entry about ongoing research and scoop them—or even plagiarize their work. They also fear that publishers will refuse to publish a work that has already been made available online. From another perspective, copyright law also limits what material you can incorporate into your own work and share—for instance, museums and other cultural institutions seem to be levying higher fees for publication of digital images to which they hold the copyright.
  3. Skepticism about the quality of electronic-only publications: According to research by UC Berkeley’s Center for Studies in Higher Education, faculty in five disciplines—English, biostatistics, law and economics, anthropology, and chemical engineering–associate electronic-only publication with the lack of peer review and thus the lack of quality. If researchers don’t believe that tenure committees will give them credit for publishing in open access journals, then they will stick with more traditional means of publication.
  4. Lack of recognition for social scholarship: In many disciplines, there is currently little incentive for researchers to embrace social scholarship; the incentives are with the traditional system. When I talk to faculty about social scholarship, many appreciate the vision of sharing but worry about the implementation, particularly whether tenure committees will give them credit for collaborative scholarship. What kind of rewards and recognition do you get for commenting on a colleague’s blog, publishing your articles through an institutional repository, sharing your bibliographies, or keeping an open notebook documenting your research? UC Berkeley’s new report “Publishing Needs and Opportunities at the University of California” finds that “a significant minority” of faculty are experimenting with alternative publishing models, but that they “are increasingly frustrated by a tenure and review system that fails to recognize these new publishing models and hence constrains experimentation both in the technologies of dissemination and in the audiences addressed.”
  5. Lack of time to make work available online: Contributing content to user-generated sites, reading and commenting on blogs, sharing bookmarks and doing all of the other work of social scholarship take a lot of time—time that many busy academics don’t have. In a blog post on why Web 2.0 hasn’t been adopted in the biosciences, David Crotty, executive editor of the online publication Cold Spring Harbor Protocols, details how traditional methods of doing research can often be more efficient than Web 2.0 approaches, at least initially, since you can just email a file rather than finding a collaborative site, setting up an account, uploading the file, inviting participants to view it, waiting for them to establish accounts, etc.
  6. Cultural obstacles: Engaging in online discussions and making public thoughts that are in process are not yet part of mainstream academic culture. As David Crotty notes, many academics are unlikely to make critical comments in a public forum, since they don’t want to piss off potential reviewers, employers, or collaborators.
  7. Need for sound economic models for open access publication: Producing academic journals isn’t free, as I learned when I served as the managing editor of Postmodern Culture—even if editors donate their time, funds are needed for copyediting, coordinating editorial review, covering travel costs for editorial meetings, paying for web hosts, etc. How will open access journals be paid for—through author fees? University, society or foundation support? What will guarantee the sustainability of these journals and provide long-term access to their content? If scholars worry about the viability and reputation of open access journals, what will entice them to publish in these journals rather than traditional publications? In Open Access Publishing and the Emerging Infrastructure for 21st-Century Scholarship, Don Waters, Program Officer for Scholarly Communications at The Andrew W. Mellon Foundation, expresses skepticism about the open access model: “One worry about mandates for open access publishing is that they will deprive smaller publishers of much needed subscription income, pushing them into further decline, and making it difficult for them to invest in ways to help scholars select, edit, market, evaluate, and sustain the new products of scholarship represented in digital resources and databases. The bigger worry, which is hardly recognized and much less discussed in open access circles, is that sophisticated publishers are increasingly seeing that the availability of material in open access form gives them important new business opportunities that may ultimately provide a competitive advantage by which they can restrict access, limit competition, and raise prices.”

I believe that these challenges can be overcome and will sketch some strategies for promoting social scholarship in my final posting on this thread.