Category Archives: collaboration

Presentation on How Digital Humanists Use GitHub

At Digital Humanities 2016, Sean Morey Smith and I presented on our ongoing work examining GitHub as a platform of knowledge for digital humanities. Our results are still preliminary, but we want to share our presentation (PDF). We’re especially grateful to those who agreed to be interviewed for the study and who took our survey. We expect to produce an article (or two) based on our research.

We welcome any questions or feedback.

Studying How Digital Humanists Use GitHub

Over the past academic year, I’ve been fortunate to participate in Rice’s Mellon-sponsored Sawyer Seminar on Platforms of Knowledge, where we’ve examined platforms for authoring, annotation, mapping, and social networking. We’ve discussed both the possibilities that platforms may open up for inquiry, public engagement and scholarly communications and the risks that they may pose for privacy and nuanced humanistic analysis. Inspired by the questions raised by the Seminar, my colleague Sean Smith and I are studying a platform used by a number of digital humanists: GitHub. Digital humanists employ GitHub not only for code, but also for writing projects, syllabi, websites, and other scholarly resources. We’ll present our initial findings at Digital Humanities 2016, but I wanted to offer some background to the study, especially since some of you will soon be receiving emails from me inviting you to participate in it.

Initially I was interested in using GitHub for a case study of how we assess and select digital platforms. Even as many researchers (myself included) rely on digital platforms, I haven’t been able to find many clear rubrics for evaluating them. Building on Quinn Dombrowski’s recommendations for choosing a platform for a web project, we are looking at criteria such as functionality and ease of use. In previous work examining archival management systems, I learned how important it is to talk with users about their experience with tools, so we will be conducting a survey and interviews about GitHub. Sean and I also realized that GitHub itself provides valuable data about how people use GitHub, such as information about collaboration, code re-use, and connections to others. Our study will thus include analysis of publicly available data about selected GitHub users and repositories. (Of course, there is significant prior work on this topic in fields such as social computing that we will draw upon.)

With this project, we are:

  1. Identifying digital humanists who have GitHub accounts. For the purposes of this study, we are looking at presenters at the last three Digital Humanities conferences and people affiliated with organizations that belong to centerNet (assuming that the information is publicly available). Of course, this method is imperfect: it misses digital humanists who didn’t attend the DH conferences or who aren’t affiliated with DH centers, and it may include some people who don’t really consider themselves digital humanists. But it’s a start.
  2. Contacting those whose email addresses are easily retrievable (e.g. available via GitHub) and:
    1. Giving them the opportunity to opt out of having their publicly available GitHub data included in our analysis and in the dataset that we plan to share at the end of the study. (Added 5/18/16: To be extra careful, we plan to anonymize this dataset.)
    2. Inviting them to take a brief survey about their usage and opinions of GitHub
    3. Inviting them to participate in an interview

    We may also contact people whose emails aren’t in the GitHub data but are otherwise available.

  3. Analyzing GitHub data from our dataset to gain insight into how digital humanists use GitHub.
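
To make the opt-out and anonymization steps above concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the study's actual pipeline: the record fields (`login`, `public_repos`, `followers`), the `build_study_dataset` function, the sample records, and the salted-hash approach are all hypothetical. Note that replacing a username with a salted hash is pseudonymization rather than full anonymization, so a real dataset would need further review before sharing.

```python
import hashlib


def build_study_dataset(records, opted_out, salt):
    """Drop opted-out users and replace each username with a salted hash.

    All field names here are hypothetical; they are not taken from the study.
    """
    dataset = []
    for record in records:
        # Respect opt-outs: skip anyone who asked to be excluded.
        if record["login"] in opted_out:
            continue
        # Derive a stable pseudonymous ID from the username plus a secret salt.
        digest = hashlib.sha256((salt + record["login"]).encode("utf-8")).hexdigest()
        # Copy every field except the identifying username.
        cleaned = {key: value for key, value in record.items() if key != "login"}
        cleaned["user_id"] = digest[:12]
        dataset.append(cleaned)
    return dataset


# Made-up sample records for illustration only.
records = [
    {"login": "alice", "public_repos": 12, "followers": 40},
    {"login": "bob", "public_repos": 3, "followers": 5},
    {"login": "carol", "public_repos": 7, "followers": 18},
]

dataset = build_study_dataset(records, opted_out={"bob"}, salt="example-salt")
print(len(dataset))  # 2
```

Keeping the salt private matters in a sketch like this: without it, anyone with a list of GitHub usernames could recompute the hashes and re-identify people.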

We want to conduct this study openly while at the same time respecting privacy. In conducting interviews for past studies, I’ve been frustrated that I can’t publicly identify and credit people who have made brilliant comments because of the promise of confidentiality. So we’re giving interviewees the option to make all or some of their interview notes public, but of course they can instead keep the notes private and remain anonymous. Survey data will be anonymized but ultimately shared.

Here are important documents related to our study:

I welcome feedback and questions about this study. I hope that it will contribute to developing criteria for evaluating platforms like GitHub and offer insights into how digital humanities researchers and developers work.

Update on the Texas Digital Humanities Consortium

Organizations in the Boston area, Southern California, and New York City help area digital humanists connect with each other, and now Texas has its own DH group. The Texas Digital Humanities Consortium (TXDHC) aims to enable Texas digital humanists to share knowledge, learn new skills and methods, and collaborate on research and educational projects. After a terrific first conference hosted by the University of Houston in April of 2014, the second Texas Digital Humanities conference will take place at the University of Texas-Arlington on April 9-11, 2015, with keynotes from Alan Liu, Adeline Koh and George Siemens. (Submit your paper proposal by January 10.) Thanks to the work of Matt Christy at Texas A&M, the TXDHC website (built on Commons in a Box) allows members to create profiles, set up groups, participate in forums, and more. The TXDHC Steering Committee (which includes me, Jennifer Hecker, Laura Mandell, Rafia Mirza, Charlotte Nunes and Andrew Torget) is shaping the organization and planning upcoming events, including a virtual workshop. The TXDHC’s next online general meeting will take place on Thursday, December 4 from 3-4 p.m. and will include lightning talks by Tanya Clement and Charlotte Nunes, updates on the consortium’s activities, and an opportunity to share announcements and questions.

Interested in participating in the TXDHC? Sign up for the listserv, create an account on the website, and come to a meeting.  TXDHC is an informal, collaborative group; there are no membership fees or bureaucratic structures. Please get in touch with me (lisamspiro[at]gmail[dot]com) if you have questions or suggestions. As a scrappy new organization, TXDHC depends on the energy and ideas of its members.

Creating the Texas Digital Humanities Consortium

At the Inaugural Texas Digital Humanities Consortium Conference (TXDHC) on April 12, Elijah Meeks suggested that “interloping, more than computational approaches or the digital broadly construed as the object of study, defines digital humanities.” Indeed, as researchers pursue their curiosity and explore new methods, they often venture into unfamiliar territory. But there they may find others eager to experiment with new approaches and share what they know (or, as Elijah puts it, “a vibrant community of practice,” such as what we see in neogeography). This open, collaborative ethos characterized the TXDHC conference. Ably organized and hosted by Cameron Buckner from the University of Houston (with co-sponsorship from Rice and Texas A&M), the conference attracted participants from across Texas as well as from California, Alabama, Louisiana, and Switzerland. (See Geoffrey Rockwell’s great conference notes.) I think the conference met its fundamental goal of building community among (and beyond) Texas digital humanists by providing a forum where people could present their work, make connections with fellow interlopers, and learn new skills, such as at the hackfest facilitated by Elijah. By bringing in knowledgeable and engaged keynote speakers, the conference exposed participants to cutting-edge work and enabled them to interact with experts happy to offer advice about projects and pose stimulating questions. Already a colleague from Rice who attended the conference reports that she has made progress on her project thanks to help from Elijah, and I bet others can share similar stories.

The conference functioned as the first event hosted by the Texas Digital Humanities Consortium, a new organization that aims to support collaboration among digital humanists in Texas. The consortium (and conference) emerged from a conversation that Cameron Buckner, Laura Mandell (Texas A&M) and I had in October 2013 in which we discussed the growth of digital humanities across the state and the opportunity to band together in promoting DH research and education. We roped in a few more universities, including the University of Texas, the University of North Texas, St. Edward’s, and the University of Texas at Arlington. But we want to extend the consortium further, to create an open, participatory organization that includes liberal arts colleges, universities, community colleges, libraries, museums, and archives. At the conference, I facilitated a business meeting devoted to organizing the new consortium. While I worried that few people would show up to an 8:30 a.m. meeting on a Saturday, I was impressed by how many came and how engaged they were. We had participants from Southwestern, Prairie View A&M, and the University of Texas at Dallas as well as from Rice, UH, UT Austin, St. Edward’s, and UT Arlington. Since Texas is such a big state, we don’t necessarily have the advantage of close geographical proximity, but we do have a diverse and lively community, exciting research and educational projects, and a desire to do as much as we can together.

In the course of a very productive hour, we developed a framework for the consortium.  We plan to do the following:

  • Establish a Commons in a Box web site where members of the consortium can share information about researchers, projects, events, and opportunities (such as internships). Laura Mandell and her colleagues at Texas A&M’s Initiative for Digital Humanities, Media, and Culture (IDHMC) generously offered to set up the site. Contact Laura if you would like to be put on the mailing list for the group.
  • Organize a monthly virtual meeting to plan activities, share ongoing research, and build community.
  • Explore creating internship opportunities for graduate students (and potentially undergraduate students as well). Those looking for students to assist with DH projects can write short descriptions of these projects and share them on the TXDHC web site.
  • Host an annual conference. We would like to hold the next TXDHC conference in the spring of 2015, perhaps in the Dallas/Fort Worth area.
  • Provide informal opportunities to interact, such as by hosting local reading groups and letting each other know about lectures and other events. Note that Texas A&M will host THATCamp DHCollaborate on May 16-17, 2014.
  • Explore potential advocacy activities.

We encourage others interested in digital humanities from across Texas to join us. Currently the consortium operates as a “coalition of the willing,” with decision making by consensus. There are no membership fees or formal structures; to participate, you just need to indicate interest and be willing to contribute your ideas and time. If you are a Texas digital humanist, please fill out a brief survey to indicate your interest in the consortium and offer input into its activities. Interlopers welcomed!

Group and Method: Collaboration in the Digital Humanities

Yesterday I gave a talk called “Group and Method: Collaboration in the Digital Humanities” at Case Western Reserve University’s Freedman Center Colloquium on “Exploring Collaboration in Digital Scholarship.” Drawing on my research for “Computing and Communicating Knowledge” and for a series of blog posts, I discussed why collaboration is so common in digital humanities (although of course not all DH work is necessarily collaborative); explored the significance of collaboration in projects to build digital resources, devise new research methods, and promote participatory humanities; and explored challenges to collaboration. I also described how my experiences as a grad student in English convinced me of the value of collaboration–particularly my membership in a dissertation group (I was thrilled that my fellow diss group member Amanda French also gave a talk at the colloquium) and my work at Virginia’s Etext Center.

Here is the pdf of the slides.

Opening the Humanities Part 2: Contexts

In 1813, Thomas Jefferson declared in a letter to Isaac McPherson:

“He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature….”

“Sharing,” by Josh Harper

Unlike, say, a diamond bracelet, an idea can be freely given to others without diminishing its value for the person who “owns” it; indeed, its value only increases as it spreads. While Jefferson believed that the creators of inventions could not claim permanent, natural rights over them, he acknowledged that society could grant the right to profit from them in order to foster innovation (which, as Chris Kelty notes, Jefferson termed “the embarrassment of an exclusive patent,” suggesting his discomfort). He cautioned that intellectual property rights may actually endanger innovation by granting monopolies, should exist only long enough to spawn innovation, should be governed by rules limiting their application, and should be differentiated according to what benefit they convey to the public (Boyle, The Public Domain).

Jefferson’s letter raises fundamental questions: what social functions do intellectual property rights play? How can we best encourage the sharing of ideas and the progress of knowledge? In this post, the second in my series on the open humanities, I will explore legal and cultural contexts, focusing on the US.

The view that intellectual property rights are granted to encourage innovation is reflected in Article 1, Section 8 of the US Constitution: “To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” Note that the Constitution both describes the purpose of copyright (“To promote the Progress of Science and useful Arts”) and places limits upon it. Copyright aims to provide an incentive (a limited monopoly) for creators to share their work so that others may make use of it and build upon it. This incentive is balanced by limits, so that after a period of time the work falls into the public domain. The 1790 Copyright Act set the copyright term at 14 years, with the right to renew for another 14 years. Now, after the passage of the Sonny Bono Copyright Term Extension Act, the copyright term has exploded to 70 years after the death of the author. The original intention to encourage the progress of public knowledge seems to have given way to protecting commercial interests such as Disney’s monopoly over Mickey Mouse.

Expansion of U.S. copyright law (assuming authors create their works 35 years prior to their death) (Wikipedia)

With most academic work, the ability to secure a monopoly over one’s ideas is not the primary incentive for sharing. Rather, most academics publish scholarly works in order to make a visible contribution to the scholarly conversation, build their scholarly reputation, and ultimately secure tenure or promotion. Typically researchers do not receive monetary compensation for publishing journal articles; the reward comes in disseminating their research. As Peter Suber suggests, one factor that makes open access more complicated in the humanities is that authors of monographs often expect to receive royalties. However, as Paul Courant points out, the monetary rewards tend to be small; the author of a moderately successful manuscript selling 1000 copies might expect to make less than $4000, and “for many monographs, lifetime royalties are zero or close to it.” As Courant suggests, “The big financial payoff to the author of the great majority of scholarly books is not the royalties but the visibility (and hence the salary and working conditions) of the author in the academic labor market.” If authors aim to contribute to the scholarly conversation and heighten their visibility, it makes sense for them to remove barriers to their work (although they also have an incentive to publish with the top journals or publishers).

Open access facilitates the sharing of scholarly knowledge. Peter Suber, a philosopher and respected advocate for open access, offers a simple definition: “Open-access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions.” Because such literature is digital and available online, distributing it costs almost nothing, and it can be accessed by anyone with an Internet connection. The lack of most restrictions means that the literature could be accessed and mined, which could open up new insights. But creators can put into place some restrictions over open works. For example, they can adopt a Creative Commons license and specify whether the work can be modified and/or used commercially, as well as whether the work must be attributed (CC-BY) and/or whether new versions of the work must be licensed under the same terms (share and share alike). CC-BY upholds the scholarly practice of acknowledging sources (see Bethany Nowviskie’s “why, oh why, CC-BY?” for a smart discussion of the rationale for adopting this license). There are two principal means of disseminating open access scholarly work: green, through depositing works in disciplinary repositories (like arXiv) or institutional repositories (like DSpace@MIT), and gold, through publishing open journals and monographs. Note that many publishers allow scholars to self archive work in repositories; visit SHERPA RoMEO to access publisher policies.

Unfortunately, the humanities seem to be behind the sciences in practicing openness. As Wikipedia explains, the open science movement aims to enlarge access to research, data, and publications, speed up scholarly communication, facilitate collaboration, and improve the sharing and building of knowledge, whether through open lab notebooks, open data, or open access to scholarly literature. There isn’t even a Wikipedia page for open humanities (let’s get to work!). The Directory of Open Access and Hybrid Journals lists nearly 3000 journals in the sciences as opposed to a little over 1300 in the arts & humanities. Much of the rhetoric around openness focuses on science; as a rough measure, there are approximately 973,000 Google results for “open science” versus around 38,000 for “open humanities”.

In a 2004 essay, Peter Suber pointed to a number of reasons why the humanities have been more reluctant to embrace openness than the sciences, including the greater availability of public funding for scientific research (and publishing fees), a deeper sense of a cost crisis with science journals, the significance of pre-print repositories in the sciences, the importance of monographs in the humanities, and the greater public pressure for open access to science. Updating Suber’s analysis eight years later, Gary Daught suggests that the time may be ripe for efforts to promote openness in the humanities. He notes that the price inflation of humanities journals has become a greater concern and that open source tools such as Open Journal Systems have brought down publishing costs. Perhaps most importantly, as scholars become more accustomed to the speed, convenience and openness of online communication, they may increasingly expect research to be easily accessible.

Indeed, I’ve identified a number of open humanities projects, mainly in the digital humanities. Openness in the humanities can take many forms, including:

While these different ways of categorizing openness are helpful, I agree with Clint Lalonde (riffing on Gardner Campbell) that “open is an attitude”: not only being willing to share resources, but also working in such a way that others can observe, learn and offer to help. In my next post, I’ll provide a number of examples of open humanities projects and initiatives.

Of course, open humanities projects aren’t necessarily focused on digital humanities; note, for instance, publishing initiatives such as Open Humanities Press. With digital humanities, we often see the intersection of humanistic values and what I’ll call Web values. Driven by a desire to make it easier for scientists to share their data and collaborate, Tim Berners-Lee created the foundations of the Web. Rather than being a proprietary system, the Web is built upon open protocols, standards and design principles. The success of the Web comes from the way that it connects people to each other, information, and experiences, enabling them to share ideas, converse with each other, and explore and interact with information. Hence Berners-Lee’s message (appropriately delivered via Twitter) at the 2012 Summer Olympics: “this is for everyone.” What would it take to say the same about humanities scholarship and educational resources?

[Note: This post expands on a presentation I gave at WPI’s Digital Humanities Symposium in November.]

Examples of Collaborative Digital Humanities Projects

It comes as no surprise that humanities scholars rarely jointly author articles, as I observed in my last post. As Blaise Cronin writes, “Collaboration—for which co-authorship is the most visible and compelling indicator—is established practice in both the life and physical sciences, reflecting the industrial scale, capital-intensiveness and complexity of much contemporary scientific research. But the ‘standard model of scholarly publishing,’ one that ‘assumes a work written by an author,’ continues to hold sway in the humanities” (24). Just as I found that only about 2% of the articles published in American Literary History between 2004 and 2008 were co-authored, so Cronin et al. discovered that just 2% of the articles that appeared in the philosophy journal Mind between 1900 and 2000 were written by more than one person, although between 1990 and 2000 that number increased slightly to 4% (Cronin, Shaw, & La Barre). Whereas the scale of scientific research often requires scientists to collaborate with each other, humanities scholars typically need only something to write with and about. But as William Brockman et al. suggest, humanities scholars do have their own traditions of collaboration, or at least of cooperation: “Circulation of drafts, presentation of papers at conferences, and sharing of citations and ideas, however, are collaborative enterprises that give a social and collegial dimension to the solitary activity of writing. At times, the dependence of humanities scholars upon their colleagues can approach joint authorship of a publication” (11).

Information technology can speed and extend the exchange of ideas, as researchers place their drafts online and solicit comments through technologies such as CommentPress, make available conference papers via institutional repositories, and share citations and notes using tools such as Zotero. Over ten years ago, John Unsworth described an ongoing shift from cooperation to collaboration, indicating perhaps both his prescience and the slow pace of change in academia.

In the cooperative model, the individual produces scholarship that refers to and draws on the work of other individuals. In the collaborative model, one works in conjunction with others, jointly producing scholarship that cannot be attributed to a single author. This will happen, and is already happening, because of computers and computer networks. Many of us already cooperate, on networked discussion groups and in private email, in the research of others: we answer questions, provide references for citations, engage in discussion. From here, it’s a small step to collaboration, using those same channels as a way to overcome geographical dispersion, the difference in time zones, and the limitations of our own knowledge.

The limitations of our own knowledge.  As Unsworth also observes, collaboration, despite the challenges it poses, can open up new approaches to inquiry: “instead of establishing a single text, editors can present the whole layered history of composition and dissemination; instead of opening for the reader a single path through a thicket of text, the critic can provide her with a map and a machete. This is not an abdication of the responsibility to educate or illuminate: on the contrary, it engages the reader, the user, as a third kind of collaborator, a collaborator in the construction of meaning.”  With the interactivity of networked digital environments, Unsworth imagines the reader becoming an active co-creator of knowledge.  Through online collaboration, scholars can divide labor (whether in making a translation, developing software, or building a digital collection), exchange and refine ideas (via blogs, wikis, listservs, virtual worlds, etc.), engage multiple perspectives, and work together to solve complex problems.  Indeed, “[e]mpowering enhanced collaboration over distance and across disciplines” is central to the vision of cyberinfrastructure or e-research (Atkins).  Likewise, Web 2.0 focuses on sharing, community and collaboration.

Work in many areas of the digital humanities seems to both depend upon collaboration and aim to support it. Out of the 116 abstracts for posters, presentations, and panels given at the Digital Humanities 2008 (DH2008) conference, 41 (35%) include a form of the word “collaboration,” whether they are describing collaborative technologies (“Online Collaborative Research with REKn and PReE”) or collaborative teams (“a collaborative group of librarians, scholars and technologists”). Likewise, 67 out of 104 (64%) papers and posters presented at DH2008 have more than one author. (Both the Digital Humanities conference and the journal Literary and Linguistic Computing (LLC) tend to focus on the computational side of the digital humanities, so I’d also like to see if the pattern of collaboration holds in what Tara McPherson calls the “multimodal humanities,” e.g. journals such as Vectors. Given that works in Vectors typically are produced through collaborations between scholars and designers, I’d expect to see a somewhat similar pattern.)
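
The counts above can be reproduced with a few lines of Python. This is a rough sketch, not the script used for the original tally: the `share_matching` and `percent` helpers and the three sample abstracts are hypothetical, and only the totals (41 of 116 abstracts, 67 of 104 papers and posters) come from the post.

```python
import re


def share_matching(texts, pattern):
    """Return (number of texts matching the pattern, total number of texts)."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = sum(1 for text in texts if regex.search(text))
    return hits, len(texts)


def percent(part, whole):
    """Express part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Hypothetical abstracts, used only to show how the keyword match works.
sample_abstracts = [
    "Online collaborative research with shared annotation tools",
    "A collaboration among librarians, scholars and technologists",
    "Encoding medieval manuscripts in TEI",
]

# Match any form of the word: collaborate, collaboration, collaborative, ...
hits, total = share_matching(sample_abstracts, r"\bcollaborat\w*")
print(hits, total)  # 2 3

# Totals reported in the post.
print(percent(41, 116))  # 35
print(percent(67, 104))  # 64
```

Matching on the stem "collaborat" rather than the exact word is what lets "a form of the word" be counted; a stricter or looser pattern would change the tally.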

I was having trouble articulating precisely how collaboration plays a role in humanities research until I began looking for concrete examples—and I found plenty.   As computer networks connect researchers to content, tools and each other, we are seeing humanities projects that facilitate people working together to produce, explore and disseminate knowledge.  I interpret the word “collaboration” broadly; it’s a squishy term with synonyms such as teamwork, cooperation, partnership, and working together, and it also calls to mind co-authorship, communication, community, citizen humanities, and social networks.  In Here Comes Everybody, Clay Shirky puts forward a handy hierarchy of collaboration: 1) sharing; 2) cooperation; 3) collaboration; 4) collectivism (Kelly).  In this post, I’ll list different types of computer-supported collaboration in the humanities, note antecedents in “traditional” scholarship, briefly describe example projects, and point to some supporting technologies.  This is an initial attempt to classify a wide range of activity; some of these categories overlap.

–FACILITATING COMMUNICATION AND KNOWLEDGE BUILDING–

ONLINE COMMUNITIES/ VIRTUAL ORGANIZATIONS

  • Historical antecedents: conferences, colloquia, letters
  • Supporting technologies: listservs, online forums, blogs, social networking platforms, virtual worlds, microblogging (e.g. Twitter), video conferencing
  • Key functions: fostering communication and collaboration across a distance
  • Examples:
    • Listservs: Perhaps the most well-known online community in the humanities is H-Net, which was founded in 1992 and thus predates Web 2.0 or even Web 1.0. According to Mark Kornbluh, H-Net provides an “electronic version of an academic conference, a way for people to come together and to talk about their research and their teaching, to announce what was going on in the field, and to review and critique things that are going on in the field.” Currently H-Net supports over 100 humanities email lists and serves over 100,000 subscribers in more than 90 countries. Although H-Net has been criticized for relying on an old technology, the listserv, and is facing economic difficulties, it remains valued for supporting information sharing and discussion. For digital humanities folks, the Humanist list, launched in 1987, serves as “an international online seminar on humanities computing and the digital humanities” and has played a vital part in the intellectual life of the community.
    • Online forums: HASTAC, “a virtual network, a network of networks” that supports collaboration across disciplines and institutions, sponsors lively forums about technology and the humanities, often moderated by graduate students.  HASTAC also organizes conferences, administers a grant competition, and advocates for “new forms of collaboration across communities and disciplines fostered by creative uses of technology.” In my experience, online communities often break down the hierarchies separating graduate students from senior scholars and bring recognition to good ideas, no matter what the source.
    • Online communities: Since 1996, Romantic Circles (RC) has built an online community focused on Romanticism, not only fostering communication among researchers but also collaboratively developing content. Romantic Circles includes a blog for sharing information about news and events of interest to the community; a searchable archive of electronic editions; collections of critical essays; chronologies, indices, bibliographies and other scholarly tools; reviews; pedagogical resources; and a MOO (gaming environment). Over 30 people have served as editors, while over 300 people have contributed reviews and essays. Alan Liu aptly summarizes RC’s significance: “Romantic Circles, which helped pioneer collaborative scholarship on the Web, has become the leading paradigm for what such scholarship could be. One can point variously to the excellence of its refereed editions of primary texts, its panoply of critical and pedagogical resources, its inventive Praxis series, its state-of-the-art use of technology or its stirring commitment (nearly unprecedented on the Web) to spanning the gap between high-school and research-level tiers of education. But ultimately, no one excellence is as important as the overall, holistic impact of the site. We witness here a broad community of scholars using the new media vigorously, inventively, and rigorously to inhabit a period of historical literature together.” In building a community that supports digital scholarship, NINES focuses on three main goals: providing peer review for digital scholarship in 19th century American and British studies (thus helping to legitimize and recognize emerging scholarly forms), helping scholars create digital scholarship by providing training and content, and developing software such as Collex and Juxta to support inquiry and collaboration.
    • Advanced videoconferencing: With budgets tight, time scarce, and concern about the environmental costs  of travel increasing, collaborators often need to meet without having to travel.  AccessGrid supports communication among multiple groups by providing high quality video and audio and enabling researchers to share data and scientific instruments seamlessly.  AccessGrid, which was developed by Argonne National Laboratory and uses open source software, employs large displays and multiple projectors to create an immersive environment.   In the arts and humanities, AccessGrid has been used to support “telematic” performances, the study of high resolution images, seminars, and classes.
CollabRoom by Modbob

COLLABORATORIES

  • Historical antecedents: laboratories, research centers
  • Supporting technologies: grid technologies/ advanced networking, large displays, remote instrumentation, simulation software, collaboration platforms such as HubZero, databases, digital libraries
  • Key functions: fostering communication, collaboration, resource sharing, and research regardless of physical distance
  • Examples:

William Wulf coined the term collaboratory in 1989 to describe a “center without walls, in which the nation’s researchers can perform their research without regard to physical location, interacting with colleagues, accessing instrumentation, sharing data and computational resources, [and] accessing information in digital libraries.” Most of the collaboratories listed on the (now somewhat-out-of-date) Science of Collaboratories web site focus on the sciences.  For example, scientific collaboratories such as NanoHub, Space Physics and Astronomy Research Collaboratory (SPARC) and Biomedical Informatics Research Network (BIRN) have supported online data sharing, analysis, and communication.

What would a collaboratory in the humanities do? The term has been used in the humanities to refer to:

“Collaboratory” has thus taken on additional meanings, referring to “a new networked organizational form that also includes social processes; collaboration techniques; formal and informal communication; and agreement on norms, principles, values, and rules” (Cogburn, 2003, via Wikipedia).

“Virtual research environment” seems to be replacing “collaboratory” to refer to online collaborative spaces that provide access to tools and content (e.g. Early Modern Texts VRE, powered by Sakai). Through its funding program focused on Virtual Research Environments, JISC has sponsored the Virtual Research Environment for Archaeology, a VRE for the Study of Documents and Manuscripts, Collaborative Research Events on the Web, and myExperiment for sharing scientific workflows.

–SHARING AND AGGREGATING CONTENT–

DIGITAL MEMORY BANKS/ USER-CONTRIBUTED CONTENT

  • Historical antecedents: museums, archives, personal collections
  • Supporting technologies: Web publishing platforms (e.g. Omeka, Drupal), databases
  • Key functions: “collecting & exhibiting” content (to borrow from CHNM)
  • Examples:
    When the Valley of the Shadow project was launched in the 1990s, project team members went into communities in Pennsylvania and Virginia to digitize 19th century documents held by families in personal collections, thus building a virtual archive.  As scanners and digital cameras have become ubiquitous and user-contributed content sites such as Flickr and YouTube have taken off, people can contribute their own digital artifacts to online collections.  For example, The Hurricane Digital Memory Bank collects over 25,000 stories, images, and other multimedia files about Hurricanes Katrina and Rita.  Using a simple interface, people can upload items and describe the title, keywords, geographic location, and contributor.  The archive thus becomes a dynamic, living repository of current history, a space where researchers and citizens come together—or, in the terminology of the Center for History and New Media (CHNM), a memory bank that “promote[s] popular participation in presenting and preserving the past.”  As the editors of Vectors write in their introduction to “Hurricane Digital Memory Bank: Preserving the Stories of Katrina, Rita, and Wilma,” “Their work troubles a number of binaries long reified by history scholars (and humanities scholars more generally), including one/many, closed/open, expert/amateur, scholarship/journalism, and research/pedagogy.”  CHNM also sponsors digital memory banks focused on Mozilla, September 11, and the Virginia Tech tragedy.  Likewise, the Great War Archive, sponsored by the University of Oxford, contains over 6,500 items about World War I contributed by the public.

CONTENT AGGREGATION AND INTEGRATION

  • Historical antecedents: museums, archives
  • Supporting technologies: databases, open standards
  • Key functions: making it easier to discover, share and use information
  • Examples:
    Too often digital resources reside in silos, as each library or archive puts up its own digital collection.  As a result, researchers must spend more time identifying, searching, and figuring out how to use relevant digital collections.  However, some projects are shifting away from a siloed approach and bringing together collaborators to build digital collections focused on a particular topic or to develop interoperable, federated digital collections.  For instance, the Alliance for American Quilts, MATRIX: Center for Humane Arts, Letters and Social Sciences Online, and Michigan State University Museum have created the Quilt Index, which makes available images and descriptions of quilts provided by 14 contributors, including The Library of Congress American Folklife Center and the Illinois State Museum.  As Mark Kornbluh argues, interoperable content enables new kinds of inquiry: “In the natural sciences, large new datasets, powerful computers, and a rich array of computational tools are rapidly transforming knowledge generation. For the same to occur in the humanities, we need to understand the principle that ‘more is better.’ Part of what the computer revolution is doing is that it is letting us bring huge volumes of material under control. Cultural artifacts have always been held by separate institutions and separated by distance. Large-scale interoperable digital repositories, like the Quilt Index, open dramatically new possibilities to look at the totality of cultural content in ways never before possible.” Other examples of content aggregation and integration projects include the Walt Whitman Archive’s Finding Aids for Poetry Manuscripts and NINES.

DATA SHARING

  • Historical antecedents: informal exchange of data
  • Supporting technologies: databases (MySQL, etc.), web services tools
  • Key functions: support research by enabling discovery and reuse of data sets
  • Example projects:
    By sharing data, researchers can enable others to build on their work and provide transparency.  As Christine Borgman writes, “If related data and documents can be linked together in a scholarly information infrastructure, creative new forms of data- and information-intensive, distributed, collaborative, multidisciplinary research and learning become possible.  Data are outputs of research, inputs to scholarly publications, and inputs to subsequent research and learning.  Thus they are the foundation of scholarship” (Borgman 115).  Of course, there are a number of problems bound up in data sharing—how to ensure participation, make data discoverable through reliable metadata, balance flexibility in accepting a range of formats and the need for standardization, preserve data for the long term, etc.  Several projects focused on humanities and social science data are beginning to confront at least some of these challenges:

    • Open Context “hopes to make archaeological and related datasets far more accessible and usable through common web-based tools.”  Embracing open access and collaboration, Open Context makes it easy for researchers to upload, search, tag and analyze archaeological datasets.
    • Through Open Street Map, people freely and openly share and use geographic data in a wiki-like fashion.  Contributors employ GPS devices to record details about places such as the names of roads, then upload this information to a collaborative database.  The data is used to create detailed maps that have no copyright restrictions (unlike most geographical data).
    • Through the Reading Experience Database researchers can contribute records of British readers engaging with texts.

–COLLABORATIVE ANNOTATION, TRANSCRIPTION, AND KNOWLEDGE PRODUCTION–

CROWDSOURCING TRANSCRIPTION

  • Historical antecedents: genealogical research(?)
  • Supporting technologies: wikis
  • Key functions: share the labor required for transcribing manuscripts
  • Examples:
    Much of the historical record is not yet accessible online because it exists as handwritten documents: letters, diaries, account books, legal documents, etc.  Although work is underway on Optical Character Recognition software for handwritten materials, making these variable documents searchable and easy to read usually still requires a person to manually transcribe the document.  Why not enable people to collaborate to make family documents and other manuscripts available through commons-based peer production? At THATCamp last year, I learned about Ben Brumfield’s FromThePage software, which enables volunteers to transcribe handwritten documents through a web-based interface.  The right side of the interface shows a zoomable image of the page, while on the left volunteers enter the transcription through a wiki-like interface.  Likewise, the FamilySearch Indexing Project, sponsored by the LDS, recruits volunteers to transcribe family information from historical documents.   (See Jeanne Kramer-Smyth’s great account of the THATCamp session on crowdsourcing transcription and annotation.)  Not only can collaborative transcription be more efficient, but it can also reduce error.  Martha Nell Smith recounts how she, working solo at the Houghton, transcribed a line of Susan Dickinson’s poetry as “I’m waiting but the cow’s not back.”  When her collaborators at the Dickinson Electronic Archives, Lara Vetter and Laura Lauth, later compared the transcriptions to digital images of Dickinson’s manuscripts, they discovered that the line actually says “I’m waiting but she comes not back.”  As Smith suggests, “Had we not been working in concert with one another, and had we not had the high quality reproductions of Susan Dickinson’s manuscripts to revisit and thereby perpetually reevaluate our keys to her alphabet, my misreading might have been congealed in the technology of a critical print translation and what is very probably a poetic homage to Emily Dickinson would have lain lost in the annals of literary history” (Smith 849).

    Efforts to crowdsource transcription seem similar to the distributed proofreading that powers Project Gutenberg, which has enlisted volunteers to proofread over 15,000 books since 2000.  Likewise, Project Madurai is using distributed proofreading to build a digital library of Tamil texts.

COLLABORATIVE TRANSLATION

  • Historical antecedents: translation teams, e.g. Pevear and Volokhonsky
  • Supporting technologies: wikis, blogs, machine translation supplemented by human intervention
  • Examples:
    Rather than requiring an individual to undertake the time-intensive work of translating a complex classical text solo, the Suda Online (SOL)  brings together classicists to collaborate in translating into English the Suda, a tenth century encyclopedia of ancient learning written by a committee of Byzantine scholars (and thus itself a collaboration).  In addition to providing translations, SOL also offers commentaries and references, so it serves as a sort of encyclopedic predecessor to Wikipedia.  As Anne Mahoney reports in a recent article from Digital Humanities Quarterly, an email exchange in 1998 sparked the Suda Online; one scholar wondered whether there was an English translation of the Suda (there wasn’t) and others recognized that a translation could be produced through web-based collaboration.  Student programmers at the University of Kentucky quickly developed the technological infrastructure for SOL (a wiki might have been used today, but the custom application has apparently served its purpose well).  Now a self-organizing team of 61 editors and 95 translators from 12 countries has already translated over 21,000 entries, about 2/3 of the total.  Translators make the initial translations, which are then reviewed and augmented by editors (typically classics faculty) and given a quality rating of “draft,” “low,” or “high.”   All who worked on the translation are credited through a sort of open peer review process.  Whereas collaborative projects such as Wikipedia are open to anyone, SOL translators must register with the project.  Mahoney suggests that the collaboration has succeeded in part because it was focused and bounded, so that collaborators could feel the satisfaction of working toward a common goal and meeting milestones, such as 100 entries translated.  According to Mahoney, SOL has made this important text more accessible by offering an English version, making it searchable, and providing commentaries and references.  
    Moreover, “[a]s a collaboration SOL demonstrates the feasibility of open peer review and the value of incremental progress.” Other collaborative translation projects include The Encyclopédie of Diderot and d’Alembert; Traduwiki, which aims to “eliminate the last barrier of the Internet, the language”; the Worldwide Lexicon project; and Babels.

COLLABORATIVE EDITING

  • Historical antecedents: creating critical editions
  • Supporting technologies: grid computing, XML editors, text analysis tools, annotation tools
  • Example Projects:

As Peter Robinson observed at this year’s MLA, the traditional model for creating a critical edition centralizes authority in an editor, who oversees work by graduate assistants and others.  However, the Internet enables distributed, de-centralized editing.  To create “community-made editions,” a library would digitize texts and produce high quality images, researchers would transcribe those images, others would collate the transcriptions, others would analyze the collations and add commentaries, and so forth.

Explaining the need for collaborative approaches to textual editing, Marc Wilhelm Küster, Christoph Ludwig and Andreas Aschenbrenner of TextGrid describe how 3 different editors attempted to create a critical edition of the massive “so-called pseudo-capitulars supposedly written by a Benedictus Levita,” dying before they could complete their work.  Now a team of scholars is collaborating to create the edition, increasing their chances of completion by sharing the labor.  The TextGrid project is building a virtual workbench for collaborative editing, annotation, analysis and publication of texts.  Leveraging the grid infrastructure, TextGrid provides a platform for “software agents with well-defined interfaces that can be harnessed together through a user defined workflow to mine or analyze existing textual data or to structure new data both manually and automatically.” TextGrid recently released a beta version of its client application that includes an XML editor, search tool, dictionary search tool, metadata annotator, and workflow modules. As Küster, Ludwig and Aschenbrenner point out, enabling collaboration requires not only developing a technical platform that supports real-time collaboration and automation of routine tasks, but also facilitating a cultural shift toward collaboration among philologists, linguists, historians, librarians, and technical experts.

SOCIAL BIBLIOGRAPHIES, COLLABORATIVE FILTERING, AND ANNOTATION

  • Historical antecedents: shared references, bibliographies
  • Key functions: share citations, notes, and scholarly resources; build collective knowledge
  • Supporting technologies: social bookmarking, bibliographic tools
  • Projects:
    With the release of Zotero 2.0, Zotero is taking a huge step toward the vision articulated by Dan Cohen of providing access to “the combined wisdom of hundreds of thousands of scholars” (Cohen).  Researchers can set up groups to share collections with a class and/or collaborators on a research project.   I’ve already used Zotero groups to support my research and to collaborate with others; I discovered several useful citations in the collaboration folder for the digital history group, and with Sterling Fluharty I’ve set up a group to study collaboration in the digital humanities (feel free to join).  Ultimately Zotero will provide Amazon-like recommendation services to help scholars identify relevant resources.  As Stan Katz wrote in hailing Zotero’s collaboration with the Internet Archive to create a “Zotero commons” for sharing research documents, “For secretive individualists, which is to say old-fashioned humanists, this will sound like an invasion of privacy and an invitation to plagiarism. But to scholars who value accessibility, collaboration, and the early exchange of information and insight -- the future is available. And free on the Internet.”

    Similarly, the eComma project suggests that collaborative annotation can facilitate collaborative interpretation, as readers catalog poetic devices (personification, enjambment, etc.) and offer their own interpretations of literary works.  You can see eComma at work in the Collaborative Rubáiyát, which enables users to compare different versions of the text, annotate the text, tag it, and access sections through a tag cloud.   Likewise, Philospace will allow scholars to describe philosophical resources, filter them, find resources tagged by others, and submit resulting research for peer review. Other projects and technologies supporting collaborative annotation include Flickr Commons; Aus-e-Lit: Collaborative Integration and Annotation Services for Australian Literature Communities; NINES’ Collex; and STEVE.

COLLABORATIVE WRITING

  • Historical antecedents: Encyclopedias
  • Supporting technologies: Wikis
  • Key functions: sharing knowledge, synthesizing multiple perspectives
  • Examples:
    With the rise of Wikipedia, academics have been debating whether collaborative writing spaces such as wikis undermine authority, expertise, and trustworthiness.  In “Literary Sleuths Online,” Ralph Schroeder and Matthijs Den Besten examine the Pynchon Wiki, a collaborative space where Pynchon enthusiasts annotate and discuss his works.  Schroeder and Den Besten compare the wiki’s section on Pynchon’s Against the Day with a print equivalent, Weisenburger’s “A Gravity’s Rainbow Companion.”  While the annotations in Weisenburger’s book are more concise and consistent, the wiki is more comprehensive, more accurate (because many people are checking the information), and more speedily produced (it only took 3 months for the wiki to cover every page of Pynchon’s novel).   Moreover, the book is fixed, while the wiki is open-ended and expansive. Schroeder and Den Besten suggest that competition, community and curiosity drive participation, since contributors raced to add annotations as they made their way through the novel and “sleuthed” together.

GAMING: “Collaborative Play”/ Games as Research

  • Historical antecedents: role playing games, board games, etc.
  • Key functions: problem solving, team work, knowledge sharing
  • Supporting technologies: gaming engines, wikis, networks
  • Example Projects:
    Perhaps some of the most intense collaboration comes in massively multiplayer online games, as teams of players consult each other for assistance navigating virtual worlds, team up to defeat monsters, join guilds to collaborate on quests, and share their knowledge through wikis such as the WOWWiki, which has almost 74,000 articles.  Focusing on World of Warcraft, Nardi and Harris explore collaborative play as a form of learning.  They also point to potential applications of gaming in research communities: “Mixed collaboration spaces, whether MMOGs or another format, may be useful in domains such as interdisciplinary scientific work where a key challenge is finding the right collaborators.”

    Sometimes those collaborators can be people without specialized training.  Recently Wired featured a fascinating article about FoldIt, a game to come up with different models of proteins that is attracting devoted teams of participants (Bohannon).  The game was devised by the University of Washington Departments of Computer Science & Engineering and Biochemistry to crowdsource solutions to the Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP), a scientific contest to predict protein structures.   Previously biochemist David Baker had used Rosetta@home to harness the spare computing cycles of 86,000 PCs that had been volunteered to help determine the shapes of proteins, but he was convinced that human intelligence as well as computing power needed to be tapped to solve spatial puzzles.  Thus he and his colleagues developed a game in which players fold proteins into their optimal shapes, a sort of “global online molecular speed origami.” Over 100,000 people have downloaded the game, and a 13-year-old is one of the game’s best players. Using the game’s chat function, players formed teams, “and collective efforts proved far more successful than any solo folder.”  At the CASP competition, 7 of the 15 solutions contributed through FoldIt worked, and one finished in first place, so “[a] band of gamer nonscientists had beaten the best biochemists.”

    How might gaming be used to motivate and support humanities research?  As we see in the example of FoldIt, games provide motivation and a structure for collaboration; teamwork enables puzzles to be solved more rapidly.  I could imagine, for example, a game in which players would transcribe pieces of a diary to unravel the mystery it recounts, describe the features of a series of images (similar to Google’s Image Labeler game), or offer up their own interpretations of abstruse philosophical or literary passages.  In “Games of Inquiry for Collaborative Concept Structuring,” Mary A. Keeler and Heather D. Pfeiffer envision a “Manuscript Reconstruction Game (MRG)” where Peirce scholars would collaborate to figure out where a manuscript page belongs. “The scholars rely on the mechanism of the game, as a logical editor or ‘logical lens,’ to help them focus on and clarify the complexities of inference and conceptual content in their collaborative view of the manuscript evidence” (407).  There are already some compelling models for humanities game play.  Dan Cohen recently used Twitter to crowdsource solving an historical puzzle. Ian Bogost and collaborators are investigating the intersections between journalism and gaming.  Jerome McGann describes Ivanhoe as an  “online playspace… for organizing collaborative interpretive investigations of traditional humanities materials of any kind,” as two or more players come together to re-imagine and transform a literary work (McGann).

PUBLISHING

  • Historical antecedents: exchange of drafts, letters, critical dialogs in journals
  • Supporting technologies and protocols: CommentPress, blogs, wikis, Creative Commons licenses, etc.
  • Projects:
    Bob Stein defines the book as “a place where readers (and sometimes authors) congregate.” Recent projects enable readers to participate in all phases of the publishing process, from peer-to-peer review to remixing a work to produce something new.  For instance, LiquidPub aims to transform the dissemination and evaluation of scientific knowledge by enabling “Liquid Publication that can take multiple forms, that evolves continuously, and is enriched by multiple sources.”  Using CommentPress, Noah Wardrip-Fruin  experimented with peer-to-peer review of his new book Expressive Processing alongside traditional peer review, posting a section of the book each week day to the Grand Text Auto blog.  Although it was difficult for many reviewers to get a sense of the book’s overall arguments when they were reading only fragments, Wardrip-Fruin found many benefits to this open approach to peer review: he could engage in conversation with his reviewers and determine how to act on their comments, and he received detailed comments from both academics and non-academics with expertise in the topics being discussed, such as game designers.  Similarly, O’Reilly recently developed the Open Publishing Feedback System to gather comments from the community.  Its first experiment, Programming Scala, yielded over 7000 comments from nearly 750 people. New publishing companies such as WeBook and Vook are exploring collaborative authorship and multimedia.

SOCIAL LEARNING

  • Historical antecedents: Students as research assistants?
  • Supporting technologies: blogs, wikis, social bookmarking, social bibliographies
  • Motto: “We participate, therefore we are.” (via John Seely Brown)
  • Example:
    As John Seely Brown explains, “social learning is based on the premise that our understanding of content is socially constructed through conversations about that content and through grounded interactions, especially with others, around problems or actions.”  Social learning involves “learning to be” an expert through apprenticeship, as well as learning the content and language of a domain.  Brown points to open source communities as exemplifying social learning.  I would guess that many, if not most, collaborative digital humanities projects have depended on contributions from undergraduate and graduate students, whether they digitized materials, did programming, authored metadata, contributed to the project wiki, designed the web site, or even managed the project.

    Why not create a network of research projects, so that students studying a similar topic could jointly contribute to a common resource?  Such is the vision of “Looking for Whitman: The Poetry of Place in the Life and Work of Walt Whitman,” led by Matthew Gold.   Working together to build a common web site on Whitman, students will document their research using Web 2.0 technologies such as CommentPress, BuddyPress (WordPress + social networking), blogs, wikis, YouTube, Flickr, Google Maps, etc.  Students at City Tech (CUNY’s New York City College of Technology) and New York University will focus on Whitman in New York;  those at Rutgers University at Camden will look at Whitman as “sage of Camden”; and those at the University of Mary Washington will examine Whitman and the Civil War.   Similarly, Michael Wesch, the 2008 CASE/Carnegie U.S. Professor of the Year for Doctoral and Research Universities, asks his students to become “co-creators” of knowledge, whether in simulating world history and cultures, creating an ethnography of YouTube, or examining anonymity and new media.

While collaboration in the humanities is certainly not new, these projects suggest how researchers (both professional and amateur) can work together regardless of physical location to share ideas and citations, produce translations or transcriptions, and create common scholarly resources.  Long as this list is, I know I’m omitting many other relevant projects (some of which I’ve bookmarked) and overlooking (for now) the challenges that collaborative scholarship faces.  I’ll be working with several collaborators to explore these issues, but I of course welcome comments….

Works Cited

Atkins, Dan. Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure. NSF. January 2003. <http://www.nsf.gov/od/oci/reports/toc.jsp>.
Bohannon, John. “Gamers Unravel the Secret Life of Protein.” Wired 20 Apr 2009. 26 May 2009 <http://www.wired.com/medtech/genetics/magazine/17-05/ff_protein?currentPage=all>.
Borgman, Christine L. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, Mass.: The MIT Press, 2007.
Brockman, William et al. Scholarly Work in the Humanities and the Evolving Information Environment. CLIR/DLF, 2001. 24 Jul 2007 <http://www.clir.org/PUBS/reports/pub104/pub104.pdf>.
Cohen, Daniel J. “Zotero: Social and Semantic Computing for Historical Scholarship.” Perspectives (2007). 27 May 2009 <http://www.historians.org/perspectives/issues/2007/0705/0705tec2.cfm>.
Cronin, Blaise, Debora Shaw, and Kathryn La Barre. “A cast of thousands: Coauthorship and subauthorship collaboration in the 20th century as manifested in the scholarly journal literature of psychology and philosophy.” Journal of the American Society for Information Science and Technology 54.9 (2003): 855-871.
Cronin, Blaise. The hand of science. Scarecrow Press, 2005.
Kelly, Kevin. “The New Socialism: Global Collectivist Society Is Coming Online.” Wired 22 May 2009. 26 May 2009 <http://www.wired.com/culture/culturereviews/magazine/17-06/nep_newsocialism?currentPage=all>.
Kornbluh, Mark. “From Digital Repositories to Information Habitats: H-Net, the Quilt Index, Cyber Infrastructure, and Digital Humanities.” First Monday 13.8 (4 Aug 2008).
Küster, M.W., C. Ludwig, and A. Aschenbrenner. “TextGrid as a Digital Ecosystem.” Digital EcoSystems and Technologies Conference, 2007. DEST ’07. Inaugural IEEE-IES. 2007. 506-511.
Mahoney, Anne. “Tachypaedia Byzantina: The Suda On Line as Collaborative Encyclopedia.”  Digital Humanities Quarterly. 3.1 (2009). 22 Mar 2009 <http://www.digitalhumanities.org/dhq/vol/003/1/000025.html>.
McGann, Jerome J. “Culture and Technology: The Way We Live Now, What Is to Be Done?” New Literary History 36.1 (2005): 71-82.
Nardi, Bonnie, and Justin Harris. “Strangers and friends: collaborative play in world of warcraft.” Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work. Banff, Alberta, Canada: ACM, 2006. 149-158. 18 May 2009 <http://portal.acm.org/citation.cfm?id=1180875.1180898>.
O’Donnell, Daniel Paul. “Disciplinary Impact and Technological Obsolescence in Digital Medieval Studies.” A Companion To Digital Humanities. 2 May 2009 <http://www.digitalhumanities.org/companion/view?docId=blackwell/9781405148641/9781405148641.xml&chunk.id=ss1-4-2&toc.id=0&brand=9781405148641_brand>.
Schroeder, Ralph, and Matthijs Den Besten. “Literary Sleuths On-line: e-Research collaboration on the Pynchon Wiki.” Information, Communication & Society 11.2 (2008): 167-187.
Smith, Martha Nell. “Computing: What Has American Literary Study To Do with It.” American Literature 74.4 (2002): 833-857.
Unsworth, John M. “Creating Digital Resources: the Work of Many Hands.” 14 Sep 1997. 10 Mar 2009 <http://www3.isrl.uiuc.edu/%7Eunsworth/drh97.html>.

Revisions: Fixed From the Page link, 6/1/09; Tanya ] Tara, 6/2/09; fixed typos (6/14/09)

Collaborative Authorship in the Humanities

Recently I heard the editors of a history journal and a literature journal say that they rarely published articles written by more than one author—perhaps a couple every few years.   Around the same time, I was looking over a recent issue of Literary and Linguistic Computing and noticed that it included several jointly-authored articles.  This got me wondering:  is collaborative authorship more common in digital humanities than in “traditional” humanities?

“Collaboration” is often associated with “digital humanities.”  Building digital collections, creating software, devising new analytical methods, and authoring multimodal scholarship typically cannot be accomplished by a solo scholar; rather, digital humanities projects require contributions from people with content knowledge, technical skills, design skills, project management experience, metadata expertise, etc.  Our Cultural Commonwealth identifies enabling collaboration as a key feature of the humanities cyberinfrastructure, funders encourage multi-institutional and even international teams, and proponents of increased collaboration in the humanities like Cathy Davidson and Lisa Ede and Andrea A. Lunsford cite digital humanities projects such as Orlando as exemplifying collaborative possibilities.

As a preliminary investigation, I compared the number of collaboratively-written articles published between 2004 and 2008 in two well-respected quarterly journals, American Literary History (ALH) and Literary and Linguistic Computing (LLC).  Both journals are published by Oxford University Press as part of its humanities catalog. I selected ALH because it is a leading journal on American literature and culture that encourages critical exchanges and interdisciplinary work—and because I thought it would be fun to see what the journal has published since 2004. (The hardest part of my research: resisting the urge to stop and read the articles.)  LLC, the official publication of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities, includes contributions on digital humanities from around the world—the UK, the US, Germany, Australia, Greece, Italy, Norway, etc.—and from many disciplines, such as literature, linguistics, computer and information science, statistics, librarianship, and biochemistry.  To determine the level of collaborative authorship in each issue, I tallied articles that had more than one author, excluding editors’ introductions, notes on contributors, etc.  For LLC, I counted everything that had an abstract as an article.  While I didn’t count LLC’s reviews, which typically are brief and focus on a single work, I did include the review essays published by ALH, since they are longer and synthesize critical opinion about several works.

So what did I find? Whereas only 5 of the 259 articles (1.93%) published in ALH, or about one a year, feature two authors (none had more than two), 70 of the 145 articles (48.28%) published in LLC were written by two or more authors. Most of the co-authored ALH articles (4 of 5, or 80%) were written by scholars from multiple institutions, whereas 49% (34 of 70) of the co-authored LLC articles were. About 16% (11 of 70) of the co-authored LLC articles featured contributors from two or more countries, while none of the ALH articles did. Two of the five ALH articles are review essays, while three focus on hemispheric or transatlantic American studies. Although this study should be carried out more systematically across a wider range of journals, the initial results do suggest that collaborative authorship is more common in digital humanities. [See the Zotero reports for ALH and LLC for more information.]
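The percentages above are simple ratios of the reported counts. As a minimal sketch for anyone who wants to re-check or extend the tally, the arithmetic can be reproduced in a few lines of Python. The dictionary keys and the `share` helper are illustrative names chosen here, not part of the original tally, and the multi-institution and multi-country shares are taken over co-authored articles only, as in the text.

```python
# Counts as reported in the paragraph above; key names are illustrative.
counts = {
    "ALH": {"articles": 259, "coauthored": 5, "multi_institution": 4, "multi_country": 0},
    "LLC": {"articles": 145, "coauthored": 70, "multi_institution": 34, "multi_country": 11},
}


def share(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


summary = {}
for journal, c in counts.items():
    summary[journal] = {
        # Share of all articles that have more than one author.
        "coauthored_pct": share(c["coauthored"], c["articles"]),
        # Shares below are relative to the co-authored articles only.
        "multi_institution_pct": share(c["multi_institution"], c["coauthored"]),
        "multi_country_pct": share(c["multi_country"], c["coauthored"]),
    }

for journal, row in summary.items():
    print(journal, row)
```

Adding another journal would only require a new entry in `counts`, which is one way the comparison could be widened beyond these two titles.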

Why does LLC feature more collaboratively written articles than ALH? I suspect that this is because, as I’ve already suggested, digital humanities projects often require collaboration, whereas most literary criticism can be produced by an individual scholar who needs only texts to read, a place to write, and a computer running a word processing application (as well as a library to provide access to texts, colleagues to consult and to review the resulting research, a university and/or funding agency to support the research, a publisher to disseminate the work, etc.). Moreover, LLC represents a sort of meeting point for a range of disciplines, including several (such as computer science) that have a tradition of collaborative authorship. Whereas collaborative authorship is common (even expected) in the sciences, in the humanities many tenure and promotion committees have not yet developed mechanisms for evaluating and crediting collaborative work. In a recent blog post, for example, Cathy Davidson tells a troubling story about being told (in a public and humiliating way) by a member of a search committee that her collaborative work and other “non-traditional” research didn’t “count.” Literary study values individual interpretation, or what Davidson calls “the humanistic ethic of individuality.”

While individual scholarship remains valid and important, shouldn’t humanities scholarship expand to embrace collaborative work as well? Indeed, in 2000 the MLA launched an initiative to consider “alternatives to the adversarial academy” and encourage collaborative scholarship. (By the way, I’m not criticizing ALH; I doubt that it receives many collaboratively authored submissions, and it has encouraged critical exchange and interdisciplinary research.) Of course, collaboration poses some significant challenges, such as divvying up and managing work, negotiating conflicts, finding funding for complex projects, assigning credit, etc. But as Lisa Ede and Andrea A. Lunsford point out, collaborative authorship can lead to a “widening of scholarly possibilities.” In talking to humanities scholars (particularly those in global humanities), I’ve noticed genuine enthusiasm about collaborative work that allows scholars to engage in community, consider alternative perspectives, and undertake ambitious projects that require diverse skills and/or knowledge.

What kind of collaborations do the jointly-written articles in LLC and ALH represent? Since LLC often lists only the authors’ institutional affiliations, not their departments, tracing the degree of interdisciplinary collaboration would require further research.  However, I did find examples of several types of collaboration (which may overlap):

  • Faculty/student collaboration: In the sciences, faculty frequently publish with their postdocs and students, a practice that seems to be rare in the humanities.  I noted at least one example of a similar collaboration in LLC—involving, I should note, computer science rather than humanities grad students.
    • Urbina, Eduardo et al. “Visual Knowledge: Textual Iconography of the Quixote, a Hypertextual Archive.” Lit Linguist Computing 21.2 (2006): 247-258. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/21/2/247>.
      This article includes contributions by a professor of Hispanic studies, a professor of computer science, a librarian/archivist/adjunct English professor, and three graduate students in computer science.
  • Project teams: In digital humanities, collaborators often work together on projects to build digital collections, develop software, etc.  In LLC, I found a number of articles written by project teams, such as:
    • Barney, Brett et al. “Ordering Chaos: An Integrated Guide and Online Archive of Walt Whitman’s Poetry Manuscripts.” Lit Linguist Computing 20.2 (2005): 205-217. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/20/2/205>.
      Members of the project team included an archivist, programmer, digital initiatives librarian, English professor, and two English Ph.Ds who serve as library faculty and focus on digital humanities.
  • Interdisciplinary collaborations: In LLC, I noted several instances of teams that included humanities scholars and scientists working together to apply particular methods (text mining, stemmatic analysis) in the humanities.  For example:
    • Windram, Heather F. et al. “Dante’s Monarchia as a test case for the use of phylogenetic methods in stemmatic analysis.” Lit Linguist Computing 23.4 (2008): 443-463. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/23/4/443>.
      The authors include two biochemists, a textual scholar, and a scholar of Italian literature.
    • Sculley, D., and Bradley M. Pasanek. “Meaning and mining: the impact of implicit assumptions in data mining for the humanities.” Lit Linguist Computing 23.4 (2008): 409-424. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/23/4/409>.
      Authored by a computer scientist and a literature professor.
  • Shared interests: Researchers may publish together because they share an intellectual kinship and can accomplish more by working together.  For instance:
    • Auerbach, Jonathan, and Lisa Gitelman. “Microfilm, Containment, and the Cold War.” American Literary History 19.3 (2007).
      I noticed that Jonathan Auerbach and Lisa Gitelman thank each other in works that each had previously published as an individual.

Observing that LLC publishes a number of collaboratively-written articles opens up several questions, which I hope to pursue through interviews with the authors of at least some of these articles (if you are one of these authors, you may see an email from me soon….):

1)    What characterizes the LLC articles that have only one author?
Based on a quick look at the tables of contents from past issues, I suspect that these articles are more likely to be theoretical or to focus on particular problems rather than projects.  Here, for example, are the titles of some singly-authored articles:  “The Inhibition of Geographical Information in Digital Humanities Scholarship,” “Monkey Business—or What is an Edition?,” “What Characterizes Pictures and Text?” and “Original, Authentic, Copy: Conceptual Issues in Digital Texts.”

2)    Why was the article written collaboratively?

What led to the collaboration?  Did team members offer complementary skill sets, such as knowledge of statistical methods and understanding of the content? How did the collaborators come together—do they work for the same institution? Did they meet at a conference? Do they cite each other?

3)    What were the outcomes of the collaboration?

What was accomplished through collaboration that would have been difficult to do otherwise?  Would the scale of the project be smaller if it were pursued by a single scholar? Did the project require contributions from people with different types of expertise?

4)    How was the collaboration managed and sustained?

Was one person in charge, or was authority distributed? What tools were used to facilitate communication, track progress on the project, and support collaborative writing? To what degree was face-to-face interaction important?

5)    What was difficult about the collaboration?

What was hard about collaborating: Communicating? Identifying who does what? Agreeing on methods? Coming to a common understanding of results? Finding funding?

We can find answers to some of these questions in Lynne Siemens’ recent article “‘It’s a team if you use “reply all”’: An exploration of research teams in digital humanities environments.” Siemens describes factors contributing to the success of collaborative teams in digital humanities, such as clear milestones and benchmarks, strong leadership, equal contributions by members of the team, and a balance between communication through digital tools and in-person meetings. I particularly liked the description of “a successful team as a ‘round thing’ with equitable contribution by individual members.”

In doing this research, I realized how much it would benefit from collaborators.  For instance, someone with expertise in citation analysis could help enlarge the study and detect patterns in collaborative authorship, while someone with expertise in qualitative research methods could help to interview collaborative research teams and analyze the resulting data.  However, I think anyone with an interest in the topic could make valuable contributions.  This is by way of leading up to my pitch: I’m working on a piece about collaborative research methods in digital humanities for an essay collection and would welcome collaborators.  If you’re interested in teaming up, contact me at lspiro@rice.edu.

Works Cited

Davidson, Cathy N. “What If Scholars in the Humanities Worked Together, in a Lab?.” The Chronicle of Higher Education 28 May 1999. 18 Apr 2009 <http://chronicle.com/weekly/v45/i38/38b00401.htm>.

Ede, Lisa, and Andrea A. Lunsford. “Collaboration and Concepts of Authorship.” PMLA 116.2 (2001): 354-369. 18 Apr 2009 <http://www.jstor.org/stable/463522>.

Siemens, Lynne. “‘It’s a team if you use “reply all”’: An exploration of research teams in digital humanities environments.” Lit Linguist Computing (2009): fqp009. 14 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/fqp009v1>.

Is Wikipedia Becoming a Respectable Academic Source?

Last year a colleague in the English department described a conversation in which a friend revealed a dirty little secret: “I use Wikipedia all the time for my research—but I certainly wouldn’t cite it.”  This got me wondering: How many humanities and social sciences researchers are discussing, using, and citing Wikipedia?  To find out, I searched Project Muse and JSTOR, leading electronic journal collections for the humanities and social sciences, for the term “wikipedia,” which picked up both references to Wikipedia and citations of the wikipedia URL.  I retrieved 167 results from between 2002 and 2008, all but 8 of which came from Project Muse.  (JSTOR covers more journals and a wider range of disciplines but does not provide access to issues published in the last 3-5 years.)  In contrast, Project Muse lists 149 results in a search for “Encyclopedia Britannica” between 2002 and 2008, and JSTOR lists 3.  I found that citations of Wikipedia have been increasing steadily: from 1 in 2002 (not surprisingly, by Yochai Benkler) to 17 in 2005 to 56 in 2007. So far Wikipedia has been cited 52 times in 2008, and it’s only August.

Along with the increasing number of citations, another indicator that Wikipedia may be gaining respectability is its citation by well-known scholars.  Indeed, several scholars both cite Wikipedia and are themselves subjects of Wikipedia entries, including Gayatri Spivak, Yochai Benkler, Hal Varian, Henry Jenkins, Jerome McGann, Lawrence Buell, and Donna Haraway.

Of the 167 sources, 111 (66.5%) are what I call “straight citations” (citations of Wikipedia without commentary about it), while 56 (33.5%) comment on Wikipedia as a source, either positively or negatively. 14.5% of the total citations come from literary studies, 14% from cultural studies, 11.4% from history, and 6.6% from law. Researchers cite Wikipedia on a diversity of topics, ranging from the military-industrial complex to horror films to Bush’s second state of the union speech. Eight use Wikipedia simply as a source for images (such as an advertisement for Yummy Mummy cereal or a diagram of the architecture of the Internet). Many employ Wikipedia either as a source for information about contemporary culture or as a reflection of contemporary cultural opinion. For instance, to illustrate how novels such as The Scarlet Letter and Uncle Tom’s Cabin have been sanctified as “Great American Novels,” Lawrence Buell cites the Wikipedia entry on “Great American Novel” (Buell).

About a third of the articles I looked at discuss the significance of Wikipedia itself.  14 (8%) criticize using it in research.  For instance, a reviewer of a biography about Robert E. Lee tsks-tsks:

The only curiosities are several references to Wikipedia for information that could (and should) have been easily obtained elsewhere (battle casualties, for example). Hopefully this does not portend a trend toward normalizing this unreliable source, the very thing Pryor decries in others’ work. (Margolies).

In contrast, 11 (6.6%) cite Wikipedia as a model for participatory culture.  For example:

The rise of the net offers a solution to the major impediment in the growth and complexification of the gift economy, that network of relationships where people come together to pursue public values. Wikipedia is one example. (DiZerega)

A few (1.8%) cite Wikipedia self-consciously, aware of its limitations but asserting its relevance for their particular project:

Citing Wikipedia is always dicey, but it is possible to cite a specific version of an entry. Start with the link here, because cybervandals have deleted the list on at least one occasion. For a reputable “permanent version” of “Alternative press (U.S. political right)” see: http://en.wikipedia.org/w/index.php?title=Alternative_press_%28U.S._political_right%29&oldid=107090129 (Berlet).
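The versioned link in Berlet’s note follows a regular pattern: the entry title and a numeric revision identifier (`oldid`) are passed as query parameters to Wikipedia’s `index.php`. As a minimal sketch, assuming that pattern and using a hypothetical helper name (`permanent_link`), such a link can be assembled in Python as follows. The title and revision number are the ones from the quotation above.

```python
from urllib.parse import urlencode


def permanent_link(title, oldid, host="en.wikipedia.org"):
    """Build a link to one fixed revision of a Wikipedia entry.

    `permanent_link` is a hypothetical helper name used for illustration.
    """
    # urlencode percent-encodes characters such as parentheses in the title.
    query = urlencode({"title": title, "oldid": oldid})
    return f"http://{host}/w/index.php?{query}"


link = permanent_link("Alternative_press_(U.S._political_right)", 107090129)
print(link)
```

The revision identifier for any entry can be found on its history page, so a citation can point to exactly the text the author read.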

Of course, just because more researchers—including some prominent ones—are citing Wikipedia does not mean it’s necessarily a valid source for academic papers.  However, you can begin to see academic norms shifting as more scholars find useful information in Wikipedia and begin to cite it.  As Christine Borgman notes, “Scholarly documents achieve trustworthiness through a social process to assure readers that the document satisfies the quality norms of the field” (Borgman 84).  As a possible sign of academic norms changing in some disciplines, several journals, particularly those focused on contemporary culture, include 3 or more articles that reference Wikipedia: Advertising and Society Review (7 citations), American Quarterly (3 citations), College Literature (3 citations), Computer Music Journal (5 citations), Indiana Journal of Global Legal Studies (3 citations), Leonardo (8 citations), Library Trends (5 citations), Mediterranean Quarterly (3 citations), and Technology and Culture (3 citations).

So can Wikipedia be a reputable scholarly resource?  I typically see four main criticisms of Wikipedia:

1) Research projects shouldn’t rely upon encyclopedias. Even Jimmy Wales, (co?-)founder of Wikipedia, acknowledges “I still would say that an encyclopedia is just not the kind of thing you would reference as a source in an academic paper. Particularly not an encyclopedia that could change instantly and not have a final vetting process” (Young).  But an encyclopedia can be a valid starting point for research.  Indeed, The Craft of Research, a classic guide to research, advises that researchers consult reference works such as encyclopedias to gain general knowledge about a topic and discover related works (80).  Wikipedia covers topics often left out of traditional reference works, such as contemporary culture and technology.  Most if not all of the works I looked at used Wikipedia to offer a particular piece of background information, not as a foundation for their argument.

2) Since Wikipedia is constantly undergoing revisions, it is too unstable to cite; what you read and verified today might be gone tomorrow–or even in an hour.  True, but Wikipedia is developing the ability for a particular version of an entry to be vetted by experts and then frozen, so researchers could cite an authoritative, unchanging version (Young).  As the above citation from Berlet indicates, you can already provide a link to a specific version of an article.

3) You can’t trust Wikipedia because anyone (including folks with no expertise, strong biases, or malicious or silly intent) can contribute to it anonymously. Yes, but through the back and forth between “passionate amateurs,” experts, and Wikipedia guardians protecting against vandals, good stuff often emerges. As Nicholson Baker, who has himself edited Wikipedia articles on topics such as Brooklyn Heights and the painter Emma Fordyce MacRae, notes in a delightful essay about Wikipedia, “Wikipedia was the point of convergence for the self-taught and the expensively educated. The cranks had to consort with the mainstreamers and hash it all out” (Baker). As Roy Rosenzweig found in a detailed analysis of Wikipedia’s appropriateness for historical research, the quality of the collaboratively produced Wikipedia entries can be uneven: certain topics are covered in greater detail than others, and the writing can have the choppy, flat quality of something composed by committee. But Rosenzweig also concluded that Wikipedia compares favorably with Encarta and Encyclopedia Britannica for accuracy and coverage.

4) Wikipedia entries lack authority because there’s no peer review. Well, depends on how you define “peer review.”  Granted, Wikipedia articles aren’t reviewed by two or three (typically anonymous) experts in the field, so they may lack the scholarly authority of an article published in an academic journal.  However, articles in Wikipedia can be reviewed and corrected by the entire community, including experts, knowledgeable amateurs, and others devoted to Wikipedia’s mission to develop, collect and disseminate educational content (as well as by vandals and fools, I’ll acknowledge).  Wikipedia entries aim to achieve what Wikipedians call “verifiability”; the article about Barack Obama, for instance, has as many footnotes as a law review article–171 at last count (August 31), including several from this week.

Now I’m certainly not saying that Wikipedia is always a good source for an academic work–there is some dreck in it, as in other sources.  Ultimately, I think Wikipedia’s appropriateness as an academic source depends on what is being cited and for what purpose.   Alan Liu offers students a sensible set of guidelines for the appropriate use of Wikipedia, noting that it, like other encyclopedias, can be a good starting point, but that it is “currently an uneven resource” and always in flux.  Instead of condemning Wikipedia outright, professors should help students develop what Henry Jenkins calls “new media literacies.”  By examining the history and discussion pages associated with each article, for instance, students can gain insight into how knowledge is created and how to evaluate a source.  As John Seely Brown and Richard Adler write:

The openness of Wikipedia is instructive in another way: by clicking on tabs that appear on every page, a user can easily review the history of any article as well as contributors’ ongoing discussion of and sometimes fierce debates around its content, which offer useful insights into the practices and standards of the community that is responsible for creating that entry in Wikipedia. (In some cases, Wikipedia articles start with initial contributions by passionate amateurs, followed by contributions from professional scholars/researchers who weigh in on the “final” versions. Here is where the contested part of the material becomes most usefully evident.) In this open environment, both the content and the process by which it is created are equally visible, thereby enabling a new kind of critical reading—almost a new form of literacy—that invites the reader to join in the consideration of what information is reliable and/or important.(Brown & Adler)

OK, maybe Wikipedia can be a legitimate source for student research papers–and furnish a way to teach research skills.  But should it be cited in scholarly publications?  In “A Note on Wikipedia as a Scholarly Source of Record,” part of the preface to Mechanisms, Matt Kirschenbaum offers a compelling explanation of why he cited Wikipedia, particularly when discussing technical documentation:

Information technology is among the most reliable content domains on Wikipedia, given the high interest of such topics among Wikipedia’s readership and the consequent scrutiny they tend to attract. Moreover, the ability to examine page histories on Wikipedia allows a user to recover the editorial record of a particular entry… Attention to these editorial histories can help users exercise sound judgment as to whether or not the information before them at any given moment is controversial, and I have availed myself of that functionality when deciding whether or not to rely on Wikipedia. (Kirschenbaum xvii)

With Wikipedia, as with other sources, scholars should use critical judgment in analyzing its reliability and appropriateness for citation.  If scholars carefully evaluate a Wikipedia article’s accuracy, I don’t think there should be any shame in citing it.

For more information, review the Zotero report detailing all of the works citing Wikipedia, or take a look at a spreadsheet of basic bibliographic information. I’d be happy to share my bibliographic data with anyone who is interested.

Works Cited

Baker, Nicholson. “The Charms of Wikipedia.” The New York Review of Books 55.4 (2008). 30 Aug 2008 <http://www.nybooks.com/articles/21131>.

Berlet, Chip. “The Write Stuff: U. S. Serial Print Culture from Conservatives out to Neonazis.” Library Trends 56.3 (2008): 570-600. 24 Aug 2008 <http://muse.jhu.edu/journals/library_trends/v056/56.3berlet.html>.

Booth, Wayne C., and Gregory G. Colomb. The Craft of Research. Chicago: U of Chicago P, 2003.

Borgman, Christine L. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, Mass., 2007.

Brown, John Seely, and Richard P. Adler. “Minds on Fire: Open Education, the Long Tail, and Learning 2.0.” EDUCAUSE Review 43.1 (2008): 16-32. 29 Aug 2008 <http://connect.educause.edu/Library/EDUCAUSE+Review/MindsonFireOpenEducationt/45823?time=1220007552>.

Buell, Lawrence. “The Unkillable Dream of the Great American Novel: Moby-Dick as Test Case.” American Literary History 20.1 (2008): 132-155. 24 Aug 2008 <http://muse.jhu.edu/journals/american_literary_history/v020/20.1buell.pdf>.

Dee, Jonathan. “All the News That’s Fit to Print Out.” The New York Times 1 Jul 2007. 30 Aug 2008 <http://www.nytimes.com/2007/07/01/magazine/01WIKIPEDIA-t.html>.

DiZerega, Gus. “Civil Society, Philanthropy, and Institutions of Care.” The Good Society 15.1 (2006): 43-50. 24 Aug 2008 <http://muse.jhu.edu/journals/good_society/v015/15.1diZerega.html>.

Jenkins, Henry. “What Wikipedia Can Teach Us About the New Media Literacies (Part One).” Confessions of an Aca/Fan 26 Jun 2007. 30 Aug 2008 <http://www.henryjenkins.org/2007/06/what_wikipedia_can_teach_us_ab.html>.

Kirschenbaum, Matthew G. Mechanisms: New Media and the Forensic Imagination. Cambridge, Mass.: MIT Press, 2008.

Liu, Alan. “Student Wikipedia Use Policy.” 1 Apr 2007. 30 Aug 2008 <http://www.english.ucsb.edu/faculty/ayliu/courses/wikipedia-policy.html>.

Margolies, Daniel S. “Robert E. Lee: Heroic, But Not the Polio Vaccine.” Reviews in American History 35.3 (2007): 385-392. 25 Aug 2008 <http://muse.jhu.edu/journals/reviews_in_american_history/v035/35.3margolies.html>.

Rosenzweig, Roy. “Can History be Open Source? Wikipedia and the Future of the Past.” The Journal of American History 93.1 (2006): 117-46. <http://chnm.gmu.edu/resources/essays/d/42>.

Young, Jeffrey. “Wikipedia’s Co-Founder Wants to Make It More Useful to Academe.” Chronicle of Higher Education 13 Jun 2008. 28 Aug 2008 <http://chronicle.com/free/v54/i40/40a01801.htm?utm_source=at&utm_medium=en>.

Doing Digital Scholarship: Presentation at Digital Humanities 2008

Note:  Here is roughly what I said during my presentation at Digital Humanities 2008 in Oulu, Finland (or at least meant to say—I was so sleep deprived thanks to the unceasing sunshine that I’m not sure what I actually did say).  My session, which explored the meaning and significance of “digital humanities,” also featured rich, engaging presentations by Edward Vanhoutte on the history of humanities computing and John Walsh on comparing alchemy and digital humanities.  My presentation reports on my project to remix my dissertation as a work of digital scholarship and synthesizes many of my earlier blog posts to offer a sort of Reader’s Digest condensed version of my blog for the past 7 months. By the way, sorry that I’ve been away from the blog for so long.  I’ve spent the last month and a half researching and writing a 100 page report on archival management software,  reviewing essays, performing various other professional duties, and going on both a family vacation to San Antonio and a grown-up vacation to Portland, OR (vegan meals followed up by Cap’n Crunch donuts.  It took me a week to recover from the donut hangover).  In the meantime, lots of ideas have been brewing, so expect many new blog entries soon.

***

When I began working on my dissertation in the mid 1990s, I used a computer primarily to do word processing—and goof off with Tetris.  Although I used digital collections such as Early American Fiction and Making of America for my dissertation project on bachelorhood in 19th C American literature, I did much of my research the old fashioned way: flipping through the yellowing pages of 19th century periodicals on the hunt for references to bachelors, taking notes using my lucky leaky fountain pen.  I relied on books for my research and, in the end, produced a book.

At the same time that I was dissertating, I was also becoming enthralled by the potential of digital scholarship through my work at the University of Virginia’s (late lamented) Electronic Text Center.  I produced an electronic edition of the first section from Donald Grant Mitchell’s bestseller Reveries of a Bachelor that allowed readers to toggle between variants.   I even convinced my department to count Perl as a second language, citing the Matt Kirschenbaum precedent (“come on, you let Matt do it, and look how well that turned out”) and the value of computer languages to my profession as a budding digital humanist.  However, I decided not to create an electronic version of my dissertation (beyond a carefully backed-up Word file) or to use computational methods in doing my research, since I wanted to finish the darn thing before I reached retirement age.

Last year, five years after I received my PhD and seven years after I had become the director of Rice University’s Digital Media Center, I was pondering the potential of digital humanities, especially given mass digitization projects and the emergence of tools such as TAPOR and Zotero. I wondered: What is digital scholarship, anyway? What does it take to produce digital scholarship? What kind of digital resources and tools are available to support it? To what extent do these resources and tools enable us to do research more productively and creatively? What new questions do these tools and resources enable us to ask? What’s challenging about producing digital scholarship? What happens when scholars share research openly through blogs, institutional repositories, and other means?

I decided to investigate these questions by remixing my 2002 dissertation as a work of digital scholarship.  Now I’ll acknowledge that my study is not exactly scientific—there is a rather subjective sample of one.  However, I figured, somewhat pragmatically, that the best way for me to understand what digital scholars face was to do the work myself.  I set some loose guidelines: I would rely on digital collections as much as possible and would experiment with tools for analyzing, annotating, organizing, comparing and visualizing digital information.  I would also explore different ways of representing my ideas, such as hypertextual essays and videos.  Embracing social scholarship, I would do my best to share my work openly and make my research process transparent.  So that the project would be fun and evolve organically, I decided to follow my curiosity wherever it led me, imagining that I would end up with a series of essays on bachelorhood in 19th century American culture and, as sort of an exoskeleton, meta-reflections on the process of producing digital scholarship.

My first challenge was defining digital scholarship.  The ACLS Commission on Cyberinfrastructure’s report points to five manifestations of digital scholarship: collection building, tools to support collection building, tools to support analysis, using tools and collections to produce “new intellectual products,” and authoring tools.   Some might argue we shouldn’t really count tool and collection building as scholarship.  I’ll engage with this question in more detail in a future post, but for now let me say that most consider critical editions, bibliographies, dictionaries and collations, arguably the collections and tools of the pre-digital era, to be scholarship.  In many cases, building academic tools and collections requires significant research and expertise and results in the creation of knowledge—so, scholarship.   Still, my primary focus is on the fourth aspect, leveraging digital resources and tools to produce new arguments.  I’m realizing along the way, though, that I may need to build my own personal collections and develop my own analytical tools to do the kind of scholarship I want to do.

In a recent presentation at CNI, Tara McPherson, the editor of Vectors, offered her own “Typology of Digital Humanities”:
•    The Computing Humanities: focused on building tools, infrastructure, standards and collections, e.g. The Blake Archive
•    The Blogging Humanities: networked, peer-to-peer, e.g. Crooked Timber
•    The Multimodal Humanities: “bring together databases, scholarly tools, networked writing, and peer-to-peer commentary while also leveraging the potential of the visual and aural media that so dominate contemporary life,” e.g. Vectors

Mashing up these two frameworks, my own typology would look something like this:

•    Tools, e.g. TAPOR, Zotero
•    Collections, e.g. The Blake Archive
•    Theories, e.g. McGann’s Radiant Textuality
•    Interpretations and arguments that leverage digital collections and tools, e.g. Ayers and Thomas’ The Difference Slavery Made
•    Networked Scholarship: a term that I borrow from the Institute for the Future of the Book’s Bob Stein and that I prefer to “blogging humanities,” since it encompasses many modes of communication, such as wikis, social bookmarking, institutional repositories, etc. Examples include Savage Minds (a group blog in anthropology), etc.
•    Multimodal scholarship, e.g. scholarly hypertexts and videos, such as what you might find in Vectors
•    Digital cultural studies, e.g. game studies, Lev Manovich’s work, etc (this category overlaps with theories)

Initially I assumed that tools, theories and collections would feed into arguments that would be expressed as networked and/or multimodal scholarship and be informed by digital cultural studies.  But I think that describing digital scholarship as a sort of assembly line in which scholars use tools, collections and theories to produce arguments oversimplifies the process.  My initial diagram of digital scholarship pictured single-headed arrows linking different approaches to digital scholarship; my revised diagram looks more like spaghetti, with arrows going all over the place.  Theories inform collection building; the process of blogging helps to shape an argument; how a scholar wants to communicate an idea influences what tools are selected and how they are used.

After coming up with a preliminary definition of what I wanted to do, I needed to figure out how to structure my work.  I thought of John Unsworth’s notion of scholarly primitives, a compelling description of core research practices.  Depending on how you count them, Unsworth identifies 7 scholarly primitives:
•    Discovering
•    Annotating
•    Comparing
•    Referring
•    Sampling
•    Illustrating
•    Representing

As useful as this list is in crystallizing what scholars do, I think it is missing at least one more crucial scholarly primitive, perhaps the fundamental one: collaboration. Although humanists are stereotyped as solitary scholars isolated in the library, they often work together, whether through co-editing journals or books, sharing citations, or reviewing one another’s work. In the digital humanities, of course, developing tools, standards, and collections demands collaboration among scholars, librarians, programmers, etc. I would also define networked scholarship (blogging, contributing to wikis, etc.) as collaborative, since it requires openly sharing ideas and supports conversation. It’s only appropriate for me to note that this idea was worked out collaboratively, with colleagues at THATCamp.

I want to make my research process as visible as possible, not only for idealistic reasons, but also because my work only gets better the more feedback I receive. So I started up a blog (actually, several of them). At the somewhat grandly-named Digital Scholarship in the Humanities, I reflect on trends in the digital humanities and on broader lessons learned in the process of doing my research project. In “Lisa Spiro’s Research Notes,” I typically address stuff that seems too specialized, half-baked, or even raw for me to put on my main blog, such as my navel gazing on where to take my project next, or my experiments with Open Wound, a language re-mixing tool. My PageFlakes research portal gathers the various parts of my research project in one place, offering RSS feeds for both of my blogs as well as for a Google News search of the term “digital humanities,” my delicious bookmarks for “digital scholarship,” links to my various digital humanities projects, and more.

I’ll admit that when I started my experiments with social scholarship I worried that no one would care, or that I would embarrass myself by writing something really stupid, but so far I’ve loved the experience.  Through comments and emails from readers, I’m able to see other perspectives and improve my own thinking.  I’ve heard from biologists and anthropologists as well as literary scholars and historians, and I’ve communicated with researchers from several countries.  As a result, I feel more engaged in the research community and more motivated to keep working.   Although I know blogging hasn’t caught on in every corner of academia, I think it has been good for my career as a digital humanist.  I am more visible and thus have more opportunities to participate in the community, such as by reviewing book proposals, articles, and grant applications.

I don’t have space to discuss the relevance of each scholarly primitive to my project, but I did want to mention a few of them: discovering, comparing, and representing.

Discovering

In order to use text analysis and other tools, I needed my research materials to be in an electronic format. In the age of mass digitization projects such as Google Books and the Open Content Alliance, I wondered how many of my 296 original research sources are digitized and available in full text. So I diligently searched Google Books and several other sources to find out. I looked at 5 categories: archival resources as well as primary and secondary books and journals. I found that with the exception of archival materials, over 90% of the materials I cited in my bibliography are in a digital format. However, only about 83% of primary resources and 37% of the secondary materials are available as full text. If you want to use text analysis tools on 19th century American novels or 20th century articles from major humanities journals, you’re in luck, but the other stuff is trickier because of copyright constraints. (I’ll throw in another scholarly primitive, annotation, and say that I use Zotero to manage and annotate my research collections, which has made me much more efficient and allowed me to see patterns in my research collections.)

Of course, scholars need to be able to trust the authority of electronic resources.  To evaluate quality, I focused on four collections that have a lot of content in my field, 19th century American literature: Google Books, Open Content Alliance, Early American Fiction (a commercial database developed by UVA’s Electronic Text Center), and Making of America.  I found that there were some scanning errors with Google Books, but not as many as I expected. I wished that Google Books provided full text rather than PDF files of its public domain content, as do Open Content Alliance and Making of America (and EAF, if you just download the HTML).  I had to convert Google’s PDF files to Adobe Tagged Text XML and got disappointing results.  The OCR quality for Open Content Alliance was better, but words were not joined across line breaks, reducing accuracy.  With multi-volume works, neither Open Content Alliance nor Google Books provided very good metadata.  Still, I’m enough of a pragmatist to think that having access to this kind of data will enable us to conduct research across a much wider range of materials and use sophisticated tools to discern patterns – we just need to be aware of the limitations.
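The line-break problem noted above is one that can often be patched before analysis. What follows is only a minimal sketch in Python, not the method used in this project: the function name and the sample string are invented for illustration, and it assumes the OCR text marks a split word with a hyphen at the end of a line.

```python
import re


def join_hyphenated_line_breaks(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line.

    Hypothetical helper for illustration: "senti-\\nmental" becomes
    "sentimental". Text without such breaks is returned unchanged.
    """
    # A word character, a hyphen, optional spaces, a newline, optional
    # indentation, then another word character: drop everything in between.
    return re.sub(r"(\w)-[ \t]*\r?\n[ \t]*(\w)", r"\1\2", text)


# Placeholder text, not a quotation from any of the collections.
sample = "a series of senti-\nmental dreams about mar-\nriage"
print(join_hyphenated_line_breaks(sample))
```

A rule this simple will also fuse genuine compounds that happen to break at a line end ("well-" / "known" becomes "wellknown"), so it trades one kind of error for another; checking joined words against a dictionary would be a natural refinement.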

Comparing
To evaluate the power of text analysis tools for my project, I did some experiments using TAPOR tools, including a comparison of two of my key bachelor texts: Mitchell’s Reveries of a Bachelor, a series of a bachelor’s sentimental dreams (sometimes nightmares) about what it would be like to be married, and Melville’s Pierre, which mixes together elements of sentimental fiction, Gothic literature, and spiritualist tracts to produce a bitter satire. I wondered if there was a family resemblance between these texts. First I used the Wordle word cloud generator to reveal the most frequently appearing words. I noted some significant overlap, including words associated with family such as mother and father, those linked with the body such as hand and eye, and those associated with temporality, such as morning, night, and time. To develop a more precise understanding of how frequently terms appeared in the two texts and their relation to each other, I used TAPOR’s Comparator tool. This tool also revealed words unique to each work, such as “flirt” and “sensibility” in the case of Reveries, “ambiguities” and “miserable” in the case of Pierre. Finally, I used TAPOR’s concordance tool to view key terms in context. I found, for instance, that in Mitchell “mother” is often associated with hands or heart, while in Melville it appears with terms indicating anxiety or deceit. By abstracting out frequently occurring and unique words, I can see how Melville, in a sense, remixes elements of sentimental fiction, putting terms in a darker context. The text analysis tools provide a powerful stimulus to interpretation.
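For readers curious about what such a comparison involves, here is a rough Python sketch of the three steps: counting words, finding shared and unique vocabulary, and viewing a term in context. It is not how TAPOR or Wordle work internally; the function names, the crude tokenizer, and the two sample strings are all assumptions made for illustration, and the strings are placeholders rather than quotations from either novel.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def compare(text_a: str, text_b: str, top_n: int = 5):
    """Return (most frequent shared words, words only in A, words only in B)."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    # Shared words, ordered by combined frequency, then alphabetically.
    shared = sorted(set(a) & set(b), key=lambda w: (-(a[w] + b[w]), w))[:top_n]
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    return shared, only_a, only_b


def concordance(text: str, term: str, width: int = 3) -> list[str]:
    """List each occurrence of term with `width` words of context per side."""
    words = tokenize(text)
    return [
        " ".join(words[max(0, i - width): i + width + 1])
        for i, w in enumerate(words)
        if w == term
    ]


# Placeholder strings standing in for the full texts of the two works.
text_a = "my mother took my hand and my heart was glad"
text_b = "his mother hid the letter and the night was full of deceit"

shared, only_a, only_b = compare(text_a, text_b)
print("shared:", shared)
print("only in A:", only_a)
print("only in B:", only_b)
print(concordance(text_b, "mother", width=2))
```

In practice one would also filter out common function words ("the," "and," "was") before comparing, since otherwise they dominate the shared list.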

Representing
Not only am I using the computer to analyze information, but also to represent my ideas in a more media-rich, interactive way than the typical print article. I plan to experiment with Sophie as a tool for authoring multimodal scholarship, and I’m also experimenting with video as a means for representing visual information. Right now I’m reworking an article on the publication history of Reveries of a Bachelor as a video so that I can show significant visual information such as bindings, illustrations, and advertisements. I’ve condensed a 20+ page article into a 7-minute narrative, which for a prolix person like me is rough. I also have been challenged to think visually and cinematically, considering how the movement of the camera and the style of transitions shape the argument. Getting the right imagery (high quality, copyright free) has been tricky as well. I’m not sure how to bring scholarly practices such as citation into videos. Even though my draft video is, frankly, a little amateurish, putting it together has been lots of fun, and I see real potential for video to allow us to go beyond text and bring the human voice, music, movement and rich imagery into scholarly communication.

On Tools
In the course of my experiments in digital scholarship, I often found myself searching for the right tool to perform a certain task.  Likewise, in my conversations with researchers who aren’t necessarily interested in doing digital scholarship, just in doing their research better, I learned that they weren’t aware of digital tools and didn’t know where to find out about them.  To make it easier for researchers to discover relevant tools, I teamed up with 5 other librarians to launch the Digital Research Tools, or DiRT, wiki at the end of May.   DiRT provides a directory of digital research tools, primarily free but also commercial, categorized by their functions, such as “manage citations.”  We are also writing reviews of tools geared toward researchers and trying to provide examples of how these tools are used by the research community.  Indeed, DiRT focuses on the needs of the community; the wiki evolves thanks to its contributors.   Currently 14 people in fields such as anthropology, communications, and educational technology have signed on to be contributors.  Everything is under a Creative Commons attribution license.  We would love to see spin-offs, such as DiRT in languages besides English; DiRT for developers; and Old DiRT (dust?), the hall of obsolete but still compelling tools.  My experiences with DiRT have demonstrated again the beauty of collaboration and sharing.  Both Dan Cohen of CHNM & Alan Liu of UC Santa Barbara generously offered to let us grab content from their own tools directories.  Busy folks have freely given their time to add tools to DiRT.  Through my work on DiRT, I’ve learned about tools outside of my field, such as qualitative data analysis software.

So I’ll end with an invitation: Please contribute to DiRT.  You can sign up to be an editor or reviewer, recommend tools to be added, or provide feedback via our survey.  Through efforts like DiRT, we hope to enable new digital scholarship, raise the profile of inventive digital tools, and build community.