Monthly Archives: April 2009

Collaborative Authorship in the Humanities

Recently I heard the editors of a history journal and a literature journal say that they rarely published articles written by more than one author—perhaps a couple every few years.   Around the same time, I was looking over a recent issue of Literary and Linguistic Computing and noticed that it included several jointly-authored articles.  This got me wondering:  is collaborative authorship more common in digital humanities than in “traditional” humanities?

“Collaboration” is often associated with “digital humanities.”  Building digital collections, creating software, devising new analytical methods, and authoring multimodal scholarship typically cannot be accomplished by a solo scholar; rather, digital humanities projects require contributions from people with content knowledge, technical skills, design skills, project management experience, metadata expertise, etc.  Our Cultural Commonwealth identifies enabling collaboration as a key feature of the humanities cyberinfrastructure, funders encourage multi-institutional and even international teams, and proponents of increased collaboration in the humanities like Cathy Davidson and Lisa Ede and Andrea A. Lunsford cite digital humanities projects such as Orlando as exemplifying collaborative possibilities.

As a preliminary investigation, I compared the number of collaboratively-written articles published between 2004 and 2008 in two well-respected quarterly journals, American Literary History (ALH) and Literary and Linguistic Computing (LLC).  Both journals are published by Oxford University Press as part of its humanities catalog. I selected ALH because it is a leading journal on American literature and culture that encourages critical exchanges and interdisciplinary work—and because I thought it would be fun to see what the journal has published since 2004. (The hardest part of my research: resisting the urge to stop and read the articles.)  LLC, the official publication of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities, includes contributions on digital humanities from around the world—the UK, the US, Germany, Australia, Greece, Italy, Norway, etc.—and from many disciplines, such as literature, linguistics, computer and information science, statistics, librarianship, and biochemistry.  To determine the level of collaborative authorship in each issue, I tallied articles that had more than one author, excluding editors’ introductions, notes on contributors, etc.  For LLC, I counted everything that had an abstract as an article.  While I didn’t count LLC’s reviews, which typically are brief and focus on a single work, I did include the review essays published by ALH, since they are longer and synthesize critical opinion about several works.

So what did I find? Whereas 5 of 259 (1.93%) articles published in ALH, about one a year, feature two authors (none had more than two), 70 of 145 (48.28%) of the articles published in LLC were written by two or more authors.  Most (4 of 5, or 80%) of the jointly authored ALH articles were written by scholars from multiple institutions, whereas 49% (34 of 70) of the jointly authored LLC articles were.  About 16% (11 of 70) of the LLC articles featured contributors from two or more countries, while none of the ALH articles did.  Two of the five ALH articles are review essays, while three focus on hemispheric or transatlantic American studies.  Although this study should be carried out more systematically across a wider range of journals, the initial results do suggest that collaborative authorship is more common in digital humanities. [See the Zotero reports for ALH and LLC for more information.]
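The percentages above follow directly from the raw tallies. As a quick check, here is a minimal Python sketch that recomputes them. The counts are the ones reported in this post; the dictionary layout and the names `pct` and `summarize` are illustrative choices, not part of the Zotero reports.

```python
# Tallies reported in the post for articles published 2004-2008.
COUNTS = {
    "ALH": {"articles": 259, "coauthored": 5, "multi_institution": 4, "multi_country": 0},
    "LLC": {"articles": 145, "coauthored": 70, "multi_institution": 34, "multi_country": 11},
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


def summarize(counts):
    """Compute the share of coauthored articles per journal, plus the share
    of those coauthored articles that span institutions or countries."""
    rows = {}
    for journal, c in counts.items():
        rows[journal] = {
            "coauthored_pct": pct(c["coauthored"], c["articles"]),
            "multi_institution_pct": pct(c["multi_institution"], c["coauthored"]),
            "multi_country_pct": pct(c["multi_country"], c["coauthored"]),
        }
    return rows


SUMMARY = summarize(COUNTS)

if __name__ == "__main__":
    for journal, row in SUMMARY.items():
        print(journal, row)
```

Note that the multi-institution and multi-country shares use the number of coauthored articles as the denominator, not the total number of articles, which matches how the figures are stated above.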

Why does LLC feature more collaboratively written articles than ALH? I suspect it is because, as I’ve already suggested, digital humanities projects often require collaboration, whereas most literary criticism can be produced by an individual scholar who needs only texts to read, a place to write, and a computer running a word processing application (as well as a library to provide access to texts, colleagues to consult and to review the resulting research, a university and/or funding agency to support the research, a publisher to disseminate the work, etc.).   Moreover, LLC represents a sort of meeting point for a range of disciplines, including several (such as computer science) that have a tradition of collaborative authorship.  Whereas collaborative authorship is common (even expected) in the sciences, in the humanities many tenure and promotion committees have not yet developed mechanisms for evaluating and crediting collaborative work. In a recent blog post, for example, Cathy Davidson tells a troubling story about being told (in a public and humiliating way) by a member of a search committee that her collaborative work and other “non-traditional” research didn’t “count.”  Literary study values individual interpretation, or what Davidson calls “the humanistic ethic of individuality.”

While individual scholarship remains valid and important, shouldn’t humanities scholarship expand to embrace collaborative work as well?  Indeed, in 2000 the MLA launched an initiative to consider “alternatives to the adversarial academy” and encourage collaborative scholarship.  (By the way, I’m not criticizing ALH; I doubt that it receives many collaboratively-authored submissions, and it has encouraged critical exchange and interdisciplinary research.)  Of course, collaboration poses some significant challenges, such as divvying up and managing work, negotiating conflicts, finding funding for complex projects, assigning credit, etc.    But as Lisa Ede and Andrea A. Lunsford point out, collaborative authorship can lead to a “widening of scholarly possibilities.”  In talking to humanities scholars (particularly those in global humanities), I’ve noticed genuine enthusiasm about collaborative work that allows scholars to engage in community, consider alternative perspectives, and undertake ambitious projects that require diverse skills and/or knowledge.

What kind of collaborations do the jointly-written articles in LLC and ALH represent? Since LLC often lists only the authors’ institutional affiliations, not their departments, tracing the degree of interdisciplinary collaboration would require further research.  However, I did find examples of several types of collaboration (which may overlap):

  • Faculty/student collaboration: In the sciences, faculty frequently publish with their postdocs and students, a practice that seems to be rare in the humanities.  I noted at least one example of a similar collaboration in LLC—involving, I should note, computer science rather than humanities grad students.
    • Urbina, Eduardo et al. “Visual Knowledge: Textual Iconography of the Quixote, a Hypertextual Archive.” Lit Linguist Computing 21.2 (2006): 247-258. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/21/2/247>.
      This article includes contributions by a professor of Hispanic studies, a professor of computer science, a librarian/archivist/adjunct English professor, and three graduate students in computer science.
  • Project teams: In digital humanities, collaborators often work together on projects to build digital collections, develop software, etc.  In LLC, I found a number of articles written by project teams, such as:
    • Barney, Brett et al. “Ordering Chaos: An Integrated Guide and Online Archive of Walt Whitman’s Poetry Manuscripts.” Lit Linguist Computing 20.2 (2005): 205-217. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/20/2/205>.
      Members of the project team included an archivist, programmer, digital initiatives librarian, English professor, and two English Ph.Ds who serve as library faculty and focus on digital humanities.
  • Interdisciplinary collaborations: In LLC, I noted several instances of teams that included humanities scholars and scientists working together to apply particular methods (text mining, stemmatic analysis) in the humanities.  For example:
    • Windram, Heather F. et al. “Dante’s Monarchia as a test case for the use of phylogenetic methods in stemmatic analysis.” Lit Linguist Computing 23.4 (2008): 443-463. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/23/4/443>.
      The authors include two biochemists, a textual scholar, and a scholar of Italian literature.
    • Sculley, D., and Bradley M. Pasanek. “Meaning and mining: the impact of implicit assumptions in data mining for the humanities.” Lit Linguist Computing 23.4 (2008): 409-424. 5 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/23/4/409>.
      Authored by a computer scientist and a literature professor.
  • Shared interests: Researchers may publish together because they share an intellectual kinship and can accomplish more by working together.  For instance:
    • Auerbach, Jonathan, and Lisa Gitelman. “Microfilm, Containment, and the Cold War.” American Literary History 19.3 (2007).
      I noticed that Jonathan Auerbach and Lisa Gitelman thank each other in works that each had previously published as an individual.

Observing that LLC publishes a number of collaboratively-written articles opens up several questions, which I hope to pursue through interviews with the authors of at least some of these articles (if you are one of these authors, you may see an email from me soon….):

1)    What characterizes the LLC articles that have only one author?
Based on a quick look at the tables of contents from past issues, I suspect that these articles are more likely to be theoretical or to focus on particular problems rather than projects.  Here, for example, are the titles of some singly-authored articles:  “The Inhibition of Geographical Information in Digital Humanities Scholarship,” “Monkey Business—or What is an Edition?,” “What Characterizes Pictures and Text?” and “Original, Authentic, Copy: Conceptual Issues in Digital Texts.”

2)    Why was the article written collaboratively?

What led to the collaboration?  Did team members offer complementary skill sets, such as knowledge of statistical methods and understanding of the content? How did the collaborators come together—do they work for the same institution? Did they meet at a conference? Do they cite each other?

3)    What were the outcomes of the collaboration?

What was accomplished through collaboration that would have been difficult to do otherwise?  Would the scale of the project be smaller if it were pursued by a single scholar? Did the project require contributions from people with different types of expertise?

4)    How was the collaboration managed and sustained?

Was one person in charge, or was authority distributed? What tools were used to facilitate communication, track progress on the project, and support collaborative writing? To what degree was face-to-face interaction important?

5)    What was difficult about the collaboration?

What was hard about collaborating: Communicating? Identifying who does what? Agreeing on methods? Coming to a common understanding of results? Finding funding?

We can find answers to some of these questions in Lynne Siemens’ recent article “‘It’s a team if you use “reply all”’: An exploration of research teams in digital humanities environments.”  Siemens describes factors contributing to the success of collaborative teams in digital humanities, such as clear milestones and benchmarks, strong leadership, equal contributions by members of the team, and a balance between communication through digital tools and in-person meetings.  I particularly liked the description of “a successful team as a ‘round thing’ with equitable contribution by individual members.”

In doing this research, I realized how much it would benefit from collaborators.  For instance, someone with expertise in citation analysis could help enlarge the study and detect patterns in collaborative authorship, while someone with expertise in qualitative research methods could help to interview collaborative research teams and analyze the resulting data.  However, I think anyone with an interest in the topic could make valuable contributions.  This is by way of leading up to my pitch: I’m working on a piece about collaborative research methods in digital humanities for an essay collection and would welcome collaborators.  If you’re interested in teaming up, contact me at lspiro@rice.edu.

Works Cited

Davidson, Cathy N. “What If Scholars in the Humanities Worked Together, in a Lab?” The Chronicle of Higher Education 28 May 1999. 18 Apr 2009 <http://chronicle.com/weekly/v45/i38/38b00401.htm>.

Ede, Lisa, and Andrea A. Lunsford. “Collaboration and Concepts of Authorship.” PMLA 116.2 (2001): 354-369. 18 Apr 2009 <http://www.jstor.org/stable/463522>.

Siemens, Lynne. “‘It’s a team if you use “reply all”’: An exploration of research teams in digital humanities environments.” Lit Linguist Computing (2009): fqp009. 14 Apr 2009 <http://llc.oxfordjournals.org/cgi/content/abstract/fqp009v1>.

Digital Humanities in 2008, III: Research

In this final installment of my summary of Digital Humanities in 2008, I’ll discuss developments in digital humanities research. (I should note that if I attempted to give a true synthesis of the year in digital humanities, this would be coming out 4 years late rather than 4 months, so this discussion reflects my own idiosyncratic interests.)

1) Defining research challenges & opportunities

What are some of the key research challenges in digital humanities? Leading scholars tackled this question when CLIR and the NEH convened a workshop on Promoting Digital Scholarship: Formulating Research Challenges In the Humanities, Social Sciences and Computation. Prior to the workshop, six scholars in classics, architectural history, physics/information sciences, literature, visualization, and information retrieval wrote brief overviews of their fields and of the ways that information technology could help to advance them. By articulating the central concerns of their fields so concisely, these essays promote interdisciplinary conversation and collaboration; they’re also fun to read. As Doug Oard writes in describing the natural language processing “tribe,” “Learning a bit about the other folks is a good way to start any process of communication… The situation is really quite simple: they are organized as tribes, they work their magic using models (rather like voodoo), they worship the word ‘maybe,’ and they never do anything right.” Sounds like my kind of tribe. Indeed, I’d love to see a wiki where experts in fields ranging from computational biology to postcolonial studies write brief essays about their fields, provide a bibliography of foundational works, and articulate both key challenges and opportunities for collaboration. (Perhaps such information could be automatically aggregated using semantic technologies such as Concept Web or Kosmix, but I admire the often witty, personal voices of these essays.)

Here are some key ideas that emerge from the essays:

  1. Global Humanistic Studies: Both Caroline Levander and the team of Greg Crane, Alison Babeu, David Bamman, Lisa Cerrato, and Rashmi Singhal call for a sort of global humanistic studies, whether re-conceiving American studies from a hemispheric perspective or re-considering the Persian Wars from the Persian point of view. Scholars working in global humanistic studies face significant challenges, such as the need to read texts in many languages and understand multiple cultural contexts. Emerging technologies promise to help scholars address these problems. For instance, named entity extraction, machine translation and reading support tools can help scholars make sense of works that would otherwise be inaccessible to them; visualization tools can enable researchers “to explore spatial and temporal dynamism;” and collaborative workspaces allow scholars to divide up work, share ideas, and approach a complex research problem from multiple perspectives. Moreover, a shift toward openly accessible data will enable scholars to more easily identify and build on relevant work. Describing how reading support tools enable researchers to work more productively, Crane et al. write, “By automatically linking inflected words in a text to linguistic analyses and dictionary entries we have already allowed readers to spend more time thinking about the text than was possible as they flipped through print dictionaries. Reading support tools allow readers to understand linguistic sources at an earlier stage of their training and to ask questions, no matter how advanced their knowledge, that were not feasible in print.” We can see a similar intersection between digital humanities and global humanities in projects like the Global Middle Ages.
  2. What skills do humanities scholars need? Doug Oard suggests that humanities scholars should collaborate with computer scientists to define and tackle “challenge problems” so that the development of new technologies is grounded in real scholarly needs. Ultimately, “humanities scholars are going to need to learn a bit of probability theory” so that they can understand the accuracy of automatic methods for processing data, the “science of maybe.” How does probability theory jibe with humanistic traditions of ambiguity and interpretation? And how are humanities scholars going to learn these skills?

According to the symposium, major research challenges for the digital humanities include:

  1. “Scale and the poverty of abundance”: developing tools and methods to deal with the plenitude of data, including text mining and analysis, visualization, data management and archiving, and sustainability.
  2. Representing place and time: figuring out how to support geo-temporal analysis and enable that analysis to be documented, preserved, and replicated.
  3. Social networking and the economy of attention: understanding research behaviors online; analyzing text corpora based on these behaviors (e.g. citation networks).
  4. Establishing a research infrastructure that facilitates access, interdisciplinary collaboration, and sustainability. As one participant asked, “What is the Protein Data Bank for the humanities?”

2) High performance computing: visualization, modeling, text mining

What are some of the most promising research areas in digital humanities? In a sense, the three recent winners of the NEH/DOE’s High Performance Computing Initiative define three of the main areas of digital humanities and demonstrate how advanced computing can open up new approaches to humanistic research.

  • Text mining and text analysis: For its project on “Large-Scale Learning and the Automatic Analysis of Historical Texts,” the Perseus Digital Library at Tufts University is examining how words in Latin and Greek have changed over time by comparing the linguistic structure of classical texts with works written in the last 2000 years. In the press release announcing the winners, David Bamman, a senior researcher in computational linguistics with the Perseus Project, said that “[h]igh performance computing really allows us to ask questions on a scale that we haven’t been able to ask before. We’ll be able to track changes in Greek from the time of Homer to the Middle Ages. We’ll be able to compare the 17th century works of John Milton to those of Vergil, which were written around the turn of the millennium, and try to automatically find those places where Paradise Lost is alluding to the Aeneid, even though one is written in English and the other in Latin.”
  • 3D modeling: For its “High Performance Computing for Processing and Analysis of Digitized 3-D Models of Cultural Heritage” project, the Institute for Advanced Technology in the Humanities at the University of Virginia will reprocess existing data to create 3D models of culturally-significant artifacts and architecture. For example, IATH hopes to re-assemble fragments that chipped off ancient Greek and Roman artifacts.
  • Visualization and cultural analysis: The University of California, San Diego’s Visualizing Patterns in Databases of Cultural Images and Video project will study contemporary culture, analyzing datastreams such as “millions of images, paintings, professional photography, graphic design, user-generated photos; as well as tens of thousands of videos, feature films, animation, anime music videos and user-generated videos.” Ultimately the project will produce detailed visualizations of cultural phenomena.

Winners received compute time on a supercomputer and technical training.

Of course, there’s more to digital humanities than text mining, 3D modeling, and visualization. For instance, the category listing for the Digital Humanities and Computer Science conference at Chicago reveals the diversity of participants’ fields of interest. Top areas include text analysis; libraries/digital archives; imaging/visualization; data mining/machine learning; information retrieval; semantic search; collaborative technologies; electronic literature; and GIS mapping. A simple analysis of the most frequently appearing terms in the Digital Humanities 2008 Book of Abstracts suggests that much research continues to focus on text, which makes sense given the importance of written language to humanities research.  Here’s the list that TAPOR generated of the 10 most frequently used terms in the DH 2008 abstracts:

  1. text: 769
  2. digital: 763
  3. data: 559
  4. information: 546
  5. humanities: 517
  6. research: 501
  7. university: 462
  8. new: 437
  9. texts: 413
  10. project: 396

“Images” is used 161 times, “visualization” 46.
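A term-frequency list of this kind is simple to approximate. The sketch below is not TAPOR's implementation; it is a minimal Python stand-in using `collections.Counter`, and the stop-word list, the `top_terms` function, and the sample sentence are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop list (an assumption); real tools use longer ones.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is", "for", "on", "that"}


def top_terms(text, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent lowercase word tokens, skipping stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return counts.most_common(n)


# Invented sample text, standing in for the Book of Abstracts.
SAMPLE = "The text of the digital text is data. Texts and data: digital text."

if __name__ == "__main__":
    for term, count in top_terms(SAMPLE, n=3):
        print(f"{term}: {count}")
```

Because nothing here stems or lemmatizes the tokens, "text" and "texts" are tallied as separate terms, just as they appear separately in the list above.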

[Image: Wordle word cloud of the Digital Humanities 2008 Book of Abstracts]

And here’s the word cloud. As someone who got started in digital humanities by marking up texts in TEI, I’m always interested in learning about developments in encoding, analyzing and visualizing texts, but some of the coolest sessions I attended at DH 2008 tackled other questions: How do we reconstruct damaged ancient manuscripts? How do we archive dance performances? Why does the digital humanities community emphasize tools instead of services?

3) Focus on method

As digital humanities emerges, much attention is being devoted to developing research methodologies. In “Sunset for Ideology, Sunrise for Methodology?,” Tom Scheinfeldt suggests that humanities scholarship is beginning to tilt toward methodology, that we are entering a “new phase of scholarship that will be dominated not by ideas, but once again by organizing activities, both in terms of organizing knowledge and organizing ourselves and our work.”

So what are some examples of methods developed and/or applied by digital humanities researchers? In “Meaning and mining: the impact of implicit assumptions in data mining for the humanities,” Bradley Pasanek and D. Sculley tackle methodological challenges posed by mining humanities data, arguing that literary critics must devise standards for making arguments based upon data mining. Through a case study testing Lakoff’s theory that political ideology is defined by metaphor, Pasanek and Sculley demonstrate that the selection of algorithms and representation of data influence the results of data mining experiments. Insisting that interpretation is central to working with humanities data, they concur with Steve Ramsay and others in contending that data mining may be most significant in “highlighting ambiguities and conflicts that lie latent within the text itself.” They offer some sensible recommendations for best practices, including making assumptions about the data and texts explicit; using multiple methods and representations; reporting all trials; making data available and experiments reproducible; and engaging in peer review of methodology.
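To make the point about representation concrete, here is a toy Python sketch. It is not Pasanek and Sculley's experiment: the three "texts", the whitespace tokenizer, and the function names are invented for illustration. It asks the same question (which candidate text is closest to the query by cosine similarity?) under two representations, raw term counts and binary term presence, and gets two different answers.

```python
import math
from collections import Counter


def vectorize(text, binary=False):
    """Represent a text as term counts, or as 0/1 term presence if binary."""
    counts = Counter(text.lower().split())
    if binary:
        return {term: 1 for term in counts}
    return dict(counts)


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(query, candidates, binary):
    """Return the name of the candidate most similar to the query."""
    q = vectorize(query, binary)
    return max(
        candidates,
        key=lambda name: cosine(q, vectorize(candidates[name], binary)),
    )


# Invented toy data, not drawn from any real corpus.
QUERY = "war war war war peace treaty"
CANDIDATES = {
    "d1": "war war war",
    "d2": "peace treaty trade",
}

if __name__ == "__main__":
    print("raw counts ->", nearest(QUERY, CANDIDATES, binary=False))
    print("binary     ->", nearest(QUERY, CANDIDATES, binary=True))
```

With raw counts the repeated word dominates and the query sits closest to "d1"; with binary presence the two shared terms outweigh the single shared term and "d2" wins. That is one small reason to report which representation was used and to try more than one.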

4) Digital literary studies

Different methodological approaches to literary study are discussed in the Companion to Digital Literary Studies (DLS), which was edited by Susan Schreibman and Ray Siemens and was released for free online in the fall of 2008. Kudos to its publisher, Blackwell, for making the hefty volume available, along with A Companion to Digital Humanities. The book includes essays such as “Reading digital literature: surface, data, interaction, and expressive processing” by Noah Wardrip-Fruin, “The Virtual Codex from page space to e-space” by Johanna Drucker, “Algorithmic criticism” by Steve Ramsay, and “Knowing true things by what their mockeries be: modelling in the humanities” by Willard McCarty. DLS also provides a handy annotated bibliography by Tanya Clement and Gretchen Gueguen that highlights some of the key scholarly resources in literature, including Digital Transcriptions and Images, Born-Digital Texts and New Media Objects, and Criticism, Reviews, and Tools. I expect that the book will be used frequently in digital humanities courses and will be a foundational work.

5) Crafting history: History Appliances

For me, the coolest—most innovative, most unexpected, most wow!—work of the year came from the ever-inventive Bill Turkel, who is exploring humanistic fabrication (not in the Mills Kelly sense of making up stuff ;), but in the DIY sense of making stuff). Turkel is working on “materialization,” giving a digital representation physical form by using, for example, a rapid prototyping machine, a sort of 3D printer. Turkel points to several reasons why humanities scholars should experiment with fabrication: they can be like DaVinci, making the connection between the mind and hand by realizing an idea in physical form; study the past by recreating historical objects (fossils, historical artifacts, etc) that can be touched, rotated, scrutinized; explore “haptic history,” a sensual experience of the past; and engage in “Critical technical practice,” where scholars both create and critique.

Turkel envisions making digital information “available in interactive, ambient and tangible forms.”  As Turkel argues, “As academic researchers we have tended to emphasize opportunities for dissemination that require our audience to be passive, focused and isolated from one another and from their surroundings. We need to supplement that model by building some of our research findings into communicative devices that are transparently easy to use, provide ambient feedback, and are closely coupled with the surrounding environment.” Turkel and his team are working on 4 devices: a dashboard, which shows both public and customized information streams on a large display; imagescapes and soundscapes that present streams of complex data as artificial landscapes or sound, aiding awareness; a GeoDJ, which is an iPod-like device that uses GPS and GIS to detect your location and deliver audio associated with it ( e.g. percussion for an historic industrial site); and ice cores and tree rings, “tangible browsers that allow the user to explore digital models of climate history by manipulating physical interfaces that are based on this evidence.” This work on ambient computing and tangible interfaces promises to foster awareness and open up understanding of scholarly data by tapping people’s natural way of comprehending the world through touch and other forms of sensory perception. (I guess the senses of smell and taste are difficult to include in sensual history, although I’m not sure I want to smell or taste many historical artifacts or experiences anyway. I would like to re-create the invention of the Toll House cookie, which for me qualifies as an historic occasion.) This approach to humanistic inquiry and representation requires the resources of a science lab or art studio—a large, well-ventilated space as well as equipment like a laser scanner, lathes, mills, saws, calipers, etc. 
Unfortunately, Turkel has stopped writing his terrific blog “Digital History Hacks” to focus on his new interests, but this work is so fascinating that I’m anxious to see what comes next, which describes my attitude toward digital humanities in general.