Monthly Archives: March 2008

Becoming a “Digital Scholar”: Digital Discovery 2008

[Below is the text of a presentation that I will be giving at the Digital Discovery conference on March 27, 2008]

When I started a graduate program in English way back in 1992, I used computers mainly to write papers. Then came the web. Within a few years, I was creating web-based assignments for the undergraduate courses I was teaching, marking up electronic texts for the University of Virginia’s Electronic Text Center, creating my own electronic edition of a nineteenth-century sentimental sketch, and copy-editing articles for one of the first humanities journals to be published exclusively online, Postmodern Culture. Even though I was actively engaged in what has now come to be called “digital humanities,” I still did much of the research for my dissertation, which examines bachelorhood in nineteenth-century American literature and culture, the old-fashioned way. I wandered the stacks of Alderman Library smelling the decaying books, skimmed print catalogs such as Lyle Wright’s bibliography of American fiction, flipped through 150-year-old volumes of magazines such as Harper’s and The Atlantic, and even took notes with a fountain pen.

I finished my dissertation in 2002. After spending so much time laboring over it, I wanted to move on. But 5 years later, I decided I was ready to take up with the diss again, on new terms. I’m fascinated by the question of how the abundance of digital information and the development of new technologies will affect humanities scholarship, and it seemed to me that the best way for me to understand these transformations would be to undertake a major research project myself. By revisiting my dissertation, I could build on my existing knowledge and compare my research process 5+ years ago to what’s possible today. Thus I decided to remix my dissertation as a work of digital scholarship.

So what’s digital scholarship? According to the ACLS report on cyberinfrastructure for the humanities and social sciences, digital scholarship includes building digital collections, creating tools for collecting, analyzing, and authoring digital information, and “using digital collections and analytical tools to generate new intellectual products.” My project reflects that last idea, as I am exploring the implications of using digital collections and tools. My work is still very much in process, but here are three preliminary observations:

  1. A vast amount of information is now available online. When I first started working on my dissertation, I wished that there were some way for me to search not only bibliographic information, but also the content of works themselves. Well, that wish is beginning to come true. We don’t know exactly how many books Google has digitized, but the number is well over a million, and of course many more works have been digitized by the Open Content Alliance, Microsoft’s Live Book Search, and countless libraries and archives. I was curious about how many of the nearly 300 works I cite in my dissertation bibliography are now available online, so I searched for them in Google Books, Making of America, online journal collections, and other sites. I found that 77% of my primary sources and 22% of my secondary sources are available online as full text, while 92% of all my research materials have been digitized (this number includes works available through Google Books as limited preview, snippet view, and no preview). Of these resources, 13% (mostly journal articles) require a subscription.

    Now I should note that I study 19th-century American literature, which is safely out of copyright and ubiquitous at most research libraries. Still, many significant materials haven’t been digitized, particularly periodical literature and archival materials. Other works are only available through subscription. Even if the resource has been digitized, it often has errors: metadata can be unclear or incomplete, scans can show the fingers of the scanning operators or be cut off, and the quality of the OCR can be poor. Nevertheless, the availability of so much digital information means that how we do research will change.

  2. As we deal with the abundance of information, we need tools to find, organize, manage, analyze and share our research materials. Fortunately, those tools are beginning to be developed.

    For example, when I started researching my dissertation, I wasted a lot of time attempting to organize my notes and looking up bibliographic information that I hadn’t captured accurately. Now I manage my research much more effectively by using Zotero, a free Firefox-based research tool developed by George Mason University’s Center for History and New Media. Zotero automatically captures bibliographic information from hundreds of supported web sites and lets you insert properly formatted notes and bibliographies as you write a paper in Word. Moreover, you can take notes in Zotero, tag your resources, organize them into collections and sub-collections, and search across them. The next version of Zotero will support sharing bibliographic resources on the network.

    To detect patterns in the texts that I collect, I am using text analysis tools such as those developed by TAPOR. With TAPOR, you can create a concordance, compare texts, and look for co-occurring words. I’ve already used TAPOR to generate a list of the most frequently occurring terms in the first chapter of my dissertation, then compared that list to one I created manually. While my own list mainly focused on different descriptors of the bachelor, the TAPOR list reflected key components of my argument, including words associated with domesticity, nationhood, and identity. Text analysis tools can make implicit knowledge explicit and open your eyes to patterns you hadn’t previously been aware of.

    Beyond text, visualization and mashup tools allow you to make sense of data such as demographic information, troop movements, and even patterns in the correspondences of historical figures such as Thomas Jefferson. Ed Ayers envisions historians using dynamic “social weather maps” that allow them to watch historical forces in process. For my own project, I plan to explore the geographic and temporal nature of bachelor literature by developing several interactive maps that show where bachelor narratives were set and where their authors were born, as well as timelines that plot the publication history of bachelor literature.

    Having texts in open formats (such as XML) makes it much easier to analyze and manipulate them; otherwise you have to go through a cumbersome conversion process. I’m OCRing the PDFs I downloaded from Google Books so I can then use text analysis tools on them, but I understand that the resulting text will be somewhat unreliable.

  3. Although the journal and monograph still dominate the humanities, new means of scholarly communication are emerging, enabling the faster dissemination of ideas, more community dialog, and the use of multimedia.
    • I recently started blogging about my research project. Through blogging, I’ve become much more engaged in my research community and have been energized by the generous responses from friends and leaders in digital humanities. Readers have helped me to think through ideas by offering alternative perspectives and alerting me to resources I hadn’t been aware of. Blogging motivates me to follow developments in my field more closely and to synthesize what I’m observing. My blog is also a great memory aid: in preparing this talk, I’ve remembered, “Oh yeah, I blogged about that” and have been able to pull up the relevant entry quickly. I’m reaching far more people than I would through traditional means of publication. For instance, during January alone, my blog was viewed 2,725 times, far more times, I suspect, than any of my articles have been read.
    • Our culture is increasingly a visual one, dominated by TV, movies, and YouTube videos, but we don’t yet have many examples of video-based scholarship. However, some interesting models are emerging. SciVee, which is sponsored by the Public Library of Science, NSF, and the San Diego Supercomputer Center, makes it easy for scientists to upload videos that accompany published articles, making their work more accessible and visible. Anthropologist Michael Wesch’s video “The Machine is Us/ing Us” demonstrates the potential of video as a means for disseminating ideas: it has been viewed almost 5 million times and explores Web 2.0 in, well, a Web 2.0 way, illustrating the dynamic, interactive nature of the Web. Inspired by these examples, as well as by digital storytelling, I’m planning to create short videos that allow me to express ideas difficult to explore in print. For example, I plan to survey America’s changing perception of the bachelor by showing images of bachelors from the 19th and 20th centuries (accompanied by bachelor songs). I am also working on a short video about the history of the bestseller Reveries of a Bachelor, which went through many editions and changed its physical format as the publisher dreamed up ways to keep it in demand.
    • Then there are the scholars who are simply—and significantly—making their works available online through open access repositories. In so doing, these researchers are making it possible for independent scholars and those at institutions without big library budgets to access their work, advancing the democracy of knowledge. Moreover, they are likely increasing their own visibility as scholars. As Michael Jensen argues, scholars will increasingly be evaluated based on the impact of their work, which will be measured by factors such as number and quality of citations, blog comments, and links to the document.

    Lest I seem naïve, I should acknowledge that significant challenges face digital scholarship. Many studies, including the ACLS report on cyberinfrastructure and the UC Berkeley Center for Studies in Higher Education reports on faculty attitudes toward digital resources, detail these challenges, but let me just mention a few. Many of the humanities scholars I’ve talked with are not yet aware of digital scholarship. Already feeling stretched by obligations to do research, teach, and perform service, few academics have time to learn new technologies. As the digital environment constantly shifts and tools come and go, it’s overwhelming trying to keep up. In any case, the system doesn’t really reward faculty for experimenting with new technologies. According to a recent MLA report, over half of the tenure committees in the humanities have no experience evaluating “scholarly monographs in electronic format.” Many researchers feel that they will be penalized if they don’t publish in the most prestigious, well-established journals. Then there’s copyright: for my remixed work on bachelorhood, I’d love to provide links to the full text of every work that I cite. I’d also like to remix those original sources to produce new works. But what I can do is constrained by copyright.

    I believe that many of these challenges will be overcome. For example, scholarly societies like the MLA are recognizing the validity of digital scholarship. Organizations like NINES, which focuses on nineteenth-century studies, are providing tools, training, content portals, and support for scholars, as well as conferring legitimacy on digital scholarship. The NEH just turned its Digital Humanities Initiative into a full-fledged Office. Creative Commons is pressing for greater clarity on copyright.

    What effect will the computer revolution have on humanities scholarship? It’s really too early to say; in a small way, that is what I’m trying to figure out in my project. In the sciences, we’ve seen the rise of new sub-disciplines and methodologies made possible through computation and data archives. In the humanities, I believe that being able to access the full text of millions of books will bring about significant changes in how research is conducted. In a great blog post from earlier this month, Tom Scheinfeldt from the Center for History and New Media suggests that historical scholarship will shift from a focus on ideology to a focus on methodology. As my colleague Jane Segal and I found in our study of the impact of digital archives on humanities scholarship, the Walt Whitman Archive is opening up new areas of inquiry in Whitman studies. For instance, Whitman scholars are shifting to manuscript study and paying increased attention to versions of Leaves of Grass besides the first and deathbed editions. As I’ve discovered through my own attempt to go digital, many challenges lie ahead, but I’m motivated by the opportunity to be creative, learn new skills, and have an impact on scholarship.
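A minimal sketch of the kind of term-frequency list mentioned in the second observation above, written in plain Python. This is not TAPOR's implementation or interface; the function name `top_terms`, the tiny stop-word list, and the sample sentence are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# real text analysis tools ship much longer ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "by", "at"}


def top_terms(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, ignoring case and stop words."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


# Made-up sample sentence, not a quotation from any source.
sample = (
    "The bachelor dreams of home. The bachelor sits by the fire, "
    "and home is the dream of the bachelor."
)
print(top_terms(sample, n=2))  # [('bachelor', 3), ('home', 2)]
```

On a real chapter you would read the text from a file and use a much longer stop-word list. Comparing the result with a hand-made list of key terms is then a matter of reading the two side by side.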

Signs that social scholarship is catching on in the humanities

To what extent are humanities researchers practicing “social scholarship”—embracing openness, accessibility and collaboration in producing their work? In defining the characteristics of the humanities cyberinfrastructure, the report of the ACLS Commission on Cyberinfrastructure recommends that it should be “accessible” and “facilitate collaboration.” At the same time, the report contends that solitary scholarship is the norm in the humanities: “Despite the demonstrated value of collaboration in the sciences, there are relatively few formal digital communities and relatively few institutional platforms for online collaboration in the humanities. In these disciplines, single-author work continues to dominate.” Recently, however, I’ve observed several trends that suggest increasing experimentation with collaborative tools and approaches in the humanities:

1) Individual commitment by scholars to open access
Recently several prominent humanities scholars have voiced strong support for open access publishing. For instance, Nick Montfort has stated that he will no longer review articles for non-open access journals. Likewise, danah boyd has declared that she will no longer publish in journals where content is not freely available and that “scholars have a responsibility to make their work available as a public good.” As part of a forum on open access in Anthropology News, Chris Kelty articulated his reluctance to peer-review articles “for a multinational corporation with shareholders and an enormous profit margin” when he isn’t compensated for his labor. Such declarations are increasing awareness of open access and stirring up an important debate about whether it is feasible and desirable. By making publications freely available online, scholars reach a larger audience, serve the fundamental scholarly mission to advance public knowledge, and make their own work more visible. Of course, there are significant economic and cultural obstacles to open access, obstacles that I will look at in my next post.

2) Development of open access publishing outlets
The commitment to publish only in open access journals won’t go very far if there aren’t appropriate forums for this scholarship (unless authors choose to self-publish their work). Already the Directory of Open Access Journals lists 554 humanities journals, including Digital Humanities Quarterly, Transformations, African Studies Quarterly, Southern Spaces, and Bryn Mawr Classical Review. Yet some open access journals struggle with the lack of resources and, perhaps more significantly, the lack of contributors. According to Sigi Jottkandt and Gary Hall, leaders of the new Open Humanities Press, the most significant obstacle “is still the general perception by our colleagues that open access publication is not as academically rigorous as traditional print-based journals and books.” To tackle the perception that open access journals are somehow less scholarly, the Open Humanities Press emphasizes the prestige of its editorial board, which includes Stephen Greenblatt, N. Katherine Hayles, Jerome McGann, Peter Suber, and Gayatri Chakravorty Spivak. The Open Humanities Press aims to develop open access humanities journals in critical theory, construct a research gateway, and publish foundational books on critical theory that are in the public domain, taking as its main values access, scholarship, diversity and transparency.
Academic and commercial publishers are likewise experimenting with open access publishing models. For instance, the University of Michigan Press and the University of Michigan Library are collaborating on the digitalculturebooks imprint, which makes digital versions of works freely available. The MIT Press is publishing Information Technologies and International Development as an open access journal and is providing free online access to the MacArthur Foundation Series on Digital Media and Learning thanks to the support of the MacArthur Foundation. Hindawi Publishing Corporation, a commercial press focused on science and engineering, now publishes all of its journals as open access under a model where authors cover publication costs.

3) Availability of tools to support collaboration
To work together on complex research problems, share data and references, and jointly author documents, humanities scholars need tools that make the whole process easy. Web 2.0 is a notoriously squishy term, but for me it is fundamentally about enabling participation and collaboration. We could list dozens of different collaborative tools, such as blogs, wikis, collaborative bookmarking, social networking, collaborative authoring, social tagging, visualization, mashups, etc. In the digital humanities domain, a number of tools are under development that facilitate collaboration. For example, Stan Katz hails the recent partnership between the Center for History and New Media and the Internet Archive to enable humanities scholars to collaborate by uploading their research notes and collections to the Internet Archive using Zotero. SEASR is a software environment for data analysis that will “empower collaboration among scholars.”

4) Experiments with social peer review
While the traditional peer-review process includes only a few often anonymous reviewers, new approaches to peer review engage a larger community in evaluation and leverage collaborative bookmarking and social tagging applications to determine the impact of a work. For example, in preparing his book Expressive Processing: Digital Fictions, Computer Games, and Software Studies for publication, Noah Wardrip-Fruin is pursuing two methods of peer review: the traditional process, through MIT Press, and blog-based peer review. He’s posting the book in sections to Grand Text Auto and using CommentPress to engage in a conversation with readers. In reading over Wardrip-Fruin’s meta-reflections on blog-based peer review, I was struck by his observation that getting feedback from multiple reviewers helps him to figure out whether something just bothered one reader or is a deeper problem: “the blog-based review form not only brings in more voices (which may identify more potential issues), and not only provides some ‘review of the reviews’ (with reviewers weighing in on the issues raised by others), but is also, crucially, a conversation (my proposals for a quick fix to the discussion of one example helped unearth the breadth and seriousness of the larger issues with the section).” For Wardrip-Fruin, the “social process” produces comments that he trusts more, since they emerge from community dialogue. Some have criticized this approach, arguing that removing anonymity means that comments aren’t as honest and that opening up the review process dilutes its authority, but it seems to me that blog-based peer review resembles an online writing workshop: you hear from multiple readers and get a sense of how your argument is playing out.

5) Development of social networks to support open exchanges of knowledge
Social networking sites provide key organizational and communication tools for a community, whether it focuses on a particular field or spans the disciplines. As HASTAC’s name (the Humanities, Arts, Science and Technology Advanced Collaboratory) suggests, it fosters collaboration focused on innovative, interdisciplinary uses of technology by coordinating a network of research centers, sharing information, cultivating community, overseeing funding programs such as the MacArthur Digital Media and Learning Competition, and more. NINES (Networked Infrastructure for Nineteenth-century Electronic Scholarship) is developing a platform for collaboration (Collex), a network of nineteenth-century scholars, mechanisms for peer review of digital scholarship, and training programs for scholars working on digital projects.

6) Support for collaboration by funding agencies
Funding agencies are emphasizing collaboration in many of their programs. If you look at the tag cloud for the recently-announced winners of the MacArthur Digital Media and Learning competition, “collaboration” stands out as the most frequently used term, applied to projects that, for instance, “connect young African social entrepreneurs with young North American professionals,” enable young people to work together on Do It Yourself science projects, or engage high school students in Los Angeles and Cairo in an environmental studies game. Similarly, the NEH/IMLS Digital Partnership program focuses on “innovative, collaborative humanities projects,” encouraging libraries, museums, and scholars to work together to advance public knowledge.

7) More broadly, universities are emphasizing community as a key part of graduate education
The Carnegie Foundation’s The Formation of Scholars: Re-thinking Doctoral Education for the Twenty-First Century argues that graduate programs must create intellectual community to engage graduate students in the work of the department and discipline, retain them, and promote innovative thinking. Perhaps digital humanities projects exemplify the benefits of collaborative approaches to scholarship, since it’s difficult for a solo scholar to pull off the typical digital humanities project. I was motivated to complete my PhD in large part because of the communities that I participated in, particularly my dissertation group and the Electronic Text Center. It seemed that the happiest graduate students in my program were those working on digital humanities projects, which allowed us to collaborate with senior scholars and fellow graduate students, learn new skills, and do work that had immediate benefit for researchers and, often, the general public.

Other examples of social scholarship’s emergence include the growth of blogging and the use of collaborative bibliographic tools such as CiteULike (which includes 500 items that are tagged “humanities”). Despite these signs that social scholarship is beginning to gain traction in the humanities, significant obstacles remain, obstacles that I will discuss in my next post.

Social Scholarship in the Humanities

Scholarship seems to be getting more visibly social. According to Laura Cohen, social scholarship is “the practice of scholarship in which the use of social tools is an integral part of the research and publishing process.” Social scholars may blog, share bookmarks, data and other resources, participate in social networks, make their works-in-progress available for review, and deposit their publications in open access repositories. A recent Scientific American article points out some of the benefits of “open source” science. At social networking sites such as OpenWetWare, which recently received a substantial NSF grant to develop social software for scientists, biologists and bioengineers share research protocols and syllabi, blog the research process, post profiles of their research groups, and find collaborators. As a result, collective wisdom is documented and passed down, failures as well as successes are made visible, lab managers can more easily track ongoing research, and researchers can get quick feedback on their work from colleagues around the world. Open Source Science seems especially appropriate for researchers searching for cures to diseases common in developing nations but of little interest to big pharmaceutical companies, since such openness can facilitate more rapid discoveries and is not constrained by the quest for patents. With Harvard’s recent adoption of an open access policy and the NIH mandate that research publications it funds be deposited in PubMed Central, social scholarship appears to be gaining momentum. To what extent are the humanities part of this movement?

Typically humanists are cast as the loners of academia, brooding over books in solitude. True, rarely do you see humanities scholars jointly authoring works, although they often collaborate to edit essay collections and journals and organize conferences and workshops. Whereas joint authorship is expected in the sciences, many tenure committees haven’t yet figured out how to assign credit for collaborative work in the humanities. Yet you can glance at the acknowledgments in any humanities monograph and find ample evidence for the social context out of which scholarship emerges: the friends and colleagues who suggested references and read multiple drafts, the anonymous peer reviewers who provided feedback, the conference attendees and students who served as sounding boards, the assistants who offered research support, the librarians and archivists who tracked down sources, the funders who helped pay for research trips, the partners who put up with it all. Reversing the typical image of scientists as collaborators and humanists as loners, Sayeed Choudhury and Timothy Stinson point out in “The Virtual Observatory and the Roman de la Rose: Unexpected Relationships and the Collaborative Imperative” that in the “data-poor” environments of the early modern era scientists were reluctant to share information, whereas medieval manuscripts provide ample evidence of humanists working together to write, copy, annotate, illustrate, and disseminate texts. As Choudhury and Stinson suggest, “Perhaps it is not a set of inherent characteristics within specific disciplines that defines their mode of scholarship or communication, but rather the relative ease or difficulty with which practitioners of those disciplines can generate, acquire or process data.” Does scarcity produce secrecy, abundance openness? Information housed in archives remains a scarce resource for humanities scholars, but mass digitization efforts are making other forms of humanities data widely available.
Will humanities scholars work together to mine and make sense of this information? In my next posts, I’ll look at some trends indicating that humanities scholars are beginning to embrace social scholarship, as well as discuss some obstacles.