Becoming a “Digital Scholar”: Digital Discovery 2008

[Below is the text of a presentation that I will be giving at the Digital Discovery conference on March 27, 2008]

When I started a graduate program in English way back in 1992, I used computers mainly to write papers. Then came the web. Within a few years, I was creating web-based assignments for the undergraduate courses I was teaching, marking up electronic texts for the University of Virginia’s Electronic Text Center, creating my own electronic edition of a nineteenth-century sentimental sketch, and copy-editing articles for one of the first humanities journals to be published exclusively online, Postmodern Culture. Even though I was actively engaged in what has now come to be called “digital humanities,” I still did much of my dissertation research, which examines bachelorhood in nineteenth-century American literature and culture, the old-fashioned way. I wandered the stacks of Alderman Library smelling the decaying books, skimmed print catalogs such as Lyle Wright’s bibliography of American fiction, flipped through 150-year-old volumes of magazines such as Harper’s and The Atlantic, and even took notes with a fountain pen.

I finished my dissertation in 2002. After spending so much time laboring over it, I wanted to move on. But 5 years later, I decided I was ready to take up with the diss again, on new terms. I’m fascinated by the question of how the abundance of digital information and the development of new technologies will affect humanities scholarship, and it seemed to me that the best way for me to understand these transformations would be to undertake a major research project myself. By revisiting my dissertation, I could build on my existing knowledge and compare my research process 5+ years ago to what’s possible today. Thus I decided to remix my dissertation as a work of digital scholarship.

So what’s digital scholarship? According to the ACLS report on cyberinfrastructure for the humanities and social sciences, digital scholarship includes building digital collections, creating tools for collecting, analyzing, and authoring digital information, and “using digital collections and analytical tools to generate new intellectual products.” My project reflects that last idea, as I am exploring the implications of using digital collections and tools. My work is still very much in process, but here are three preliminary observations:

  1. A vast amount of information is now available online. When I first started working on my dissertation, I wished that there were some way for me to search not only bibliographic information, but also the content of works themselves. Well, that wish is beginning to come true. We don’t know exactly how many books Google has digitized, but the number is well over a million, and of course many more works have also been digitized by the Open Content Alliance, Microsoft’s Live Book Search, and countless libraries and archives. I was curious about how many of the nearly 300 works I cite in my dissertation bibliography are now available online, so I searched for them in Google Books, Making of America, online journal collections, and other sites. I found that 77% of my primary sources and 22% of my secondary sources are available online as full text, while 92% of all my research materials have been digitized (this number includes works available through Google Books as limited preview, snippet view, and no preview). 13% of the resources, mostly journal articles, require a subscription.

    Now I should note that I study 19th-century American literature, which is safely out of copyright and ubiquitous at most research libraries. Still, many significant materials haven’t been digitized, particularly periodical literature and archival materials. Other works are only available through subscription. Even if a resource has been digitized, it often has errors: metadata can be unclear or incomplete, scans can show the fingers of the scanning operators or be cut off, and the quality of the OCR can be poor. Nevertheless, the availability of so much digital information means that how we do research will change.

  2. As we deal with the abundance of information, we need tools to find, organize, manage, analyze, and share our research materials. Fortunately, those tools are beginning to be developed.

    For example, when I started researching my dissertation, I wasted a lot of time attempting to organize my notes and looking up bibliographic information that I hadn’t captured accurately. Now I manage my research much more effectively by using Zotero, a free Firefox-based research tool developed by George Mason University’s Center for History and New Media. Zotero automatically captures bibliographic information from hundreds of supported web sites and lets you insert properly formatted notes and bibliographies as you write a paper in Word. Moreover, you can take notes in Zotero, tag your resources, organize them into collections and sub-collections, and search across them. The next version of Zotero will support sharing bibliographic resources on the network.

    To detect patterns in the texts that I collect, I am using text analysis tools such as those developed by TAPOR. With TAPOR, you can create a concordance, compare texts, and look for co-occurring words. I’ve already used TAPOR to generate a list of the most frequently occurring terms in the first chapter of my dissertation, then compared that list to one I created manually. While my own list mainly focused on different descriptors of the bachelor, the TAPOR list reflected key components of my argument, including words associated with domesticity, nationhood, and identity. Text analysis tools can make implicit knowledge explicit and open your eyes to patterns you hadn’t previously been aware of.

    Beyond text, visualization and mashup tools allow you to make sense of data such as demographic information, troop movements, and even patterns in the correspondence of historical figures such as Thomas Jefferson. Ed Ayers envisions historians using dynamic “social weather maps” that allow them to watch historical forces in process. For my own project, I plan to explore the geographic and temporal nature of bachelor literature by developing several interactive maps that show where bachelor narratives were set and where their authors were born, as well as timelines that plot the publication history of bachelor literature.

    Having texts in open formats (such as XML) makes it much easier to analyze and manipulate them; otherwise you have to go through a cumbersome conversion process. I’m OCRing the PDFs I downloaded from Google Books so I can then use text analysis tools on them, but I understand that the resulting text will be somewhat unreliable.

  3. Although the journal and monograph still dominate the humanities, new means of scholarly communication are emerging, enabling the faster dissemination of ideas, more community dialog, and the use of multimedia.
    • I recently started blogging about my research project. Through blogging, I’ve become much more engaged in my research community and have been energized by the generous responses from friends and leaders in digital humanities. Readers have helped me to think through ideas by offering alternative perspectives and alerting me to resources I hadn’t been aware of. Blogging motivates me to follow developments in my field more closely and to synthesize what I’m observing. My blog is also a great memory aid: in preparing this talk, I’ve remembered, “Oh yeah, I blogged about that” and have been able to pull up the relevant entry quickly. I’m reaching many more people than I would through traditional means of publication. For instance, during January alone, my blog was viewed over 2,725 times, far more times, I suspect, than any of my articles have been read.
    • Our culture is increasingly a visual one, dominated by TV, movies, and YouTube videos, but we don’t yet have many examples of video-based scholarship. However, some interesting models are emerging. SciVee, which is sponsored by the Public Library of Science, NSF, and the San Diego Supercomputer Center, makes it easy for scientists to upload videos that accompany published articles, making their work more accessible and visible. Anthropologist Michael Wesch’s video “The Machine is Us/ing Us” demonstrates the potential of video as a means for disseminating ideas: it has been viewed almost 5 million times and explores Web 2.0 in, well, a Web 2.0 way, illustrating the dynamic, interactive nature of the Web. Inspired by these examples, as well as by digital storytelling, I’m planning to create short videos that allow me to express ideas difficult to explore in print. For example, I plan to survey America’s changing perception of the bachelor by showing images of bachelors from the 19th to 20th centuries (accompanied by bachelor songs). I am also working on a short video about the history of the bestseller Reveries of a Bachelor, which went through many editions and changed its physical format as the publisher dreamed up ways to keep it in demand.
    • Then there are the scholars who are simply—and significantly—making their works available online through open access repositories. In so doing, these researchers are making it possible for independent scholars and those at institutions without big library budgets to access their work, advancing the democracy of knowledge. Moreover, they are likely increasing their own visibility as scholars. As Michael Jensen argues, scholars will increasingly be evaluated based on the impact of their work, which will be measured by factors such as number and quality of citations, blog comments, and links to the document.
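
To make the text analysis described in the second observation more concrete, here is a minimal Python sketch of two of the operations mentioned there: a frequency list of terms and a keyword-in-context concordance. It is not TAPOR’s implementation. The function names (`tokenize`, `top_terms`, `concordance`), the tiny stop-word list, and the sample sentence are all assumptions made up for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stop list (an assumption); real analyses use fuller lists.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "no", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(text, n=5):
    """Return the n most frequent terms (stop words excluded) with counts."""
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    return Counter(words).most_common(n)


def concordance(text, keyword, width=2):
    """List each occurrence of keyword with `width` words of context per side."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            # Uppercase the keyword so it stands out in its context line.
            lines.append(" ".join(left + [tok.upper()] + right))
    return lines


# Made-up sample text, not a quotation from any source.
sample = ("The bachelor dreams of home. The bachelor has no home, "
          "and the nation pities the bachelor.")

print(top_terms(sample, 2))
print(concordance(sample, "home"))
```

On the sample sentence, the frequency list puts “bachelor” (3) ahead of “home” (2), and the concordance shows each “home” with two words of context on either side. Run on OCRed text, the same counts would inherit whatever recognition errors the OCR introduced.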

    Lest I seem naïve, I should acknowledge that significant challenges face digital scholarship. Many studies, including the ACLS report on cyberinfrastructure and the UC Berkeley Center for Studies in Higher Education reports on faculty attitudes toward digital resources, detail these challenges, but let me just mention a few. I’ve found that many humanities scholars I’ve talked with are not yet aware of digital scholarship. Already feeling stretched by obligations to do research, teach, and perform service, few academics have time to learn new technologies. As the digital environment constantly shifts and tools come and go, it’s overwhelming trying to keep up. In any case, the system doesn’t really reward faculty for experimenting with new technologies. According to a recent MLA report, over half of the tenure committees in the humanities have no experience evaluating “scholarly monographs in electronic format.” Many researchers feel that they will be penalized if they don’t publish in the most prestigious, well-established journals. Then there’s copyright: for my remixed work on bachelorhood, I’d love to provide links to the full text of every work that I cite. I’d also like to remix those original sources to produce new works. But what I can do is constrained by copyright.

    I believe that many of these challenges will be overcome. For example, scholarly societies like the MLA are recognizing the validity of digital scholarship. Organizations like NINES, which focuses on nineteenth-century studies, are providing tools, training, content portals, and support for scholars, as well as conferring legitimacy on digital scholarship. The NEH just turned its Digital Humanities Initiative into a full-fledged Office. The Creative Commons is pressing for greater clarity on copyright.

    What effect will the computer revolution have on humanities scholarship? It’s really too early to say; in a small way, that is what I’m trying to figure out in my project. In the sciences, we’ve seen the rise of new sub-disciplines and methodologies made possible through computation and data archives. In the humanities, I believe that being able to access the full text of millions of books will bring about significant changes in how research is conducted. In a great blog post from earlier this month, Tom Scheinfeldt from the Center for History and New Media suggests that historical scholarship will shift from a focus on ideology to a focus on methodology. As my colleague Jane Segal and I found in our study of the impact of digital archives on humanities scholarship, the Walt Whitman Archive is opening up new areas of inquiry in Whitman studies. For instance, Whitman scholars are shifting to manuscript study and paying increased attention to versions of Leaves of Grass besides the first and deathbed editions. As I’ve discovered through my own attempt to go digital, many challenges lie ahead, but I’m motivated by the opportunity to be creative, learn new skills, and have an impact on scholarship.

9 responses to “Becoming a “Digital Scholar”: Digital Discovery 2008”

  1. Not to let copyright be too much of a bogeyman: If you want to remix materials from the 19th century, no worries there, as that material is firmly in the copyright-free zone of the public domain.

  3. Re. Copyright: Yes indeed, public domain materials are free & clear for remixing–I should have been clearer about that. But I’m also interested in including some 20th C materials in my remixes, and I confess to being somewhat confused about how copyright law governs public domain materials digitized by others. For instance, if I wanted to include 19th C cartoons from Harper’s online archive, could I just use the page images downloaded from their site (assuming the quality was sufficient), or would I have to re-scan the pages myself?

  4. On the copyright issue – I don’t know about scanning, but if someone has taken a photograph of a source, the photograph is copyrighted even if the source is public domain. A useful parallel here is, I believe, the different editions of a text – the text itself might be public domain, but a particular edition may be copyrighted.

  8. Great piece! I’m putting together a site (launching in a month or so) for psychologists to find one another and work together (all ages, nationalities, expertise), utilising many of the open digital scholarship tools mentioned, as well as some of our own. For me, I think one of the key opportunities of going digital (for scholars) is to use their experiences as a teaching tool for those just getting started. We have students here who would love to ‘see’ a piece of research being done by one of their academic heroes from start to finish. In this sense, they see what is being done, what gets thrown out and what is kept in, learning the going-digital experience by seeing it as it happens. Which is why I found this post so great, and hope what we are putting together helps both those doing the research today and those who will be getting their hands dirty tomorrow. Stuart
