Slides and Exercises from “Doing Things with Text” Workshop

Last week I was delighted to be back at my old stomping grounds at Rice University’s Digital Media Commons to lead a workshop on “Doing Things with Text.” The workshop was part of Rice’s Digital Humanities Bootcamp Series, led by my former colleagues Geneva Henry and Melissa Bailar. I hoped to expose participants to a range of approaches and tools, provide opportunities for hands-on exploration and play, and foster discussion about the advantages and limitations of text analysis, topic modeling, text encoding, and metadata. Although we ran out of time before getting through my ambitious agenda, I hope my slides and exercises provide useful starting points for exploring text analysis and text encoding.

Opening the Humanities Part 2: Contexts

In 1813, Thomas Jefferson declared in a letter to Isaac McPherson:

“He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature….”

“Sharing,” by Josh Harper

Unlike, say, a diamond bracelet, an idea can be freely given to others without diminishing its value for the person who “owns” it–indeed, its value only increases as it spreads. While Jefferson believed that the creators of inventions could not claim permanent, natural rights over them, he acknowledged that society could grant the right to profit from them in order to foster innovation (a right that, as Chris Kelty notes, Jefferson termed “the embarrassment of an exclusive patent,” suggesting his discomfort). He cautioned that intellectual property rights may actually endanger innovation by granting monopolies, and he held that they should exist only long enough to spawn innovation, should be governed by rules limiting their application, and should be differentiated according to what benefit they convey to the public (Boyle, The Public Domain).

Jefferson’s letter raises fundamental questions: what social functions do intellectual property rights play? How can we best encourage the sharing of ideas and the progress of knowledge? In this post, the second in my series on the open humanities, I will explore legal and cultural contexts, focusing on the US.

The view that intellectual property rights are granted to encourage innovation is reflected in Article 1, Section 8 of the US Constitution: “To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” Note that the Constitution both describes the purpose of copyright–“To promote the Progress of Science and useful Arts”–and places limits upon it. Copyright aims to provide an incentive (a limited monopoly) for creators to share their work so that others may make use of it and build upon it. This incentive is balanced by limits, so that after a period of time the work falls into the public domain. The 1790 Copyright Act set the copyright term at 14 years, with the right to renew for another 14 years. Now, after the passage of the Sonny Bono Copyright Term Extension Act, the copyright term has exploded to 70 years after the death of the author. The original intention to encourage the progress of public knowledge seems to have fallen aside in the interest of protecting commercial interests such as Disney’s monopoly over Mickey Mouse.

Expansion of U.S. copyright law (assuming authors create their works 35 years prior to their death) (Wikipedia)

With most academic work, the ability to secure a monopoly over one’s ideas is not the primary incentive for sharing. Rather, most academics publish scholarly works in order to make a visible contribution to the scholarly conversation, build their scholarly reputation, and ultimately secure tenure or promotion. Typically researchers do not receive monetary compensation for publishing journal articles; the reward comes in disseminating their research. As Peter Suber suggests, one factor that makes open access more complicated in the humanities is that authors of monographs often expect to receive royalties. However, as Paul Courant points out, the monetary rewards tend to be small; the author of a moderately successful manuscript selling 1000 copies might expect to make less than $4000, and “for many monographs, lifetime royalties are zero or close to it.” As Courant suggests, “The big financial payoff to the author of the great majority of scholarly books is not the royalties but the visibility (and hence the salary and working conditions) of the author in the academic labor market.” If authors aim to contribute to the scholarly conversation and heighten their visibility, it makes sense for them to remove barriers to their work (although they also have an incentive to publish with the top journals or publishers).

Open access facilitates the sharing of scholarly knowledge. Peter Suber, a philosopher and respected advocate for open access, offers a simple definition: “Open-access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions.” Because such literature is digital and available online, distributing it costs almost nothing, and it can be accessed by anyone with an Internet connection. The lack of most restrictions means that the literature could be accessed and mined, which could open up new insights. But creators can put into place some restrictions over open works. For example, they can adopt a Creative Commons license and specify whether the work can be modified and/or used commercially, as well as whether the work must be attributed (CC-BY) and/or whether new versions of the work must be licensed under the same terms (share and share alike). CC-BY upholds the scholarly practice of acknowledging sources (see Bethany Nowviskie’s “why, oh why, CC-BY?” for a smart discussion of the rationale for adopting this license). There are two principal means of disseminating open access scholarly work: green, through depositing works in disciplinary repositories (like arXiv) or institutional repositories (like DSpace@MIT), and gold, through publishing open journals and monographs. Note that many publishers allow scholars to self archive work in repositories; visit SHERPA RoMEO to access publisher policies.

Unfortunately, the humanities seem to be behind the sciences in practicing openness. As Wikipedia explains, the open science movement aims to enlarge access to research, data, and publications, speed up scholarly communication, facilitate collaboration, and improve the sharing and building of knowledge, whether through open lab notebooks, open data, or open access to scholarly literature. There isn’t even a Wikipedia page for open humanities (let’s get to work!). The Directory of Open Access and Hybrid Journals lists nearly 3000 journals in the sciences as opposed to a little over 1300 in the arts & humanities. Much of the rhetoric around openness focuses on science; as a rough measure, there are approximately 973,000 Google results for “open science” versus around 38,000 for “open humanities”.

In a 2004 essay, Peter Suber pointed to a number of reasons why the humanities have been more reluctant to embrace openness than the sciences, including the greater availability of public funding for scientific research (and publishing fees), a deeper sense of a cost crisis with science journals, the significance of pre-print repositories in the sciences, the importance of monographs in the humanities, and the greater public pressure for open access to science. Updating Suber’s analysis eight years later, Gary Daught suggests that the time may be ripe for efforts to promote openness in the humanities. He notes that the price inflation of humanities journals has become a greater concern and that open source tools such as Open Journal Systems have brought down publishing costs. Perhaps most importantly, as scholars become more accustomed to the speed, convenience and openness of online communication, they may increasingly expect research to be easily accessible.

Indeed, I’ve identified a number of open humanities projects, mainly in the digital humanities. Openness in the humanities can take many forms, including:

While these different ways of categorizing openness are helpful, I agree with Clint Lalonde (riffing on Gardner Campbell) that “open is an attitude”– not only being willing to share resources, but also to work in such a way that others can observe, learn and offer to help. In my next post, I’ll provide a number of examples of open humanities projects and initiatives.

Of course, open humanities projects aren’t necessarily focused on digital humanities; note, for instance, publishing initiatives such as Open Humanities Press. With digital humanities, we often see the intersection of humanistic values and what I’ll call Web values. Driven by a desire to make it easier for scientists to share their data and collaborate, Tim Berners-Lee created the foundations of the Web. Rather than being a proprietary system, the Web is built upon open protocols, standards and design principles. The success of the Web comes from the way that it connects people to each other, information, and experiences, enabling them to share ideas, converse with each other, and explore and interact with information. Hence Berners-Lee’s message (appropriately delivered via Twitter) at the 2012 Summer Olympics: “this is for everyone.” What would it take to say the same about humanities scholarship and educational resources?

[Note: This post expands on a presentation I gave at WPI’s Digital Humanities Symposium in November.]

Call for Submissions to Anvil’s Built Upon Series

Following last week’s call for archives to participate in Anvil Academic’s Built Upon initiative, I’m now pleased to announce that we’ve released our call for authors to contribute to the series. If you are interested in producing a work of digital scholarship that makes creative, effective use of digital collections, please consider submitting a proposal.

Current archives partners include:

We hope to announce additional partners soon. You’re welcome to work with digital collections other than the ones listed here.  Initial “Built Upon” works will be clustered based upon the broad categories listed above.

Call for Digital Collections to Participate in Anvil’s New Built Upon Series

Although there are a number of excellent digital collections in the humanities, I’m troubled that many don’t get the scholarly recognition and usage that they deserve.  Moreover, it seems that there are too few examples of works of digital scholarship that make use of such collections in imaginative ways, such as by employing text mining, image analysis, or other algorithmic approaches,  crafting scholarly arguments that take advantage of the affordances of digital publishing, or inviting the audience to explore supporting evidence. I suspect that one reason for the paucity of such scholarship is the lack of appropriate publishing venues (although there are some terrific journals in this space, including Vectors, Southern Spaces, Kairos, Sensate and Archive).

That’s why Anvil Academic (the start-up digital publisher for which I serve as program manager) is launching the Built Upon series.

Building Blocks by Holger Zscheyge

Contributors to Built Upon will develop digital scholarly arguments or pedagogical projects that make innovative use of digital collections and tools. These contributions will be arranged into thematic clusters (such as “Civil War America”), and we expect that the contributions will be in conversation with each other and with their larger audience.

Soon we will release our call for authors, which will provide more details about our expectations for Built Upon contributions. At this stage, we are inviting digital collections (aka digital archives, digital libraries, etc.) in the humanities to participate in the Built Upon series. As Built Upon partners, digital collections would make their resources available for scholarly use (which many already do) and provide limited technical assistance to authors. We also invite partners to participate in the peer review process and to assist with outreach and promotion efforts. Already Anvil has lined up some first-rate partners, including Visualizing Emancipation, Valley of the Shadow, many of the NINES federated projects, and ORBIS. For more about this initiative, please see yesterday’s announcement on the Anvil web site.

Opening the Humanities Part 1: Overview

Today marks the fifth anniversary of my blog. Over the course of those five years, I’ve learned a simple, vital lesson: sharing is good. When I began my blog, I planned to document the process of remixing my dissertation (completed five years earlier, in 2002) as a work of digital scholarship. I got distracted by other topics, such as making the case for social scholarship, summarizing the year in digital humanities (a task that seems far too daunting today), examining collaboration in DH, and providing resources for getting started in DH. Since I didn’t really expect that the blog would find much of an audience, I was jazzed when people commented on my posts and talked with me about my blog at conferences. Blogging opened up new opportunities for me– invitations to speak or to contribute to essay collections– and made me feel like I was part of a lively community of scholars. Sharing made my work more visible and gave me a greater sense of purpose.

An interest in sharing also led me to team up with several other librarians to start the Digital Research Tools (DiRT) wiki. As I tried to keep up with all of the tools that help researchers find, manage, analyze and present information, I figured it would be better to take on the task collectively and produce a community resource.

Program Building @ THATCamp Vanderbilt by derekbruff

With DiRT, I was struck by the willingness of the community to share; as I recall, both Alan Liu and Dan Cohen invited me to grab resources from their own tool collections and include them in DiRT, and people volunteered their time to add new information to the wiki. But I also learned that it  requires continuous effort to maintain an active community of contributors; no matter how good our intentions, we only have so much time (and I myself had only limited time to commit to DiRT). Now DiRT has achieved what many start-ups aim for: it’s been acquired by a larger organization. Reborn as Bamboo DiRT, it is nurtured by a steering/ curatorial committee (led by Quinn Dombrowski, who did much of the work creating Bamboo DiRT) that shares its time and expertise to maintain a resource of value to the community.

In retrospect, I see that my attraction to digital humanities comes not so much from a love of technology or method as from a love of the community and its values. It’s difficult (and perhaps presumptuous) to define the values of such a diverse community, but I would point to openness, collaboration, collegiality and connectedness, diversity and experimentation (as I did in my chapter in Debates in the Digital Humanities). Underlying all of these is openness, broadly defined: openness to new ideas and new participants, openness as a commitment to sharing.

We see openness throughout the digital humanities. As the Manifesto for the Digital Humanities declares, digital humanists are “building a community of practice that is solidary, open, welcoming and freely accessible” as well as “multilingual and multidisciplinary.” This community calls for “open access to data and metadata,” open source software, the development of “collective expertise” and the sharing of best practices. I would point to THATCamp, with its openness to all, spirit of sharing and discovery, and emphasis on collaboration, as the embodiment of this community (appropriately enough, the Manifesto was produced collectively at THATCamp Paris). Openness defines how much of the DH community operates and animates its larger goal to promote the growth of knowledge. Indeed, Mark Sample proposes that “The digital humanities is not about building, it’s about sharing,” arguing that the “promise of the digital” comes in the circulation, sharing and discussion of knowledge. Instead of tolerating the slow dissemination of knowledge through antiquated print processes and allowing knowledge to be restricted to those with access to well-funded libraries, Sample suggests, we can develop open solutions that promote conversation, sharing, reuse, and the growth of knowledge.

Noting how frequently terms like “open” and “collaboration” are used in definitions of digital humanities, Eric Johnson suggests that the digital humanities have much in common with the public humanities. Like museum professionals and librarians, digital humanists embrace values such as collaboration, open access, and “[i]nvolvement of the public and/or public ‘communities of passion.’” (I love that term “communities of passion,” which captures the generosity, sense of common purpose and enthusiasm I see in DH).  Many digital humanities projects aim to share knowledge with the public and even engage the public in the construction of that knowledge. Eric advances a useful definition of the open humanities: “those aspects of the humanities aimed at democratizing production and consumption of humanities research.” (I would add teaching and learning).

With this post, I am beginning a series on the open humanities, elaborating on ideas I discussed in my November 2 talk at WPI’s Digital Humanities Symposium. I’ll look at the contexts around open humanities, explore the rationale for open humanities (drawing many examples from digital humanities), and examine challenges facing open humanities, particularly cultural and economic ones. Along the way, I’ll discuss the ongoing development of Anvil Academic, an open publisher for the digital humanities (I’m the program manager).  I hope this series shines a light on some of the great work being done in the DH community and stimulates further conversation about the open humanities.

Thanks to everyone who has commented on a post, spread the word about my blog, encouraged me, shared ideas with me, and helped make the DH community (as contentious as it sometimes can be) one of passion.

Presentations on the Future of Libraries and Open Humanities

Yesterday I started the day by discussing the future of academic libraries with a sharp, engaged group of faculty, librarians and staff at Worcester Polytechnic Institute and ended it by advocating for open humanities at WPI’s conference Digitize This: Exploring/Exploiting the Rise of Digital Arts & Digital Humanities. It was a kick that three people whom I consider to be leaders in open humanities–John Unsworth, Julia Flanders and Tom Scheinfeldt–were at my evening presentation.

Here are the slides from my presentations (PDF):

For those who want more, I’ve been obsessively bookmarking resources on open humanities and the future of libraries.

Thanks to Dr. Tracey Leger-Hornby, WPI’s Dean of Library Services and my classmate at the Frye Leadership Institute (go class of 2003!), for hosting my visit, and Prof. Joshua Rosenstock for making it possible for me to speak at the DH symposium.

Update, 11/16/12: After being notified by an attentive reader (thanks Dad!) that my slides contained some typos, I’ve uploaded corrected versions. I hope I caught ‘em all.

Anvil Academic Launches

Since June, I have been working as program manager of Anvil Academic, a new all-digital, open publisher for the digital humanities. Over the past four months, we’ve built a stellar editorial board, met with our advisory board and with colleagues at the University of Michigan’s MPublishing, done some preliminary work identifying potential publications, and developed a web site in partnership with Interactive Mechanics. I’m pleased to announce that our web site is now publicly available and that Anvil is officially launched. Like Anvil itself, we expect that the web site will undergo significant changes; for now it functions mainly to promote our vision and to offer basic information for potential authors. In the coming months, it will provide an access point for our publications.

So what sort of publications will Anvil produce? Anvil focuses on born-digital humanities scholarship that could not exist in print form, such as works that are built upon rich collections of data and offer tools for analyzing and exploring that data; multimodal compositions that incorporate audio, video, images, simulations, and/or other rich media; works of networked authorship, which engage the community in ongoing online conversation; and flexible, interactive educational content. But we are open to forms and genres beyond what we’ve identified here. In evaluating potential projects, Anvil will consider factors such as their quality, contributions to scholarship, technical robustness, level of innovation, and likely audience.  Please get in touch with Fred Moody, Anvil’s editor, if you would like to discuss a potential project.

Anvil has the potential to make a significant contribution to the humanities and to academic publishing. It addresses the need to bring publishing services such as peer review, distribution, and editing to the digital humanities. I hope that Anvil will help to increase the visibility and credibility of digital humanities scholarship and perhaps assist the field in continuing to develop argumentative and interpretive frameworks. Since I’ve had a long-standing interest in open access publishing models for the humanities, I welcome the opportunity to help build the structures to support open scholarship. I see open access as an ethical obligation, a means to increase the value and visibility of humanities scholarship, and an opportunity to foster scholarly conversations with diverse, engaged communities. As a start-up, we have latitude to experiment with different dimensions of publishing in a way consistent with the values of the digital humanities, including peer review (open, peer-to-peer, hybrid), business models, genres, etc.  We plan to share the results of such experiments.

If you’d like to learn more about Anvil, please participate in a Twitterchat with  me (@lisaspiro), Fred Moody (@moodyfred), and Korey Jackson (@koreybjackson, Program Coordinator and Analyst) on Friday, October 5 at 12 EDT; we will be using the #anvil hashtag. The Twitterchat will be facilitated by editorial board member Adeline Koh (@adelinekoh), who recently wrote two ProfHacker pieces about Anvil: an introduction to Anvil and an interview with Fred Moody.  Of course, I also welcome questions and comments.