Matt Wilkens, post-doctoral fellow at Rice’s Humanities Research Center, recently launched Work Product, a blog that chronicles his research in digital humanities, contemporary fiction, and literary theory. Matt details how he is working through the challenges of using text mining to analyze the relationship between allegory and revolution, such as:
• Where and how to get large literary corpora. Matt looks at how much content is available through Project Gutenberg, Open Content Alliance, Google Books, and Hathi Trust, and how difficult each is to access.
• How to evaluate part-of-speech taggers, with information about their speed and accuracy.
I think that other researchers working on text mining projects will benefit from Matt’s careful documentation of his process.
By the way, Matt’s blog can be thought of as part of the “open notebook science” movement, which Jean-Claude Bradley defines as “a laboratory notebook… that is freely available and indexed on common search engines.” Other humanities and social science blogs that likewise document ongoing research projects include Wesley Raabe’s blog, Another Anthro Blog, and Erkan’s Field Diary. (Please alert me to others!)