Monthly Archives: December 2008

Studying the History of Reading Using Google Books (and Other Sources)

To what extent can digital collections such as Google Books help us to reconstruct the history of readers’ responses to literary works–in my case, readers’ responses to Reveries of a Bachelor (1850), which I’m using as a case study of doing research in the Library of Google?  (For an account of my post-marital fascination with bachelors, see my last post.) Readers’ enthusiasm for this sentimental work stirred up my own interest in it.  At Yale’s Beinecke Library, I examined a cache of fan letters in which readers rhapsodized over the bachelor’s reveries and connected them to their own experiences.  As one correspondent, a doctor, wrote,  “I have found it really a book of the heart—of my heart—an echo of my own reveries.”  At Yale I also examined Emily Dickinson’s copy of Reveries, where she (or perhaps someone to whom she loaned the volume) made marks next to significant passages. At the University of Virginia Library, I stumbled across an 1886 edition of Reveries heavily annotated by a young man named Patrick Henry.  In a passage where Mitchell described “a Bachelor of seven and twenty,” Patrick crossed out the seven and wrote in “four,” signaling his own intense identification with the bachelor narrator.  Drawing on these and other examples, I wrote a dissertation chapter on readers’ responses to Reveries (later to morph into a 2003 article in Book History) that challenged the notion that sentimental readers were passive.  But I was examining a fairly limited set of reader responses–about 25 letters from the 1850s to the late 19th century, plus a couple of annotated copies of Reveries.  I could offer an even richer analysis of readers’ reactions to Reveries by examining journal entries, memoirs, and letters, as well as even more annotated copies.  I’m especially interested in whether readers’ views of the book changed over time, given that the book was popular from 1850 into the twentieth century. Could I find such evidence in Google Books?

Patrick Henry's annotations to Reveries

What I Found

Here’s what I found by doing a keyword search in Google Books for “Reveries of a Bachelor”; I still need to process the hundreds of results I got searching for “Ik Marvel” and “Ike Marvel” (the pen name of the author of Reveries), as well as searching for those terms in the Open Content Alliance.

  • Recent secondary sources on reading that include short passages on Reveries:
    • Ronald and Mary Saracino Zboray’s 2006 account of a would-be suitor attempting to woo an already-engaged woman by giving her a copy of Reveries; she noted in her diary that she would rather read the book than spend time with him
    • Claire White Putala’s Reading and Writing Ourselves Into Being, which discusses how Joe Lord recommended Reveries to Eliza Wright Osborne immediately before she married another man
    • Alan Boye’s account in Tales from the Journey of the Dead of a soldier suffering from a broken heart who read Reveries in a Confederate camp
    • So, hmm, Reveries seems to have been read by heartbroken men, who seemed to use the book to express how they felt to the women they were pursuing.  All three of the above books are based on archival research, which leads me to suspect that I would find a number of references to Reveries in archival collections (if I had the time and money to visit them).
  • Memoirs that include brief mentions of Reveries:
    • Mountaineer Belmore Browne’s association of Reveries with melancholia in The Conquest of Mt. McKinley (first published 1913): “I know of nothing in this world that will produce a stronger attack of melancholia than reading The Reveries of a Bachelor on a fog-draped glacier!”
    • Philosopher Morris R. Cohen’s sense that Reveries stimulated feeling and brought relief: “Today I felt very relieved by reading Marvel’s Reveries of a Bachelor. It aroused new strains of feeling I don’t know whether I should be ashamed of wishing…”  [snippet view]
    • Richard St. Clair Steel’s description of the beauty of Reveries
    • My questions: Did women memoirists likewise praise Reveries? Why did the book have such emotional resonance?
  • Evidence that Reveries was embraced by educational, religious, and cultural authorities
    • the University of the State of New York Regents High School Exam, American Literature section included questions about Reveries in 1894, 1897, 1899, 1903, 1906, and 1908 (for whatever reason, I discovered this information not in my original search for “Reveries of a Bachelor,” but in a later search for “Reveries of a Bachelor” enrica, Enrica being the name of one of the women for whom the bachelor longs)
    • Reveries was excerpted in several literary anthologies, including Harper’s First [ -sixth] Reader (1889),  The Ridpath Library of Universal Literature (1898), and American Literature Through Illustrative Readings (1915)
    • Reveries was recommended  for the high school reading list (essays) by the National Council of Teachers of English (1913).  It also appeared in quotation books.
    • The author of the satiric “Reflections of a War Camp Librarian” (1918) notes that American citizens sent Reveries and other gift books to soldiers on the battlefield in WWI, not exactly the kind of reading material soldiers craved
    • A “Country Parson” noted in 1862 how Reveries brought about “revelations of personal feeling” among the unmarried
  • Reveries appeared in many printed library catalogs from the 1850s to the 1920s, including catalogs for the Boston Public, Detroit Public, New Zealand Parliament Library, Princeton University, Library Company of Philadelphia, and the British Museum Dept. of Printed Books
  • Reveries was not only read in private, but re-imagined as tableaux and read aloud at home and in public
  • Reviews of Reveries

Google Books as a Research Source

  • Except for the reviews (many of which I had already consulted) and the secondary sources on reading (which I probably would have consulted), searching Google Books enabled me to find many resources that I probably never would have discovered, including memoirs, high school curricula, and guides to performing (reciting/acting out) Reveries.  Although these sources (which I haven’t fully analyzed) haven’t radically changed my view of Reveries, they do give me a better sense of the cultural impact that the book had, as well as its personal significance to readers, who read it while climbing mountains, dealing with emotional turmoil, etc.
  • I had hoped to find annotations in scanned versions of Reveries collected in Google Books and Open Content Alliance.  However, in the copies I examined (and I should say that I glanced over them rather than scrutinized every page), I only found minor annotations–people would typically write their names in their books or inscribe a message to the recipient of the gift book, and a few readers made marks next to passages, but I found nothing like Patrick Henry’s ecstatic annotations.
  • For the texts that are only available as fragments around a search term, Google Books functions as a ramped-up research index, pointing me to materials that I often need to consult in print to put the search results in context, at least until the Google Book Search settlement goes through and the out-of-print materials are also available as full text.  (For some of the limited preview books, such as reference books, however, I’m able to pull out enough information from the pages that are available without having to see the whole book.)

Using Google Books to Research Publishing History

At the upcoming Modern Language Association conference, I will join Amanda French and Eleanor Shevlin on a panel called “The Library of Google: Researching Scanned Books,” which is sponsored by SHARP and will be moderated by Michael Hancher.  Google Books has already scanned over 7 million volumes (more than many research libraries hold) and, according to Planet Google, aims to scan every volume in the WorldCat catalog, around 32 million. Our panel will focus on the significance of Google Books for literary research, looking at questions such as whether scholars can trust it and how they should deal with such plenitude.  I plan to discuss my study examining how many of the works in my dissertation bibliography are now available electronically, as well as more recent work using Google Books and other online sources to explore the history of a nineteenth-century bestseller, Donald Grant Mitchell’s Reveries of a Bachelor (1850).  Reveries fascinates me—not so much because I identify with the bachelor narrator’s fantasies and fears of what it’s like to be married (actually, I find the book kind of cloying), but because I’m intrigued by Reveries’ cultural impact from the 1850s into the early twentieth century.  It sold at least a million copies and appeared in dozens of editions,  from a cheap edition selling for 8 cents to a $6 gift volume in an exquisite morocco binding.  Emily Dickinson loved it, as did readers who evinced their admiration by sending fan letters to Mitchell or making marks in the margins of their book.  In this blog post, I’ll focus on how I’ve employed Google Books to illuminate Reveries’ publishing history; future posts will look at reader responses, textual history, and authorship.

For a graduate seminar on textual editing way back in the 90s,  I developed an online critical edition of the book’s first reverie.  I also wrote an article analyzing a series of letters that Reveries’ publisher, Charles Scribner II, sent to Mitchell to negotiate the pricing and physical form of new editions between 1883 and 1907, as the publisher and author worked to sustain the popularity of the book and maintain their hold on the market after their copyright expired.  But my publishing history is incomplete; I want to know more about the different forms Reveries took, how it was advertised, what the prices were at different times, how well the book sold, what marketing strategies Scribner and other publishers pursued, and whether Reveries is a unique case or fairly typical, at least for a nineteenth-century bestseller.

By using Google Books, I’ve been able to fill in some details about the book’s publishing history, particularly about pricing and advertising.  As amazed as I am by the ability to search across millions of books for references to Reveries, I’m also somewhat frustrated by the strange ways that Google Book Search works (or doesn’t work) and disappointed that some materials don’t seem to be available.

Title page of 1850 Reveries of a Bachelor

What I already knew:

  • The authorized publisher of Reveries, Scribner’s, issued many editions, including:
  • Copyright on Reveries expired in 1892, which meant that other publishers could legally come out with their own editions of the book.  Charles Scribner II wrote to Donald Grant Mitchell to discuss how to respond to this challenge, particularly the threat from Altemus, which he characterized as a “piratical publisher.” Scribner proposed offering a cheap (30 cent) edition “to make it so unprofitable that the publisher [Altemus] will not be encouraged to take up the other books [by Mitchell],” along with a moderately-priced (75 cent) edition.  At the suggestion of Mitchell, Scribner also advertised that the company remained the only authorized publisher of Reveries.
  • Undeterred, many publishers issued unauthorized editions, including Henry Altemus Company, Optimus Printing Company, The Rodgers Company, Donohue, Henneberry, & Co, Porter, W. L. Allison Company, F. T. Neely, Thomas Y. Crowell Company Publishers, The Mershon Company Publishers, G. Munro’s Sons, H. M. Caldwell Company, The Henneberry Company, M. A. Donohue & Company, Homewood Company, A. L. Burt Company, The F. M. Lupton Company, H. M. Caldwell Co., Strawbridge & Clothier, The Edward Publishing Company, W. B. Conkey Company, Acme Printing Company, The Bobbs-Merrill Company Publishers, and R. F. Fenno & Company (BAL, 240-1; NUC, 664-667).   While I was researching Reveries at Yale, I came across several of these volumes, one of which had annotations such as “The illustrations are [most of them] execrable, & there is an occasional ‘mending’ of the text…”  In the preface to the 1907 Author’s Complete Edition of Reveries, Mitchell fixated on the problem of piracy, noting that he had amassed a collection of over 40 imprints of Reveries, only one of which brought him any money.  Apparently Mitchell’s collection–and annotations–ended up at Yale.

Method

To determine how many Reveries-related works were available in Google Books, I did a keyword search for “Reveries of a Bachelor.”  The total number of results fluctuated; one day it was 641, another 916, another 809.  But forget about getting to result #641.  One result screen says: “151 – 200 of 809,” but then the next one says “Books 201 – 220 of 220.”  Huh? So what happened to everything else?  Perhaps duplicates are eliminated as you make your way through the results (although there were plenty of duplicates in the results I looked at), perhaps the algorithm used to calculate the number of results is, er, inexact and shifting, or perhaps Google figures you don’t really want to look at that many results anyway.  Whatever the explanation, I can’t help wondering about what I’m not getting to see, so my trust in Google Books is diminished a bit, even as I feast on the plenty that is available.

In any case, I looked at each result available to me, discarding those that weren’t really focused on Reveries and grabbing the bibliographic info for the rest through Zotero.  (I love Zotero, but I was a little frustrated that it didn’t capture the URL and publisher info for  Google Books, which may have to do with the way that Google makes available that information.)  When I wasn’t impeded by texts that offered only snippet views or no preview at all, I copied out a chunk of text that contained the Reveries reference and dumped it into a note in Zotero.  Categorizing as I waded through the results, I added a tag or two for each work, such as “reveries_ad” or “reveries_review.”

Since Mitchell used the pen name “Ik Marvel,” I also searched for “Ik Marvel” (1285 results, today) and “Ike Marvel” (606 results); I’m still working through those results.   I used TAPOR to generate a list of word pairs in Reveries that I hoped to use in searching for works connected to Reveries, but there were only a few pairs that seemed at all unique, such as “Aunt Tabithy,” the name of a character in the book.
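For readers who want to try the word-pair step themselves, here is a rough sketch in Python. It is not how TAPOR works internally (I don't know its method); it simply counts adjacent word pairs in a plain-text file. The function name and the sample sentence are made up for illustration, and the sample is not a quotation from Reveries.

```python
import re
from collections import Counter


def word_pairs(text: str) -> Counter:
    """Count adjacent word pairs (bigrams) in a text, ignoring case.

    Punctuation is dropped before pairing, so pairs can span a
    sentence boundary. That is acceptable for a rough candidate list.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(words, words[1:]))


if __name__ == "__main__":
    # Invented sample text, used only to show the output shape.
    sample = "Aunt Tabithy smiled. Aunt Tabithy frowned."
    for pair, count in word_pairs(sample).most_common(3):
        print(" ".join(pair), count)
```

Pairs that occur often in one book but rarely elsewhere (a character's name, say) are the ones worth trying as search terms; judging "rarely elsewhere" would need a comparison corpus, which this sketch leaves out.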

Bobbs-Merrill Ad for Reveries

What I discovered about publishing history using Google Books

  • Pricing: By searching book catalogs, advertisements, and old issues of Publishers Weekly, I was able to track the price for different versions of Reveries between 1851 and 1906.  The pricing data reveals the many choices enjoyed by consumers who wanted to buy a copy of Reveries, particularly at the end of the nineteenth century, when competing publishers entered the market.  Say a consumer in the late nineteenth century wanted a cheap copy of Reveries.  How about paying 8 cents for the “Ideal Library” version, or 18 cents for the “Handy Volume” edition? How about a moderately priced edition?  The price of Scribner’s standard duodecimo edition remained fairly steady between 1854 and 1903: $1.25.  If people craved a fine edition, they would have many choices, such as the 1903 Dainty Small Gift Books, Agate Morocco Series with gilt edges for $2.25, the 1906 Bobbs-Merrill Ashe Illustrated Gift Edition for $2, the 1903 Limp Walrus Edition for $2, and the 1903 Limp Lizard Series for $1.50.  (If I start a band, I’m going to call it Limp Lizard.)  Big gaps in my knowledge remain–I wasn’t able to find pricing information for the 1850 first edition or the 1907 Edgewood Edition, or for many of the unauthorized editions.   However, without the ability to search across a vast collection of texts I doubt I would have been able to find much of the pricing information at all, particularly in the book advertisements that appeared in magazines and at the end of books, as publishers promoted other books in their catalog.  I probably should have known to look for information about Reveries in book catalogs and late nineteenth-century issues of Publishers Weekly, but Google Book Search sure made it easy for me to find relevant information.
  • Response to the copyright expiration: In one of Scribner’s letters to Mitchell, I found a copy of an ad Scribner’s planned to run advertising its cheap edition and asserting that some portions of Reveries (the new prefaces) remained in copyright.  In Publishers Weekly from 1893, I found what I think is that very ad.  I wondered if Scribner’s was unique in handling copyright expiration by releasing a cheap edition and asserting continued copyright over some sections. Apparently not. Right after a Scribner’s ad warning that “An action will be promptly brought against any one infringing upon the author rights,” I saw a similar ad from J. B. Lippincott Company for Susan Warner’s The Wide, Wide World, reminding “the trade” that the illustrations remained in copyright and promoting a new 75 cent cheap edition.
  • Marketing: By examining over 25 ads for Reveries available through Google Books, I’ve noticed some (fairly unsurprising) patterns:  Although the book was in Scribner’s catalog throughout the late 19th century, promotion of the book was ramped up when new editions were issued; the publisher often took out full page ads or put Reveries at the top of ads announcing several books.  By the 1890s, Scribner’s was describing Reveries as “an American classic” and predicting that the book would win over “fresh fields” of new readers.  Although I’ve found few ads from competing publishers, Bobbs-Merrill came out with an eye-catching ad for its illustrated gift edition in 1906.   So that I have a visual record of stuff I’ve looked at, I’ve set up a Google notebook with clippings of ads for and reviews of Reveries that I found in Google Books.  Creating the notebook was easy; if the book is in the public domain, you can clip out sections of text and post them to your Google Notebook or Blogger blog. (If only you could post to a WordPress blog, or Flickr…)
  • Versions of Reveries: I expected to find more editions of Reveries in Google Books.  When I did a title search for “Reveries of a Bachelor,” only 21 results were returned, and only 4 of those are available as full view, even though 20 were published before 1921 and are in the public domain. (Another is a large print reprint edition from 2008.)  By contrast, the Open Content Alliance provides full access to 18 versions of Reveries, including an 1889 edition marked “Book digitized by Google from the library of the New York Public Library and uploaded to the Internet Archive by user tpb.” (By the way, tpb has apparently uploaded a number of Google Books into the Open Content Alliance, prompting some folks to complain about the “pollution” of the OCA by “marginal” Google content.) So why are so many public domain texts in Google Books not fully available?  I’m not really sure, although Planet Google says that Google Books contains metadata (catalog) records for works that it did not digitize and thus are not in its collection.  In any case, if you’re interested in the physical form of books, the Open Content Alliance seems to be a better source than Google Books, since every page is scanned in full color (except, of course, what’s been uploaded from Google Books) and is presented in a book-like interface, with flippable pages.  You can download pdf, plain text, and DJVU versions, which promotes (re-)use and analysis of the books. I should note that the Open Content Alliance has its own quirks.   OCA content appears to be available through two online collections: the Internet Archive and Open Library.  It’s not immediately obvious how to do a full-text search in OCA. It seems that you can only search bibliographic metadata in the Internet Archive, but you can do full text search at the Open Library.  To do so, go to the advanced search (http://openlibrary.org/advanced) and enter your query into the search box at the bottom.
    Another quirk:  you can’t see front covers in OCA in the flip-view interface, but you can if you look at the DJVU files. But it’s even easier to put page images from OCA content into a Google Notebook; whereas in Google Books you have to crop out a section of a page and select where to send it, with OCA you just right click and send the entire page image to your notebook. (For instance, I created one for different editions of Reveries, documenting illustrations, title pages, etc.)

Limitations of Google Books

  • As noted above, not all public domain materials are available
  • Weirdness in retrieval of search results; 800 results suddenly become 220 when you work your way through the results
  • OCR errors: Among the different variations of “Ik Marvel” and “Reveries of a Bachelor; A Book of the Heart” that I found:
    • IK MABVEL
    • Heveries of a Bachelor (a search for this term yields 10 results in Google Books)
    • REVERIES OF A BACHELOR; or, a Rook of the Heart
    • REVERIES OF A BACHELOR; or, a Bonk of the Heart.
    • Reveries of a Bad elor.
    • REVERIES OF A BACHELOR, a Boob of the Heart. By IK. MAETEL
    You have to be resourceful, then, in how you construct a search, taking into account OCR problems.  That said, “Reveries of a Bachelor” returned hundreds of results.
  • Google Books does not contain archival materials. (Google has moved into digitizing newspapers and magazines, so who knows–maybe archives are coming? But it would be very tricky and expensive for Google to undertake such a project.)  Although searching Google Books is certainly more convenient than visiting an archive, I love being in archives, looking at stuff that few others have seen.  Even though I found a lot of useful resources in Google Books, I learned the most about the publishing history of Reveries by examining the letters from Charles Scribner II to Mitchell held by the Beinecke Library at Yale and by examining the volumes referenced in the letters.
  • If you’re interested in bibliography, as I am, looking at even a high quality scan can’t substitute for examining the physical volume, studying details such as the size of the book, the quality of the paper, the bindings, etc. But scans can give you an idea of what the volume looks like and help you to identify it.
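One way to cope with the OCR variants listed above, once you have pulled candidate strings out of snippets or downloaded text, is approximate string matching. The sketch below uses Python's standard difflib module to score how close a candidate is to the title or pen name you are looking for. The function names and the 0.8 threshold are my own assumptions, not anything Google Books offers, and a badly garbled form (such as "IK. MAETEL") may still fall below whatever threshold you pick.

```python
from difflib import SequenceMatcher


def similarity(candidate: str, target: str) -> float:
    """Return a 0-to-1 similarity score between two strings, ignoring case."""
    return SequenceMatcher(None, candidate.lower(), target.lower()).ratio()


def is_probable_match(candidate: str, target: str, threshold: float = 0.8) -> bool:
    """Treat a candidate as an OCR variant of target if it scores at or above threshold.

    The default threshold is an illustrative guess, not a tested cutoff.
    """
    return similarity(candidate, target) >= threshold


if __name__ == "__main__":
    title = "Reveries of a Bachelor"
    # Variants taken from the list of OCR errors above.
    for variant in ["Heveries of a Bachelor", "Reveries of a Bad elor."]:
        print(variant, round(similarity(variant, title), 3))
```

This only helps after the fact, with text you already have in hand; it does not change what a keyword search returns, so you still need to try misspelled queries to surface the garbled records in the first place.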

In my next post, I’ll look at how using Google Books is helping me reconstruct the history of readers’ responses to Reveries.

Work Product Blog

Matt Wilkens, post-doctoral fellow at Rice’s Humanities Research Center, recently launched Work Product, a blog that chronicles his research in digital humanities, contemporary fiction, and literary theory.  Matt details how he is working through the challenges he faces as he tries to analyze the relationship between allegory and revolution by using text mining, such as:
  • Where and how to get large literary corpora. Matt looks at how much content is available through Project Gutenberg, Open Content Alliance, Google Books, and Hathi Trust and how difficult it is to access
  • Evaluating Part of Speech taggers, with information about speed and accuracy

I think that other researchers working on text mining projects will benefit from Matt’s careful documentation of his process.

By the way, Matt’s blog can be thought of as part of the movement called “open notebook science,” which Jean-Claude Bradley defines as “a laboratory notebook… that is freely available and indexed on common search engines.”  Other humanities and social sciences blogs that are likewise ongoing explorations of particular research projects include Wesley Raabe’s blog, Another Anthro Blog, and Erkan’s Field Diary.  (Please alert me to others!)