Monthly Archives: May 2016

Studying How Digital Humanists Use GitHub

Over the past academic year, I’ve been fortunate to participate in Rice’s Mellon-sponsored Sawyer Seminar on Platforms of Knowledge, where we’ve examined platforms for authoring, annotation, mapping, and social networking. We’ve discussed both the possibilities that platforms may open up for inquiry, public engagement and scholarly communications and the risks that they may pose for privacy and nuanced humanistic analysis. Inspired by the questions raised by the Seminar, my colleague Sean Smith and I are studying a platform used by a number of digital humanists: GitHub. Digital humanists employ GitHub not only for code, but also for writing projects, syllabi, websites, and other scholarly resources. We’ll present our initial findings at Digital Humanities 2016, but I wanted to offer some background to the study, especially since some of you will soon be receiving emails from me inviting you to participate in it.

Initially I was interested in using GitHub for a case study of how we assess and select digital platforms. Even as many researchers (myself included) rely on digital platforms, I haven’t been able to find many clear rubrics for evaluating them. Building on Quinn Dombrowski’s recommendations for choosing a platform for a web project, we are looking at criteria such as functionality and ease of use. In previous work examining archival management systems, I learned how important it is to talk with users about their experience with tools, so we will be conducting a survey and interviews about GitHub. Sean and I also realized that GitHub itself provides valuable data about how people use it, such as information about collaboration, code re-use, and connections to others. Our study will thus include analysis of publicly available data about selected GitHub users and repositories. (Of course, there is significant prior work on this topic in fields such as social computing that we will draw upon.)

With this project, we are:

  1. Identifying digital humanists who have GitHub accounts. For the purposes of this study, we are looking at presenters at the last three Digital Humanities conferences and people affiliated with organizations that belong to centerNet (assuming that the information is publicly available). Of course, this method is imperfect: it misses digital humanists who didn’t attend the DH conferences or who aren’t affiliated with DH centers, and it may include some people who don’t really consider themselves digital humanists. But it’s a start.
  2. Contacting those whose email addresses are easily retrievable (e.g. available via GitHub) and:
    1. Giving them the opportunity to opt out of having their publicly available GitHub data included in our analysis and in the dataset that we plan to share at the end of the study. (Added 5/18/16: To be extra careful, we plan to anonymize this dataset.)
    2. Inviting them to take a brief survey about their usage and opinions of GitHub
    3. Inviting them to participate in an interview

    We may also contact people whose emails aren’t in the GitHub data but are otherwise available.

  3. Analyzing GitHub data from our dataset to gain insight into how digital humanists use GitHub.
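
As a rough illustration of the third step, here is a minimal sketch of how public repository metadata might be summarized. It is not our actual analysis code. The field names (`fork`, `language`, `forks_count`) follow the repository objects returned by GitHub's REST API, but the function name `summarize_repos` and the sample records are hypothetical, made up purely for this example.

```python
from collections import Counter


def summarize_repos(repos):
    """Summarize repository records shaped like GitHub REST API repository objects.

    `repos` is a list of dicts; only the `fork`, `language`, and `forks_count`
    keys are read, and missing keys fall back to neutral defaults.
    """
    originals = [r for r in repos if not r.get("fork", False)]
    forked = [r for r in repos if r.get("fork", False)]
    # GitHub reports `language` as null for repositories with no detected
    # code (e.g. a syllabus written in plain text), so label those "unknown".
    languages = Counter(r.get("language") or "unknown" for r in repos)
    return {
        "total": len(repos),
        "original": len(originals),
        "forked": len(forked),
        "languages": dict(languages),
        # How often other people forked this user's own repositories:
        # one crude signal of re-use.
        "times_forked_by_others": sum(r.get("forks_count", 0) for r in originals),
    }


# Hypothetical sample records, not real study data.
sample = [
    {"name": "syllabus", "fork": False, "language": None, "forks_count": 3},
    {"name": "text-tools", "fork": False, "language": "Python", "forks_count": 1},
    {"name": "jekyll-theme", "fork": True, "language": "HTML", "forks_count": 0},
]

summary = summarize_repos(sample)
print(summary)
```

Counts like these only hint at collaboration and re-use, which is one reason we are pairing the data analysis with a survey and interviews.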

We want to conduct this study openly while at the same time respecting privacy. In conducting interviews for past studies, I’ve been frustrated that I can’t publicly identify and credit people who have made brilliant comments because of the promise of confidentiality. So we’re giving interviewees the option to make all or some of their interview notes public, but of course they can instead keep the notes private and remain anonymous. Survey data will be anonymized but ultimately shared.

Here are important documents related to our study:

I welcome feedback and questions about this study. I hope that it will contribute to developing criteria for evaluating platforms like GitHub and offer insights into how digital humanities researchers and developers work.