Studying How Digital Humanists Use GitHub

Over the past academic year, I’ve been fortunate to participate in Rice’s Mellon-sponsored Sawyer Seminar on Platforms of Knowledge, where we’ve examined platforms for authoring, annotation, mapping, and social networking. We’ve discussed both the possibilities that platforms may open up for inquiry, public engagement and scholarly communications and the risks that they may pose for privacy and nuanced humanistic analysis. Inspired by the questions raised by the Seminar, my colleague Sean Smith and I are studying a platform used by a number of digital humanists: GitHub. Digital humanists employ GitHub not only for code, but also for writing projects, syllabi, websites, and other scholarly resources. We’ll present our initial findings at Digital Humanities 2016, but I wanted to offer some background to the study, especially since some of you will soon be receiving emails from me inviting you to participate in it.

Initially I was interested in using GitHub for a case study of how we assess and select digital platforms. Even as many researchers (myself included) rely on digital platforms, I haven’t been able to find many clear rubrics for evaluating them. Building on Quinn Dombrowski’s recommendations for choosing a platform for a web project, we are looking at criteria such as functionality and ease of use. In previous work examining archival management systems, I learned how important it is to talk with users about their experience with tools, so we will be conducting a survey and interviews about GitHub. Sean and I also realized that GitHub itself provides valuable data about how people use it, such as information about collaboration, code re-use, and connections to others. Our study will thus include analysis of publicly available data about selected GitHub users and repositories. (Of course, there is significant prior work on this topic in fields such as social computing that we will draw upon.)

With this project, we are:

  1. Identifying digital humanists who have GitHub accounts. For the purposes of this study, we are looking at presenters at the last three Digital Humanities conferences and people affiliated with organizations that belong to centerNet (assuming that the information is publicly available). Of course, this method is imperfect: it misses digital humanists who didn’t attend the DH conferences or who aren’t affiliated with DH centers, and it may include some people who don’t really consider themselves digital humanists. But it’s a start.
  2. Contacting those whose email addresses are easily retrievable (e.g. available via GitHub) and:
    1. Giving them the opportunity to opt out of having their publicly available GitHub data included in our analysis and in the dataset that we plan to share at the end of the study. (Added 5/18/16: To be extra careful, we plan to anonymize this dataset.)
    2. Inviting them to take a brief survey about their usage and opinions of GitHub
    3. Inviting them to participate in an interview

    We may also contact people whose emails aren’t in the GitHub data but are otherwise available.

  3. Analyzing GitHub data from our dataset to gain insight into how digital humanists use GitHub.
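
As a rough illustration of how the opt-out and anonymization steps above could be applied to a dataset before analysis or sharing, here is a minimal Python sketch. It is not the study's actual pipeline: the field names (`login`, `email`, `public_repos`), the function names, and the choice of a keyed hash for pseudonyms are all assumptions made for illustration.

```python
import hashlib
import hmac


def pseudonymize(login: str, secret: bytes) -> str:
    """Return a stable pseudonym for a GitHub login using a keyed hash.

    The secret key is hypothetical; it would need to be kept private,
    since anyone holding it could recompute pseudonyms from known logins.
    """
    digest = hmac.new(secret, login.lower().encode("utf-8"), hashlib.sha256)
    return "user_" + digest.hexdigest()[:12]


def prepare_dataset(records, opted_out, secret):
    """Drop opted-out users, then strip identifying fields from the rest.

    `records` is a list of dicts with a "login" key (assumed schema);
    `opted_out` is an iterable of logins to exclude entirely.
    """
    excluded = {name.lower() for name in opted_out}
    cleaned = []
    for record in records:
        if record["login"].lower() in excluded:
            continue  # honor the opt-out: no data is retained for this user
        # Copy everything except the directly identifying fields.
        row = {k: v for k, v in record.items() if k not in ("login", "email")}
        row["user_id"] = pseudonymize(record["login"], secret)
        cleaned.append(row)
    return cleaned
```

Note that removing names alone may not be enough to anonymize public GitHub data, because repository names or unusual activity patterns can still point back to a person; a sketch like this only covers the most direct identifiers.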

We want to conduct this study openly while at the same time respecting privacy. In conducting interviews for past studies, I’ve been frustrated that I can’t publicly identify and credit people who have made brilliant comments because of the promise of confidentiality. So we’re giving interviewees the option to make all or some of their interview notes public, but of course they can instead keep the notes private and remain anonymous. Survey data will be anonymized but ultimately shared.

Here are important documents related to our study:

I welcome feedback and questions about this study. I hope that it will contribute to developing criteria for evaluating platforms like GitHub and offer insights into how digital humanities researchers and developers work.

One response to “Studying How Digital Humanists Use GitHub”

  1. A really useful area for study, I’ll fill out a response.
    Some “open” thoughts on that.

    I probably registered my first GitHub account when auditing “Digital History Methods” in Spring 2014, an undergraduate class taught by Rice [digital] historian Caleb McDaniel. We explored the programminghistorian.org/ website and a number of open-source tools, and registering for GitHub was a requirement of the class.

    Since then I have followed a number of GitHub-based projects (including LoC’s Viewshare and Knightlab’s TimelineJS and StoryMapJS projects), but haven’t contributed myself to a GitHub project except in terms of [power-]user response. I am more a visualization technologist/librarian than a coder.

    In my view, just having a GitHub account is not equivalent to being open-source. A true open-source project formats its contents with an eye to it being understood by an outside audience.

    I remember the old story (true or not) that one of the reasons McDonald’s franchises succeeded was that people could see the employees preparing the food in the back (something still true today). I think we need a higher standard of openness for DH projects to be considered open-source.

    Being able to peek in on people doing their work is nice, but doesn’t constitute real access – to their project planning documents, internal conversations, and future intentions for the project. True, meaningful access requires more explanatory work on the part of the GitHub project creators – and ideally should include a tutorial for more-developed projects.

    As with most things involving “Digital Humanities,” what constitutes openness (and inclusivity) is being hashed out in real-time.
    For the past couple of years, Dr. McDaniel has been attempting to practice “Open Notebook History” (described here: http://wcm1.web.rice.edu/open-notebook-history.html). I’m curious as to whether Dr. McDaniel has revised this page since it was published 3 years ago (half-joking – if yes, is there a record of how the post has changed?).
    This question may seem cheeky (and unnecessary in this case), but it is totally pertinent to the goals of the project, “openness” regarding process.

    Regards, Andrew Taylor, gistro.wordpress.com
