Data Mining in the Humanities

Byrne Seminar for first-year undergraduate students. Syllabus.

Popular media often portray “big data” as the exclusive province of information scientists, but the data that humanists collect can swiftly exceed what a person can analyze unaided. Increasingly, humanists turn to digital tools to conduct quantitative research on literary texts, websites, tweets, images, and sound recordings. How does one create or reuse a humanities data set? What tools are used to store, manipulate, and process that data? How does one begin to analyze humanities research data and share findings in the form of visualizations? This course will explore some methodologies of quantitative analysis in the humanities, using free and open-source digital tools to yield insights that would otherwise be difficult to obtain. Through lectures, discussion, labs, and a digital final project, students will familiarize themselves with the tools of digital humanities scholarship and learn to form arguments on the basis of a few simple computational techniques.

This course introduces the particularities of humanities data and metadata through readings, discussions, and the examination of scholarly digital projects. Students experiment with several digital methods for collecting, processing, and presenting humanities data. Tools and platforms include Twitter, WordPress, TAGS, and Gephi, among others.