DIGI405-20S2 (C) Semester Two 2020

Texts, Discourses and Data: the Humanities and Data Science

15 points

Start Date: Monday, 13 July 2020
End Date: Sunday, 8 November 2020
Withdrawal Dates
Last Day to withdraw from this course:
  • Without financial penalty (full fee refund): Friday, 24 July 2020
  • Without academic penalty (including no fee refund): Friday, 25 September 2020


This course examines computer-aided methods used in digital humanities and the social sciences for analysing discourses, an object of study that draws together the multiple ways in which language reflects and shapes social meanings. Within this context, it introduces concepts and methods for analysing natural language data and applies them through a series of practical lab classes. The first part of the course focuses on classic discourse analysis methods drawn from corpus linguistics, as well as the essential preprocessing steps used to prepare texts for a range of analytical purposes. The second part of the course covers topic modelling, a technique for unsupervised, exploratory data analysis that has been widely used in digital humanities, and, finally, considers supervised text classification methods for identifying discursive attributes such as sentiment, genre, or style.

Learning Outcomes

By the end of the course, students will be able to:
  • Apply exploratory data analysis methods relevant to the humanities to a corpus of texts
  • Preprocess text data in a range of ways and provide appropriate justification for these in their analyses
  • Explain the social and cultural contexts of data analysis and interpretation practices and their implications
  • Identify ethical and political dimensions to discourse analysis and text mining methods


Subject to approval of the Programme Coordinator.

Timetable Note

Please note: Attendance at 'Drop in Class A' is optional.

Course Coordinator / Lecturer

Geoffrey Ford


Assessment

  • Course participation (10%): Each week there will be online activities to complete to support your learning. Weekly participation grades will be given based on completing these tasks. Your 10 best participation grades will count towards 10% of the overall course grade.
  • Lab exercises (20%): Four lab exercises (5% each), to be completed during lab time, that build on course readings, lectures and activities completed in previous labs.
  • Corpus building project (15%): Submit a 1500-word report on building a corpus.
  • Corpus analysis assignment (25%): Write a 2000-word analysis on a set topic, incorporating corpus-assisted discourse analysis methods.
  • Final examination (30%): A test of your understanding of topics covered throughout the course.

Indicative Fees

Domestic fee $942.00

* All fees are inclusive of NZ GST or any equivalent overseas tax, and do not include any programme level discount or additional course-related expenses.

For further information see Humanities and Creative Arts.
