COSC471-21S1 (C) Semester One 2021

Special Topic: Natural Language Processing

15 points

Details:
Start Date: Monday, 22 February 2021
End Date: Sunday, 27 June 2021
Withdrawal Dates
Last Day to withdraw from this course:
  • Without financial penalty (full fee refund): Sunday, 7 March 2021
  • Without academic penalty (including no fee refund): Friday, 14 May 2021

Description

This course introduces central problems and methods in natural language processing, with a special focus on the challenges presented by low-resource languages in the Pacific. On completing this course, students will be able to describe the central problems and methods in natural language processing, apply standard methods and models to existing text datasets, compare standard methods by their assumptions and applications, design an application of existing methods to an NZ-specific context, and evaluate the performance of that application against reasonable baselines.

In this course we will examine Natural Language Processing theory and applications, with an emphasis on how NLP algorithms are built, typically though not exclusively, using statistical machine learning.

The theoretical topics we will cover include:

  • Encoding natural language as features.
  • Estimating features using smoothing, normalization, sampling, and expectation-maximization.
  • Classifying text, training and cross-validation.
  • Distributed word representations such as skip-grams, word2vec and evaluating stability and similarity.
  • Language models: training and evaluation (perplexity), word prediction, and other applications.
  • Sequence models: problem of transitions, Viterbi algorithm, and parsing.
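
As a rough illustration of two of the topics above (smoothing and language-model evaluation by perplexity), the following is a minimal sketch and not course material. It assumes a bigram model with add-one (Laplace) smoothing; the toy sentences and all function names are hypothetical, and words outside the training vocabulary are not handled properly.

```python
import math
from collections import Counter

def train_bigram_counts(sentences):
    """Count bigrams and their left-hand contexts over tokenised sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        # Pad with sentence-boundary markers (an assumption of this sketch).
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(padded)
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab

def bigram_prob(prev, word, context_counts, bigram_counts, vocab_size):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)

def perplexity(sentences, context_counts, bigram_counts, vocab_size):
    """Exponential of the average negative log-probability per transition."""
    log_prob = 0.0
    n_transitions = 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            p = bigram_prob(prev, word, context_counts, bigram_counts, vocab_size)
            log_prob += math.log(p)
            n_transitions += 1
    return math.exp(-log_prob / n_transitions)

# Hypothetical toy data, for illustration only.
train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
held_out = [["sat", "the", "cat"]]

contexts, bigrams, vocab = train_bigram_counts(train)
print("training perplexity:", perplexity(train, contexts, bigrams, len(vocab)))
print("held-out perplexity:", perplexity(held_out, contexts, bigrams, len(vocab)))
```

Lower perplexity means the model finds the text less surprising, so the held-out sentence, whose word order never occurs in the training data, scores higher than the training sentences.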

Applications of these concepts that we will look at include:

  • Corpus similarity measures
  • Building dictionaries
  • Named-entity recognition
  • Part-of-speech tagging
  • Language identification
  • Topic classification
  • Finding lexical clusters
  • Phrase completion
  • Predicting sentence probabilities

Prerequisites

(1) COSC262; (2) Approval by the Head of Department of Computer Science and Software Engineering

Course Coordinator

Ben Adams

Lecturer

Jonathan Dunn

Indicative Fees

Domestic fee $1,033.00

* All fees are inclusive of NZ GST or any equivalent overseas tax, and do not include any programme level discount or additional course-related expenses.

For further information see Computer Science and Software Engineering.

All COSC471 Occurrences

  • COSC471-21S1 (C) Semester One 2021