DATA423-21S2 (C) Semester Two 2021

Data Science in Industry

15 points

Details:
Start Date: Monday, 19 July 2021
End Date: Sunday, 14 November 2021
Withdrawal Dates
Last Day to withdraw from this course:
  • Without financial penalty (full fee refund): Sunday, 1 August 2021
  • Without academic penalty (including no fee refund): Friday, 1 October 2021

Description

In this course we will address core topics in the application of data science in industry.

This course is taught by a practising Data Scientist and addresses real-life issues that are not found in textbooks. The course will cover topics deemed central to a career in Data Science.

This course is heavily focused on the “applied” side of data science rather than the theoretical. We will use R as the language of choice. Much of the material involving R and Shiny will require a degree of self-directed learning, especially in the early part of the course.

Learning Outcomes

There is an emphasis on three main themes:

  • Best statistical practice
    We will progressively look at each stage of analysing data and producing a model of it.
    Best practice is mainly about doing the right things in the right order. In particular, we look at the vexing issue of “data leakage.”

  • Communication through visualisation
    We will employ “Shiny” to visualise our data science. Shiny is built upon R and enables you to write an interactive web page employing dynamic visualisations. This is a great way to “sell” your work to your “clients” through a clear message that non-technical decision makers can relate to.

  • Problems typical of the “real” world
    Real-life data is not like the numerous data sets that are available in the public domain. Real-life data sets are messy: they have ambiguity, missing data, useless variables, units, data gaps, measurement uncertainty, correlation, near-zero variance, too many variables, unbalanced categories, etc.

Pre-requisites

Subject to approval of the Head of Department of Mathematics and Statistics.

Timetable 2021

Students must attend one activity from each section.

Lecture A
Activity  Day       Time           Location                                    Weeks
01        Thursday  12:00 - 13:00  Psychology - Sociology 252 Lecture Theatre  19 Jul - 29 Aug, 13 Sep - 24 Oct

Lecture B
Activity  Day       Time           Location                                    Weeks
01        Friday    11:00 - 12:00  Ernest Rutherford 141                       19 Jul - 29 Aug, 13 Sep - 24 Oct

Computer Lab A
Activity  Day       Time           Location                                    Weeks
01        Monday    12:00 - 13:00  Ernest Rutherford 464 Computer Lab          19 Jul - 29 Aug, 13 Sep - 24 Oct

Course Coordinator / Lecturer

Nicholas Ward

Textbooks / Resources

There is no prescribed textbook.

Indicative Fees

Domestic fee $1,033.00

* All fees are inclusive of NZ GST or any equivalent overseas tax, and do not include any programme level discount or additional course-related expenses.

For further information see Mathematics and Statistics.
