CEng 574 Statistical Data Analysis

MIDDLE EAST TECHNICAL UNIVERSITY
DEPT. OF COMPUTER ENGINEERING

CEng 574 STATISTICAL DATA ANALYSIS
Fall 2022
Course Webpage

Instructor     Volkan Atalay
E-mail         vatalay AT metu.edu.tr
Office         A-406
Class          Thursday 9:40-12:30 at G-102 (this classroom is in the Computer Center)
Office hour    TBA

Course web page address http://blog.metu.edu.tr/vatalay/ceng-574-statistical-data-analysis/
Course Objectives
The objective of this course is to introduce the concepts and techniques of clustering and of multivariate and exploratory data analysis. The course also offers an opportunity to practice data analysis using data visualization, projection, and embedding.
Prerequisites     Knowledge of programming, probability, and linear algebra.
Main Reference Book
Alpaydın, Introduction to Machine Learning, 2nd Edition (2010) or 3rd Edition (2014), The MIT Press.
(Yapay Öğrenme, Turkish language edition, translated by the author, Boğaziçi Üniversitesi Yayınevi, 1st Edition 2011, 2nd Edition 2013, 3rd Edition 2017)

Course Outline
1          Data representation, distance metrics, and similarity measures
2          Linear and non-linear projection methods, data embedding methods
3          Data clustering algorithms and methods
4          Evaluation of clustering algorithms and validation of clusterings
5          Applications of data clustering in various fields such as bioinformatics and data stream analysis

Grading
Assignment #1, #2                      3 pts each     6
Assignment #3, #4, #5                 10 pts each    30
Assignment #6                          6 pts          6
Final dataset analysis                               18
Midterm                                              30
Attendance and class participation                   10

Notes and Remarks
Students coming from graduate programs other than the Computer Engineering program should attend the first class.

We will use ODTUClass to run the course and to share all course materials.
Assignments should be done on an individual basis.
The final dataset analysis will be carried out in teams of two.
Late submission policy: you have a total of 4 late-submission days.
Academic Integrity Guide for Students: http://oidb.metu.edu.tr/system/files/Academic%20Integrity%20Guide%20for%20Students.pdf

Introduction lecture slides

Related Links
PCA
A Tutorial on Principal Component Analysis, Jonathon Shlens, 2014, http://arxiv.org/pdf/1404.1100v1.pdf
A tutorial on Principal Components Analysis, Lindsay I Smith, February 26, 2002, http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf