DI 725 – Transformers and Attention-Based Deep Networks

This course explores advanced concepts and applications of transformers and attention-based models across several domains, focusing on natural language processing (NLP), time series, computer vision, and unified vision-language understanding. It covers topics such as attention, the vanilla transformer, large language models (LLMs), LLM frameworks, NLP applications with LLMs, unified vision-language understanding and multi-modal transformers, distillation and data-efficient transformers, explainability, flash attention, in-context learning, prompting, and ethical concerns. The course addresses both theoretical and practical aspects of these topics and presents real-world use cases.

MMI 712 – Machine Learning Systems Design and Deployment

The course covers several aspects of designing reliable and scalable machine learning systems for real-world deployment. It deals with the development of production-quality models and introduces the machine learning pipeline, concepts in machine learning system design, and data engineering. It provides know-how on model development, scaling up training for large models, and the evaluation, calibration, and debugging of these models. Generating reproducible models via experiment-tracking tools and model versioning is also covered. Hardware platforms and frameworks for deployment are introduced, followed by basic deployment concepts, containerized deployment, and testing.

MMI 727 – Deep Learning: Methods and Applications

This course aims to give background knowledge on several topics related to deep learning and provide a laboratory environment for practical applications. Backpropagation, convolutional neural networks, generative adversarial networks, energy-based learning, optimization techniques, recurrent networks, long short-term memory (LSTM), and deep reinforcement learning are some of the core topics that will be covered in the lectures. The course aims to balance theory and practice: students will implement various algorithms, test them in several domains, and access GPU clouds during laboratory sessions to code the program examples using TensorFlow.

MMI 713 – Applied Parallel Programming on GPU

The course has been designed to give hands-on knowledge and development experience in general-purpose GPU programming. Students will learn about the GPU as part of the PC architecture. They will then learn to develop GPU software using CUDA C and OpenCL. Various optimization issues, particularly the effective use of memory and floating-point calculations, will be discussed, and the concepts and effects of optimization will be demonstrated with case studies. Students will be expected to propose a computationally expensive problem, implement and optimize it on the GPU, and compare the performance results with a CPU implementation. They are also expected to compare various optimization strategies.