Advanced Machine Learning

MSc Course, CuCEng, 2025

Course Description

This Master’s-level course dives deep into the theoretical and practical foundations of modern machine learning, focusing primarily on Deep Learning (DL) architectures, advanced optimization techniques, and probabilistic models. The course adopts a seminar- and project-based structure, requiring students to engage critically with seminal papers and implement sophisticated models.

Students are assessed solely on weekly paper critiques and a substantial hands-on term project; there are no written exams.


Assessment and Evaluation

Component                               Weight   Description
Weekly Paper Critique & Participation   40%      Critical analysis of the assigned reading and active engagement in discussion.
Final Project                           60%      Code implementation, technical report, and final presentation.

Weekly Schedule and Reading List

The curriculum is structured around three modules: Deep Foundations, Architectures & Generative Models, and Special Topics.

Each week lists the topic, key subtopics, and the primary reading (paper or book chapter).

Week 1: Foundations & Advanced Optimization (Review of SGD, Momentum, Adam)
  Book Chapter: Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. Chapter 8 (Optimization).

Week 2: Regularization in DL (Batch Normalization, Dropout, L2)
  Paper: Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML.

Week 3: Probabilistic ML & GPs (Gaussian Processes, Bayesian Inference)
  Book Chapter: Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer. Chapters 2 & 6 (Gaussian Processes).

Week 4: Convolutional Architectures (ResNets, VGG, Modern CNNs)
  Paper: He, K., et al. (2016). Deep Residual Learning for Image Recognition. CVPR.

Week 5: Sequence Modeling with RNNs (LSTMs, GRUs, Backpropagation Through Time)
  Paper: Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation.

Week 6: Attention Mechanisms (Soft Attention, Global/Local Attention)
  Paper: Bahdanau, D., et al. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR.

Week 7: The Transformer Model (Architecture, Positional Encoding, Multi-head Attention)
  Seminal Paper: Vaswani, A., et al. (2017). Attention Is All You Need. NeurIPS.

Week 8: Midterm week (no class; this course has no exams).

Week 9: Variational Autoencoders (VAEs) (ELBO, Latent Space Sampling)
  Paper: Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. ICLR.

Week 10: Generative Adversarial Networks (GANs) (Minimax Game, WGANs, Training Stability)
  Paper: Goodfellow, I., et al. (2014). Generative Adversarial Nets. NeurIPS.

Week 11: Transfer Learning & Fine-Tuning (Pre-trained Models: BERT, ViT, Adapter Layers)
  Paper: Devlin, J., et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL.

Week 12: Reinforcement Learning & Constitutional AI (RLHF, Value Alignment, Ethical Control)
  Paper: Bai, Y., et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv.

Week 13: Explainable AI (XAI) & Ethics (LIME, SHAP, Interpretability vs. Explainability)
  Survey Paper: Lipton, Z. C. (2018). The Mythos of Model Interpretability. ACM Queue.

Week 14: Final Project Presentations. Demonstration of working models and presentation of the final research findings.

Term Project Topics

Students are required to select one topic and deliver a functional codebase, a technical report, and a final presentation. The projects must involve the implementation or fine-tuning of an advanced ML architecture.

1. Transformer Implementation for Time Series Forecasting

Goal: Implement the Transformer architecture (or a specialized variant such as Informer/Autoformer) from scratch and apply it to a multivariate time series forecasting task (e.g., energy consumption or stock prices); a starting-point sketch follows below.

  • Focus: Attention mechanism visualization and complexity analysis.
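As a non-authoritative starting point, the sketch below shows two core pieces such a forecaster is usually built around (sinusoidal positional encoding and an encoder-only Transformer backbone), assuming PyTorch is available. The class names (PositionalEncoding, TinyForecaster), layer sizes, and window shape are illustrative assumptions, not project requirements.

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Sinusoidal positional encoding as in Vaswani et al. (2017)."""
    def __init__(self, d_model: int, max_len: int = 5000):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        return x + self.pe[: x.size(1)]

class TinyForecaster(nn.Module):
    """Encoder-only Transformer mapping a history window to the next step."""
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        self.pos = PositionalEncoding(d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_features)

    def forward(self, x):  # x: (batch, seq_len, n_features)
        h = self.encoder(self.pos(self.embed(x)))
        return self.head(h[:, -1])  # one-step-ahead prediction

model = TinyForecaster(n_features=8)
window = torch.randn(32, 96, 8)  # 32 series, 96 past steps, 8 variables
print(model(window).shape)       # torch.Size([32, 8])
```

For the complexity part of the focus item, note that full self-attention costs O(n^2) in the window length n, which is exactly the term Informer/Autoformer-style variants are designed to reduce.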

2. Controllable Image Generation using VAEs/GANs

Goal: Implement a VAE or a conditional GAN (cGAN) and train it on a specific dataset (e.g., CelebA for faces or a custom dataset); see the VAE sketch below.

  • Focus: Manipulate the latent space to control a specific output feature (e.g., generating images with specific attributes).
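For the VAE route, the implementation hinges on the reparameterization trick and the ELBO objective. Below is a minimal fully connected sketch, assuming PyTorch; TinyVAE, the layer sizes, and the Gaussian (MSE) reconstruction term are illustrative assumptions, and CelebA would call for a convolutional encoder/decoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    def __init__(self, x_dim: int = 784, z_dim: int = 16):
        super().__init__()
        self.enc = nn.Linear(x_dim, 256)
        self.mu = nn.Linear(256, z_dim)
        self.logvar = nn.Linear(256, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def negative_elbo(x_hat, x, mu, logvar):
    # Reconstruction term plus KL(q(z|x) || N(0, I)), both summed over the batch.
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# Latent traversal: sweep one coordinate of z to probe what it controls.
vae = TinyVAE()
z = torch.zeros(7, 16)
z[:, 0] = torch.linspace(-3, 3, 7)  # vary dimension 0, hold the rest at 0
samples = vae.dec(z)                # decode the interpolated codes
```

The traversal at the end is the seed of the focus item: with a trained model, sweeping one latent coordinate while fixing the others reveals which output attribute (if any) that dimension controls.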

3. Deep Reinforcement Learning for Simple Control

Goal: Apply Deep Q-Learning (DQN) or Policy Gradients (REINFORCE) to solve a classic OpenAI Gym control problem (e.g., CartPole, LunarLander); a REINFORCE sketch appears below.

  • Focus: Compare the performance of different reward functions and network architectures.
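A minimal REINFORCE sketch on CartPole follows, assuming the gymnasium package (the maintained successor to OpenAI Gym; classic gym's reset/step return values differ slightly). The network width, learning rate, and episode budget are arbitrary illustrative choices.

```python
import gymnasium as gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

for episode in range(200):
    obs, _ = env.reset()
    log_probs, rewards, done = [], [], False
    while not done:
        dist = torch.distributions.Categorical(
            logits=policy(torch.as_tensor(obs, dtype=torch.float32))
        )
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, terminated, truncated, _ = env.step(action.item())
        rewards.append(reward)
        done = terminated or truncated
    # Discounted returns, accumulated backwards through the episode.
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    returns = torch.tensor(returns[::-1])
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # variance reduction
    loss = -(torch.stack(log_probs) * returns).sum()  # policy-gradient objective
    opt.zero_grad(); loss.backward(); opt.step()
```

For the focus item, reward shaping can be tested by transforming `reward` before appending it, and architecture comparisons by swapping out the policy network.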

4. Zero/Few-Shot Transfer Learning with LLMs

Goal: Fine-tune a pre-trained large language model (e.g., a smaller open-source BERT or GPT variant) for a highly specific, low-data downstream task (e.g., legal document classification or niche-domain intent detection); a fine-tuning sketch follows below.

  • Focus: Evaluate performance using zero-shot, one-shot, and few-shot learning setups.
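A minimal supervised fine-tuning sketch, assuming the Hugging Face transformers library; the distilbert-base-uncased checkpoint, the two toy sentences, and the binary legal/other label scheme are placeholders rather than project requirements.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased"  # placeholder; any small BERT/GPT variant works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# A deliberately tiny labeled set, standing in for the few-shot setting.
texts = ["The party of the first part shall indemnify...", "Dear team, see the attached agenda."]
labels = torch.tensor([1, 0])  # 1 = legal, 0 = other (illustrative labels)

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few passes over the tiny set
    out = model(**batch, labels=labels)  # the loss is computed internally
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

For the focus item, hold out a fixed evaluation set and vary only how many labeled examples enter this loop (zero, one, a handful), keeping everything else constant across runs.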

5. Interpreting Black-Box Models using XAI

Goal: Take a high-performing classification model (CNN or Transformer) and apply two different local explanation techniques (LIME and SHAP), as sketched below.

  • Focus: Compare the generated explanations for consistency and robustness across multiple data points and classes.
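A minimal sketch of the comparison on a tabular stand-in model, assuming the scikit-learn, lime, and shap packages are installed; an image CNN or Transformer would swap in the corresponding explainers (e.g., LIME's image explainer and a model-appropriate SHAP explainer).

```python
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# A tabular stand-in for the project's "black box"; swap in your own model.
data = load_breast_cancer()
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)
x = data.data[0]

# LIME: fit a local linear surrogate around one instance.
lime_explainer = LimeTabularExplainer(
    data.data, feature_names=list(data.feature_names), mode="classification"
)
lime_exp = lime_explainer.explain_instance(x, clf.predict_proba, num_features=5)
print(lime_exp.as_list())  # top local feature weights

# SHAP: Shapley-value attributions from the tree-specific explainer.
shap_values = shap.TreeExplainer(clf).shap_values(x.reshape(1, -1))
print(shap_values)  # per-feature attributions (per class for classifiers)
```

Running both explainers over many instances and comparing the resulting feature rankings (e.g., by rank correlation) is one concrete way into the consistency and robustness questions in the focus item.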