Computational Linguistics

MSc Course, CuCEng, 2025

Course Objective

This course covers the theoretical foundations of modern language models, deep linguistic analysis, and LLM architectures. Unlike standard NLP engineering courses, it focuses on the “why” and “how” of linguistic theories as applied to deep learning architectures, moving from sub-word morphology to recent alignment techniques for large language models.

Assessment and Evaluation

Paper Critiques (40%): Weekly critical analysis of the assigned seminal paper (10-12 papers throughout the semester).
Applied Term Project (60%): Literature review, implementation/coding, and final presentation.

Weekly Schedule

Week	Topic	Seminal Paper (Reading Assignment)
1	Advanced Morphology & Sub-word Tokenization	Sennrich, R., et al. (2016). “Neural Machine Translation of Rare Words with Subword Units”. (BPE Algorithm)
2	Syntactic Parsing	Dozat, T., & Manning, C. D. (2017). “Deep Biaffine Attention for Neural Dependency Parsing”.
3	Distributional & Contextual Semantics	Peters, M. E., et al. (2018). “Deep contextualized word representations”. (ELMo)
4	Compositional Semantics & Generalization	Lake, B., & Baroni, M. (2018). “Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks”.
5	Transformer Architecture & Self-Attention	Vaswani, A., et al. (2017). “Attention Is All You Need”.
6	LLM Training & PEFT (LoRA)	Hu, E. J., et al. (2021). “LoRA: Low-Rank Adaptation of Large Language Models”.
7	Alignment & RLHF	Ouyang, L., et al. (2022). “Training language models to follow instructions with human feedback”. (InstructGPT)
8	RAG & Information Retrieval	Lewis, P., et al. (2020). “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”.
9	Long Context Management	Beltagy, I., et al. (2020). “Longformer: The Long-Document Transformer”.
10	Next-Gen Evaluation Metrics	Zhang, T., et al. (2020). “BERTScore: Evaluating Text Generation with BERT”.
11	Interpretability (XAI) & Probing	Hewitt, J., & Manning, C. D. (2019). “A Structural Probe for Finding Syntax in Word Representations”.
12	Multilinguality & Low-Resource Languages	Conneau, A., et al. (2020). “Unsupervised Cross-lingual Representation Learning at Scale”. (XLM-R)
13	Project Presentations - I	Literature Review and Methodology Presentation
14	Project Presentations - II	Experimental Results and Final Presentation

Paper Reading & Review Template

Students are expected to use the following 4-step template when preparing their weekly paper critiques:

1. Problem Identification: What specific problem are the authors trying to solve? Which gap in the existing literature are they aiming to fill?
2. Core Innovation (The Hook): What is unique about the proposed method or model? (e.g., is it a new architecture, a new loss function, or a new dataset?)
3. Evidence & Experiments: How did the authors support their claims? Which baseline models did they compare against, on which metrics, and did they reach state-of-the-art (SOTA) results?
4. Critique (Weakness): What is missing, biased, or limited in the paper or method, and what needs future improvement?