Computational Linguistics

MSc Course, CuCEng, 2025

Course Objective

This course covers the theoretical foundations of modern language models, deep linguistic analysis, and LLM architectures. Unlike standard NLP engineering courses, it focuses on the “why” and “how” of linguistic theories as applied to deep learning architectures, moving from sub-word morphology to recent large language model alignment techniques.


Assessment and Evaluation

  • Paper Critiques (40%): Weekly critical analysis of the assigned seminal paper (10-12 papers throughout the semester).
  • Applied Term Project (60%): Literature review, implementation/coding, and final presentation.

Weekly Schedule

Week | Topic | Seminal Paper (Reading Assignment)
1 | Advanced Morphology & Sub-word Tokenization | Sennrich, R., et al. (2016). “Neural Machine Translation of Rare Words with Subword Units”. (BPE Algorithm)
2 | Syntactic Parsing | Dozat, T., & Manning, C. D. (2017). “Deep Biaffine Attention for Neural Dependency Parsing”.
3 | Distributional & Contextual Semantics | Peters, M. E., et al. (2018). “Deep contextualized word representations”. (ELMo)
4 | Compositional Semantics & Generalization | Lake, B., & Baroni, M. (2018). “Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks”.
5 | Transformer Architecture & Self-Attention | Vaswani, A., et al. (2017). “Attention Is All You Need”.
6 | LLM Training & PEFT (LoRA) | Hu, E. J., et al. (2021). “LoRA: Low-Rank Adaptation of Large Language Models”.
7 | Alignment & RLHF | Ouyang, L., et al. (2022). “Training language models to follow instructions with human feedback”. (InstructGPT)
8 | RAG & Information Retrieval | Lewis, P., et al. (2020). “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”.
9 | Long Context Management | Beltagy, I., et al. (2020). “Longformer: The Long-Document Transformer”.
10 | Next-Gen Evaluation Metrics | Zhang, T., et al. (2020). “BERTScore: Evaluating Text Generation with BERT”.
11 | Interpretability (XAI) & Probing | Hewitt, J., & Manning, C. D. (2019). “A Structural Probe for Finding Syntax in Word Representations”.
12 | Multilinguality & Low-Resource Languages | Conneau, A., et al. (2020). “Unsupervised Cross-lingual Representation Learning at Scale”. (XLM-R)
13 | Project Presentations - I | Literature Review and Methodology Presentation
14 | Project Presentations - II | Experimental Results and Final Presentation

📝 Paper Reading & Review Template

Students are expected to use the following 4-step template when preparing their weekly paper critiques:

1. Problem Identification: What specific problem are the authors trying to solve? Which gap in the existing literature are they aiming to fill?

2. Core Innovation (The Hook): What is unique about the proposed method or model? (e.g., is it a new architecture, a new loss function, or a new dataset?)

3. Evidence & Experiments: How do the authors support their claims? Which baseline models do they compare against, on which metrics, and where do they claim state-of-the-art (SOTA) results?

4. Critique (Weaknesses): What is missing, biased, or limited in the paper or method, and what needs future improvement?