Syllabus

Readings & Textbooks

Readings will frequently be drawn from the following textbooks:

  1. Daniel Jurafsky and James H. Martin. Speech and Language Processing. Draft chapters for the third edition can be found here. (I refer to this book as “SLP3” in the syllabus.)

    This textbook is the single most comprehensive and up-to-date introduction available to the field of computational linguistics. Note: we will also be using some chapters from the second edition (2008); these will be referred to as J&M2008 on the syllabus, and will be available on Canvas.

  2. Jacob Eisenstein. 2019. Introduction to Natural Language Processing. MIT Press. Pre-publication PDF available here.

    This is an excellent, more recent textbook on natural language processing, with contemporary deep learning thoroughly integrated. We will be using the pre-publication version, which is freely available under the Creative Commons CC BY-NC-ND license, here.

  3. Steven Bird, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. O’Reilly Media. (I refer to this book as “NLTK” in the syllabus.)

    This is the book for the Natural Language Toolkit (or NLTK), which we may be using for programming. We will be doing some of our programming in the Python programming language, and will make quite a bit of use of it. You can buy this book, or you can freely access it on the Web at http://www.nltk.org/book.
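
    To give a flavor of the kind of Python/NLTK programming involved, here is a minimal sketch, assuming NLTK is installed (pip install nltk); the example text is invented for illustration and is not drawn from the course materials. It tokenizes a short text and counts token frequencies:

        # Minimal NLTK sketch: tokenize a text and count token frequencies.
        # Assumes NLTK is installed; newer NLTK releases may name the
        # tokenizer resource "punkt_tab" rather than "punkt".
        import nltk

        nltk.download("punkt", quiet=True)  # one-time download of tokenizer models

        text = "Colorless green ideas sleep furiously. Ideas sleep; ideas dream."
        tokens = nltk.word_tokenize(text.lower())  # split the string into word tokens
        freqs = nltk.FreqDist(tokens)              # frequency distribution over tokens
        print(freqs.most_common(3))  # e.g. [('ideas', 3), ('sleep', 2), ('.', 2)]

    The NLTK book walks through this style of corpus exploration in much more depth, starting in its opening chapters.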

  4. Emily Bender. 2013. Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax. Springer. If you are authenticated using your MIT credentials, you can download this book in PDF format here.

  5. Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press. Book chapter PDFs can be obtained through the MIT library website. (I refer to this book as “M&S” in the syllabus.)

    This is an older but still very useful book on natural language processing (NLP).

We’ll also frequently draw on other sources for readings, including original research papers in computational linguistics, psycholinguistics, and other areas of the cognitive science of language.

Schedule

9.19: Computational Psycholinguistics
| Week | Day | Topic | Videos | Slides | Readings | In-class handouts and exercises | Optional videos | Related readings | Problem sets |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Week 1 | Wed 6 Sep | Course Introduction and Intro to Probability Theory | Introductory Probability Theory | Introductory Probability Theory | M&S 2.1 | | | Goldsmith, 2007 | Pset 1 out |
| Week 2 | Mon 11 Sep | Introductory Speech Perception | Introduction to Speech Perception and Rational Analysis | Speech Perception and Rational Analysis (with builds; no builds) | PMSL 5.2.4 | | | | |
| | Wed 13 Sep | Speech Perception (continued) and Introductory Rational Analysis | | | Anderson 1990, Chapter 1 | | | Clayards et al., 2008 | |
| Week 3 | Mon 18 Sep | Introduction to Word Meaning | | Introduction to word meaning (with builds; no builds) | Lake & Murphy, 2023 | | | | |
| | Wed 20 Sep | Word Embeddings | | Word embeddings (with builds; no builds) | Jurafsky & Martin SLP3 draft, Chapter 6 | | | | Pset 2 out |
| Week 4 | Mon 25 Sep | Sentences, N-grams, and language models | Introductory language models, part 1; Introductory language models, part 2 | Introductory Language Models (no builds; with builds) | SLP3 Chapter 3, Sections 3.1–3.5 and 3.8 | | | | Pset 1 due |
| | Wed 27 Sep | Psycholinguistic methods | Introduction to Psycholinguistic Methods (Part 1; Part 2: visual world; Part 3: reading; Part 4: neural methods) | Psycholinguistic methods, prediction in human language processing, and surprisal theory (no builds; with builds) | Kutas et al., 2011 | | | | |
| Week 5 | Mon 2 Oct | Regular expressions, phonotactics, and finite-state machines | Part 1: Regular expressions; Part 2: Phonotactics; Part 3: Finite-state automata | Regular expressions and finite-state machines | Eisenstein, 2018, Section 9.1; Optional: J&M2008 Sections 2.1–2.4 and Chapter 3 (downloadable from Canvas) | Finite-state syntax fragment | | | |
| | Wed 4 Oct | Regular languages, morphology, and syntax. Is human language finite-state? | Regular languages and their relation with finite-state models; Finite-state models, English syntax, and weak vs. strong generative capacity; Multiple center-embedding, the pumping lemma, and limitations of finite-state machines | Regular expressions and finite-state machines | | | Finite-state transducers | | Pset 2 due, Pset 3 out |
| Week 6 | Mon 9 Oct | Indigenous Peoples’ Day, no class | | | | | | | |
| | Wed 11 Oct | Context-free grammars; syntactic analysis | Context-free grammars (part 1; part 2; part 3) | Context-free grammars | SLP3 Chapter 17, Sections 1–5; Bender 2013, Chapters 5 (Syntax) and 6 (Parts of speech) | Context-free grammar fragment | | | |
| Week 7 | Mon 16 Oct | Context-free grammars and syntactic analysis, continued | Unbounded dependency constructions (part 1; part 2; part 3) | Context-free grammars, part 2 | Bender 2013, Chapter 7 (Heads, arguments, and adjuncts); NLTK book, Chapter 8, Sections 1–5 | More CFG fragment development | | | |
| | Wed 18 Oct | Probabilistic context-free grammars | Surprisal as a measure of linguistic expectation; Syntactic corpus annotation and the Penn Treebank; Syntactic ambiguity and interpretation preferences; Probabilistic context-free grammars and the probabilistic Earley algorithm; Human syntactic processing and surprisal: garden-pathing | Probabilistic context-free grammars, garden-pathing, and surprisal (with builds; no builds) | SLP3 Appendix C; Levy, 2013; NLTK book, Chapter 8, Section 6 | Syntactic ambiguity and tree search | | | Pset 3 due, Pset 4 out |
| Week 8 | Mon 23 Oct | Midterm Exam | | | | | | | |
| | Wed 25 Oct | Bayes Nets and the perceptual magnet | Bayes Nets; The perceptual magnet effect: a Bayesian account | Bayes Nets (with builds; no builds); Perceptual Magnet (with builds; no builds) | Russell & Norvig, 2010, Chapter 14; Feldman & Griffiths, 2006; Levy in progress, Directed Graphical Models appendix | Bayes nets and the perceptual magnet | | | |
| Week 9 | Mon 30 Oct | Multi-factor and hierarchical models: logistic regression; word-order preferences in language; the binomial construction | Binomial ordering preferences; Using n-gram statistics to study binomials; Logistic regression; Idiosyncrasy and hierarchical models | Binomials & Logistic Regression (with builds; no builds) | Russell & Norvig 2010, Chapter 18.6 (on Canvas); Morgan & Levy 2015 | Binomials and logistic regression | | | Pset 4 due |
| | Wed 1 Nov | Neural Networks for Natural Language | Introduction to neural networks; The neural n-gram model | Logistic regression recap, and simple neural networks (no builds; with builds); Neural networks for natural language (no builds; with builds) | | | Recap of binomials and logistic regression; Simple recurrent networks; GRUs and LSTMs; Learning the “counting language” | | |
| Week 10 | Mon 6 Nov | Transformers and large language models | Transformer model architecture; Interacting with GPT-2; Targeted syntactic evaluation with SyntaxGym; Filler–Gap Dependencies; Island constraints | Transformers, targeted syntactic evaluation, and learnability (no builds; with builds) | Vaswani et al., 2017; Sasha Rush’s The Annotated Transformer; Wilcox et al., 2018 | Transformer language models | | | |
| | Wed 8 Nov | Predictive processing in human language comprehension | | Predictive processing in human language comprehension (no builds; with builds) | Kutas et al., 2011 | Testing GPT-2 | | | Pset 5 out |
| Week 11 | Mon 13 Nov | Noisy-channel language understanding | Language comprehension and rational analysis; Surprisal and local coherence effects; Noisy channel models; Accounting for local coherence effects, part 1; Accounting for local coherence effects, part 2 | Noisy-channel language processing (no builds; with builds) | Gibson, Bergen, and Piantadosi, 2013 | | | | |
| | Wed 15 Nov | Natural language semantics I | | Introductory compositional semantics (with builds; no builds) | | | | | Pset 6 out |
| Week 12 | Mon 20 Nov | Natural language semantics II | | | | | | | |
| | Wed 22 Nov | Pragmatics | | Introductory Bayesian Pragmatics (with builds; no builds) | Goodman & Frank, 2016 | | | Grice, 1975 | Pset 5 due |
| Week 13 | Mon 27 Nov | Human Language Production | | Language Production I | Levy & Jaeger, 2006 | | | | Pset 7 out |
| | Wed 29 Nov | Human Language Production, continued | | Language Production II | Zhan & Levy, 2018; Clark et al., 2022 | | | | Pset 6 due |
| Week 14 | Mon 4 Dec | Language development and acquisition | First language acquisition as unsupervised learning; Learning vowel categories; Conjugacy; Gibbs sampling | | Feldman et al., 2013 (Simulations 2–4 not necessary but encouraged) | | | | |
| | Wed 6 Dec | Language development and acquisition II | Unsupervised word segmentation; Transition probabilities; Generative model for word segmentation; Bigram model for word segmentation | | Goldwater et al., 2009 | | | | |
| | Fri 8 Dec | | | | | | | | Pset 7 due; Graduate course projects due |
| Week 15 | Mon 11 Dec | The diversity of languages across the world | | Global linguistic diversity (with builds; no builds) | | | | | |
| | Wed 13 Dec | End-of-semester review | | | | | | | |
| Final Exam | Wed 20 Dec | Final Exam | | | | | | | |