Overview
Welcome to STATS 305B! Officially, this course is called Applied Statistics II. Unofficially, I’m calling it Models and Algorithms for Discrete Data, because that’s what it’s really about. We will cover models ranging from generalized linear models to sequential latent variable models, autoregressive models, and transformers. On the algorithm side, we will cover a few techniques for convex optimization, as well as approximate Bayesian inference algorithms like MCMC and variational inference. I think the best way to learn these concepts is to implement them from scratch, so coding will be a big focus of this course. By the end of the course, you’ll have a strong grasp of classical techniques as well as modern methods for modeling discrete data.
Logistics
Instructor: Scott Linderman
TAs: Xavier Gonzalez and Leda Liang
Term: Winter 2023-24
Time: Monday and Wednesday, 1:30-2:50pm
Location: Room 380-380D, Stanford University
Office Hours
Scott: Wednesday 9-10am in the 2nd floor lounge of the Wu Tsai Neurosciences Institute
Leda: Thursday 5-7pm in Sequoia Hall, Room 207 (Bowker)
Xavier: Friday 3-5pm in Building 360, Room 361A
Prerequisites
Students should be comfortable with undergraduate probability and statistics as well as multivariate calculus and linear algebra. This course will emphasize implementing models and algorithms, so coding proficiency with Python is required. (HW0: Python Primer will help you get up to speed.)
Books
This course will draw from a few textbooks:
Agresti, Alan. Categorical Data Analysis, 2nd edition. John Wiley & Sons, 2002.
Gelman, Andrew, et al. Bayesian Data Analysis, 3rd edition. Chapman and Hall/CRC, 2013.
Bishop, Christopher. Pattern Recognition and Machine Learning. Springer, 2006.
We will also cover material from research papers.
Schedule
Please note that this is a tentative schedule. It may change slightly depending on our pace.
| Date | Topic | Reading |
| --- | --- | --- |
| Jan 8, 2024 | Discrete Distributions and the Basics of Statistical Inference | [Agr02] Ch. 1 |
| Jan 10, 2024 | | [Agr02] Ch. 2-3 |
| Jan 15, 2024 | MLK Day. No class | |
| Jan 17, 2024 | | [Agr02] Ch. 4-5 |
| Jan 22, 2024 | | [Agr02] Ch. 4-5 |
| Jan 24, 2024 | | [Agr02] Ch. 6 |
| Jan 29, 2024 | | [GCS+95] Ch. 1 |
| Jan 31, 2024 | | [AC93] |
| Feb 5, 2024 | | |
| Feb 7, 2024 | Midterm (in class) | |
| Feb 12, 2024 | | [Bis06] Ch. 9 |
| Feb 14, 2024 | | [Bis06] Ch. 13 |
| Feb 19, 2024 | Presidents’ Day. No class | |
| Feb 21, 2024 | | [KW19] Ch. 1-2 |
| Feb 26, 2024 | | [GBC16] Ch. 10 |
| Feb 28, 2024 | | [Tur23] |
| Mar 4, 2024 | State Space Layers (S4, S5, Mamba) | |
| Mar 6, 2024 | | |
| Mar 11, 2024 | Cancelled | |
| Mar 13, 2024 | | [TDM+24] |
Assignments
There will be 5 assignments due roughly every other Friday. They will not be equally weighted. The first one is just a primer to get you up to speed; the last one will be a bit more substantial than the rest.
Homework 0: Python Primer
Released Mon, Jan 8, 2024
Due Fri, Jan 12, 2024 at 11:59pm
Homework 1: Logistic Regression
Released Wed, Jan 17, 2024
Due Fri, Jan 26, 2024 at 11:59pm
Homework 2
Released Wed, Jan 31, 2024
Due Wed, Feb 14, 2024 at 11:59pm
Homework 3: Hidden Markov Models
Released Fri, Feb 16, 2024
Due Mon, Feb 26, 2024 at 11:59pm
Homework 4: Large Language Models
Released Wed, Feb 28, 2024
Due Fri, Mar 15, 2024 at 11:59pm
Exams
Midterm Exam: In class on Wed, Feb 7, 2024
You may bring a cheat sheet covering one side of an 8.5x11” piece of paper
Final Exam: Wed, March 20, 2024 from 3:30-6:30pm in Room 530-127
In addition to reviewing the midterm and the lecture notes, you may want to try these practice problems (solutions are here).
You may bring a cheat sheet covering both sides of an 8.5x11” piece of paper
Grading
Tentatively:
| Assignment | Percentage |
| --- | --- |
| HW 0 | 5% |
| HW 1-3 | 15% each |
| HW 4 | 20% |
| Midterm | 10% |
| Final | 15% |
| Participation | 5% |
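As a quick sanity check, the tentative weights above sum to 100%. The sketch below shows one way they might combine into an overall score. It assumes a plain weighted sum, which the percentages suggest but the syllabus does not spell out. The function name `weighted_grade` and the example scores are hypothetical and used only for illustration.

```python
# Tentative weights from the grading table (percent of the overall grade).
WEIGHTS = {
    "HW 0": 5,
    "HW 1": 15,
    "HW 2": 15,
    "HW 3": 15,
    "HW 4": 20,
    "Midterm": 10,
    "Final": 15,
    "Participation": 5,
}

# The weights should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def weighted_grade(scores):
    """Combine per-component scores (fractions in [0, 1]) into a percentage.

    Assumes a plain weighted sum; this is an illustration, not official policy.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores: 90% on everything except 80% on the final.
example_scores = {name: 0.9 for name in WEIGHTS}
example_scores["Final"] = 0.8
print(f"Overall: {weighted_grade(example_scores):.1f}%")
```

Because the weights are tentative, treat any number this produces as a rough estimate.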