Overview

Welcome to STATS 305B! Officially, this course is called Applied Statistics II. Unofficially, I call it Models and Algorithms for Discrete Data. We will cover models ranging from generalized linear models to sequential latent variable models, autoregressive models, and transformers. On the algorithm side, we will cover a few techniques for convex optimization, as well as approximate Bayesian inference algorithms like MCMC and variational inference. I think the best way to learn these concepts is to implement them from scratch, so coding will be a big focus of this course. By the end of the course, you’ll have a strong grasp of classical techniques as well as modern methods for modeling discrete data.

Logistics

Instructor: Scott Linderman
TAs: Amber Hu
Term: Winter 2024-25
Time: Monday and Wednesday, 1:30-2:50pm
Location: Sequoia Hall, Room 200, Stanford University

Office Hours

  • Scott: TBD

  • Amber: TBD

Prerequisites

Students should be comfortable with undergraduate probability and statistics as well as multivariate calculus and linear algebra. This course will emphasize implementing models and algorithms, so coding proficiency with Python is required. (HW0: Python Primer will help you get up to speed.)

Books

This course will draw from a few textbooks:

  • Agresti, Alan. Categorical Data Analysis, 2nd edition. John Wiley & Sons, 2002.

  • Gelman, Andrew, et al. Bayesian Data Analysis, 3rd edition. Chapman and Hall/CRC, 2013.

  • Bishop, Christopher. Pattern Recognition and Machine Learning. Springer, 2006.

We will also cover material from research papers.

Schedule

Please note that this is a tentative schedule. It may change slightly depending on our pace.

| Date | Topic | Reading |
| --- | --- | --- |
| Jan 6, 2025 | Discrete Distributions and the Basics of Statistical Inference | [Agr02] Ch. 1 |
| Jan 8, 2025 | Contingency Tables | [Agr02] Ch. 2-3 |
| Jan 13, 2025 | Logistic Regression | [Agr02] Ch. 4-5 |
| Jan 15, 2025 | Exponential Families | [Agr02] Ch. 4-5 |
| Jan 20, 2025 | MLK Day. No class | |
| Jan 22, 2025 | Generalized Linear Models | [Agr02] Ch. 6 |
| Jan 27, 2025 | Bayesian Inference | [GCS+95] Ch. 1 |
| Jan 29, 2025 | Bayesian GLMs | [AC93] |
| Feb 3, 2025 | L1-regularized GLMs | [FHT10] and [LSS14] |
| Feb 5, 2025 | Midterm (in class) | |
| Feb 10, 2025 | Mixture Models and EM | [Bis06] Ch. 9 |
| Feb 12, 2025 | Hidden Markov Models | [Bis06] Ch. 13 |
| Feb 17, 2025 | Presidents’ Day. No class | |
| Feb 19, 2025 | Variational Autoencoders (Demo) | [KW19] Ch. 1-2 |
| Feb 24, 2025 | Recurrent Neural Networks | [GBC16] Ch. 10 |
| Feb 26, 2025 | Transformers | [Tur23] |
| Mar 3, 2025 | State Space Layers (S4, S5, Mamba). Guest lecture by Jimmy Smith | [SWL23] and [GD23] |
| Mar 5, 2025 | Random Graph Models | |
| Mar 10, 2025 | Denoising Diffusion Models | [TDM+24] |
| Mar 12, 2025 | Wrap Up | |

Assignments

There will be 5 assignments due roughly every other Friday. They will not be equally weighted. The first one is just a primer to get you up to speed; the last one will be a bit more substantial than the rest.

Exams

  • Midterm Exam: In class on TBD

    • You may bring a cheat sheet covering one side of an 8.5x11” piece of paper

  • Final Exam: On TBD in Room TBD

    • You may bring a cheat sheet covering both sides of an 8.5x11” piece of paper

Grading

Tentatively:

| Assignment | Percentage |
| --- | --- |
| HW 0 | 5% |
| HW 1-3 | 15% each |
| HW 4 | 20% |
| Midterm | 10% |
| Final | 15% |
| Participation | 5% |
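
As a rough illustration of how the tentative weights above combine, here is a minimal Python sketch. It assumes a plain weighted average of component scores on a 0-100 scale. The function name `course_grade` and the example scores are hypothetical, and the sketch does not model curving, late penalties, or any other policy not stated in this syllabus.

```python
# Tentative weights from the grading table (percent of the final grade).
WEIGHTS = {
    "HW 0": 5,
    "HW 1": 15,
    "HW 2": 15,
    "HW 3": 15,
    "HW 4": 20,
    "Midterm": 10,
    "Final": 15,
    "Participation": 5,
}


def course_grade(scores):
    """Return the weighted average of per-component scores (each 0-100).

    Assumption: a simple weighted average with no curve, dropped
    assignments, or late penalties; the syllabus does not specify these.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# The weights should account for the whole grade.
assert sum(WEIGHTS.values()) == 100

# Hypothetical scores, for illustration only.
example_scores = {name: 90 for name in WEIGHTS}
example_scores["Final"] = 80
print(course_grade(example_scores))
```

Because the weights are tentative, the dictionary is the only thing that would need to change if the breakdown is revised.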