Overview#

Welcome to STATS 305B! Officially, this course is called Applied Statistics II. Unofficially, I call it Models and Algorithms for Discrete Data. We will cover models ranging from generalized linear models to sequential latent variable models, autoregressive models, and transformers. On the algorithm side, we will cover a few techniques for convex optimization, as well as approximate Bayesian inference algorithms like MCMC and variational inference. I think the best way to learn these concepts is to implement them from scratch, so coding will be a big focus of this course. By the end of the course, you’ll have a strong grasp of classical techniques as well as modern methods for modeling discrete data.

Logistics#

Instructor: Scott Linderman
TAs: Amber Hu and Michael Salerno
Term: Winter 2024-25
Time: Monday and Wednesday, 1:30-2:50pm
Location: Sequoia Hall, Room 200, Stanford University

Office Hours

  • Scott: Wed 10-11am, Wu Tsai Neurosciences Institute, 2nd Floor in the Theory Center

  • Michael: Thu 5-7pm, Sequoia Library (Rm 105)

  • Amber: Fri 1:30-3:30pm, Sequoia Library (Rm 105)

Prerequisites#

Students should be comfortable with undergraduate probability and statistics as well as multivariate calculus and linear algebra. This course will emphasize implementing models and algorithms, so coding proficiency with Python is required. (HW0: Python Primer will help you get up to speed.)
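To give a sense of the level of Python fluency expected, here is a minimal, illustrative sketch of the kind of from-scratch implementation the homeworks involve: fitting a binary logistic regression by gradient descent with NumPy. This is an assumed example for calibration only, not course material, and all names in it are hypothetical.

```python
# Illustrative sketch: binary logistic regression fit by gradient descent.
import numpy as np

def sigmoid(a):
    """Logistic function mapping real values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def fit_logistic_regression(X, y, lr=0.1, num_iters=500):
    """Minimize the average negative log-likelihood of a logistic
    regression model with weights w, where X is (n, d) and y is (n,)."""
    w = np.zeros(X.shape[1])
    for _ in range(num_iters):
        p = sigmoid(X @ w)               # predicted probabilities
        grad = X.T @ (p - y) / len(y)    # gradient of the average NLL
        w -= lr * grad
    return w

# Quick check on synthetic data: the estimate should roughly recover w_true.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
w_true = np.array([2.0, -1.0])
y = (rng.uniform(size=500) < sigmoid(X @ w_true)).astype(float)
print(fit_logistic_regression(X, y))
```

If code at roughly this level feels unfamiliar, start with HW0 early.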

Books#

This course will draw from a few textbooks:

  • Agresti, Alan. Categorical Data Analysis, 2nd edition. John Wiley & Sons, 2002. link

  • Gelman, Andrew, et al. Bayesian Data Analysis, 3rd edition. Chapman and Hall/CRC, 2013. link

  • Bishop, Christopher. Pattern Recognition and Machine Learning. Springer, 2006. link

We will also cover material from research papers.

Schedule#

Please note that this is a tentative schedule. It may change slightly depending on our pace.

| Date | Topic | Slides | Additional Reading |
| --- | --- | --- | --- |
| Mon, Jan 6, 2025 | Basics of Probability and Statistics and Contingency Tables (HW0 Released) | download | [Agr02] Ch. 1-3 |
| Wed, Jan 8, 2025 | Logistic Regression | download | [Agr02] Ch. 4-5 |
| Fri, Jan 10, 2025 | HW0 Due | | |
| Mon, Jan 13, 2025 | Exponential Families (HW1 Released) | download | [Agr02] Ch. 4-5 |
| Wed, Jan 15, 2025 | Generalized Linear Models | download | [Agr02] Ch. 6 |
| Mon, Jan 20, 2025 | MLK Day. No class | | |
| Wed, Jan 22, 2025 | Sparse GLMs | download | [FHT10] and [LSS14] |
| Fri, Jan 24, 2025 | HW1 Due | | |
| Mon, Jan 27, 2025 | Bayesian Inference (HW2 Released) | download | [GCS+95] Ch. 1 |
| Wed, Jan 29, 2025 | Markov Chain Monte Carlo and Bayesian GLM Demo | download | |
| Mon, Feb 3, 2025 | Variational Inference | | |
| Wed, Feb 5, 2025 | Midterm Exam from 1:30-2:50pm in MCCULL 115 | download | |
| Mon, Feb 10, 2025 | Mixture Models and EM (HW2 Due; HW3 Released) | | [Bis06] Ch. 9 |
| Wed, Feb 12, 2025 | Hidden Markov Models | | [Bis06] Ch. 13 |
| Mon, Feb 17, 2025 | Presidents’ Day. No class | | |
| Wed, Feb 19, 2025 | Linear Dynamical Systems | | |
| Fri, Feb 21, 2025 | HW3 Due | | |
| Mon, Feb 24, 2025 | Variational Autoencoders (HW4 Released) | | [KW19] Ch. 1-2 |
| Wed, Feb 26, 2025 | Transformers | | [Tur23] |
| Mon, Mar 3, 2025 | State Space Layers (S4, S5, Mamba) | | [SWL23] and [GD23] |
| Wed, Mar 5, 2025 | Denoising Diffusion Models | | [TDM+24] |
| Mon, Mar 10, 2025 | Point Processes | | |
| Wed, Mar 12, 2025 | Wrap Up | | |
| Fri, Mar 14, 2025 | HW4 Due | | |

Assignments#

There will be five assignments, due roughly every other Friday. They will not be equally weighted: the first is just a primer to get you up to speed, and the last will be a bit more substantial than the rest.

Late Policy#

We will allow 5 late days to be used as needed throughout the quarter.

Exams#

  • Midterm Exam: In class on TBD

    • You may bring a cheat sheet covering one side of an 8.5x11” piece of paper

    • Practice Exam: download

    • Practice Exam Solutions: Coming soon

  • Final Exam: On TBD in Room TBD

    • You may bring a cheat sheet covering both sides of an 8.5x11” piece of paper

Grading#

Tentatively:

| Assignment | Percentage |
| --- | --- |
| HW 0 | 5% |
| HW 1-3 | 15% each |
| HW 4 | 20% |
| Midterm | 10% |
| Final | 15% |
| Participation | 5% |