Overview

Overview#

Welcome to STATS 305B! Officially, this course is called Applied Statistics II. Unofficially, I call it Models and Algorithms for Discrete Data. We will cover models ranging from generalized linear models to sequential latent variable models, autoregressive models, and transformers. On the algorithm side, we will cover a few techniques for convex optimization, as well as approximate Bayesian inference algorithms like MCMC and variational inference. I think the best way to learn these concepts is to implement them from scratch, so coding will be a big focus of this course. By the end of the course, you’ll have a strong grasp of classical techniques as well as modern methods for modeling discrete data.

Logistics#

Instructor: Scott Linderman
TAs: Amber Hu and Michael Salerno
Term: Winter 2024-25
Time: Monday and Wednesday, 1:30-2:50pm
Location: Sequoia Hall, Room 200, Stanford University

Office Hours

Scott: Wed 10-11am, Wu Tsai Neurosciences Institute, 2nd Floor in the Theory Center
Michael: Thu, 5-7pm, Sequoia library (Rm 105)
Amber: Fri 1:30-3:30pm, Sequoia library (Rm 105) [except Feb 7 and 14]
- [Feb 3 and 10 only] Mon 10am-12pm, Wu Tsai Neurosciences Institute, 2nd Floor in the Theory Center

Prerequisites#

Students should be comfortable with undergraduate probability and statistics as well as multivariate calculus and linear algebra. This course will emphasize implementing models and algorithms, so coding proficiency with Python is required. (HW0: Python Primer will help you get up to speed.)

Books#

This course will draw from a few textbooks:

Agresti, Alan. Categorical Data Analysis, 2nd edition. John Wiley & Sons, 2002. link
Gelman, Andrew, et al. Bayesian Data Analysis, 3rd edition. Chapman and Hall/CRC, 2013. link
Bishop, Christopher. Pattern Recognition and Machine Learning. Springer, 2006. link

We will also cover material from research papers.

Schedule#

Please note that this is a tentative schedule. It may change slightly depending on our pace.

Date	Topic	Slides	Additional Reading
Mon, Jan 6, 2025	Basics of Probability and Statistics and Contingency Tables HW0 Released	download	[Agr02] Ch. 1-3
Wed, Jan 8, 2025	Logistic Regression	download	[Agr02] Ch. 4-5
Fri, Jan 10, 2025	HW0 Due
Mon, Jan 13, 2025	Exponential Families HW1 Released	download	[Agr02] Ch. 4-5
Wed, Jan 25, 2025	Generalized Linear Models	download	[Agr02] Ch. 6
Mon, Jan 20, 2025	MLK Day. No class
Wed, Feb 22, 2025	Sparse GLMs	download	[FHT10] and [LSS14]
Fri, Jan 24, 2025	HW1 Due
Mon, Jan 27, 2025	Bayesian Inference HW2 Released	download	[GCS+95] Ch. 1
Wed, Jan 29, 2025	Markov Chain Monte Carlo and Bayesian GLM Demo	download
Mon, Feb 3, 2025	Variational Inference	download	[BKM17]
Wed, Feb 5, 2025	Midterm Exam from 1:30-2:50pm in MCCULL 115.	download download
Mon, Feb 10, 2025	Mixture Models and EM	download	[Bis06] Ch. 9
Wed, Feb 12, 2025	Hidden Markov Models HW2 Due; HW3 Released	download	[Bis06] Ch. 13
Mon, Feb 17, 2025	Presidents’ Day. No class
Wed, Feb 19, 2025	Linear Gaussian Latent Variable Models	download
Mon, Feb 24, 2025	Variational Autoencoders HW3 Due; HW4 Released	download	[KW19] Ch.1-2
Wed, Feb 26, 2025	Transformers	download	[Tur23]
Mon, Mar 3, 2025	Recurrent Neural Networks	download	[GBC16] Ch 9 [SWL23] and [GD23]
Wed, Mar 5, 2025	Denoising Diffusion Models	download	[TDM+24]
Mon, Mar 10, 2025	Poisson Processes	download
Wed, Mar 12, 2025	Discrete Denoising Diffusion Models		[CBDB+22]
Fri, Mar 14, 2025	HW4 Due

Assignments#

There will be 5 assignments due roughly every other Friday. They will not be equally weighted. The first one is just a primer to get you up to speed; the last one will be a bit more substantial than the rest.

Homework 0: Python Primer
- Released Mon, Jan 6, 2025
- Due Fri, Jan 10, 2025 at 11:59pm
Homework 1: Logistic Regression
- Released Mon, Jan 13, 2025
- Due Fri, Jan 24, 2025 at 11:59pm
Homework 2: Bayesian GLMs
- Released Wed, Jan 29, 2025
- Due Wed, Feb 12, 2025 at 11:59pm
Homework 3: Hidden Markov Models
- Released Wed, Feb 12, 2025
- Due Mon, Feb 24, 2025 at 11:59pm
Homework 4: Large Language Models
- Released Mon, Feb 24, 2025
- Due Fri, Mar 14, 2025 at 11:59pm

Late Policy#

We will allow 5 late days to be used as needed throughout the quarter.

Exams#

Midterm Exam: Wed, Feb. 5 from 1:30-2:50pm in MCCULL 115
- You may bring a cheat sheet covering one side of an 8.5x11” piece of paper
- Practice Exam: download
- Practice Exam Solutions: download
- We will provide a reference of common distributions: download
Final Exam: Wed, Mar 19 from 3:30-6:30pm in Building 370, Room 370
- You may bring a cheat sheet covering both sides of an 8.5x11” piece of paper
- Practice Exam: download
- Practice Exam Solutions: download

Grading#

Tentatively:

Assignment	Percentage
HW 0	5%
HW 1-3	15% each
HW 4	20%
Midterm	10%
Final	15%
Participation	5%