References
Alan Agresti. Categorical Data Analysis. Volume 792. John Wiley & Sons, 2002. URL: https://onlinelibrary.wiley.com/doi/book/10.1002/0471249688.
James H Albert and Siddhartha Chib. Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422):669–679, 1993.
David J Aldous. Representations for partially exchangeable arrays of random variables. Journal of Multivariate Analysis, 11(4):581–598, 1981.
Christopher Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
David M Blei. Build, compute, critique, repeat: Data analysis with latent variable models. Annual Review of Statistics and Its Application, 1:203–232, 2014.
Joseph K Blitzstein and Jessica Hwang. Introduction to Probability. CRC Press, 2019.
Benjamin Bloem-Reddy and Peter Orbanz. Random-walk models of network formation and sequential Monte Carlo methods for graphs. Journal of the Royal Statistical Society Series B: Statistical Methodology, 80(5):871–898, 2018.
Stephen P Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
Andrew Campbell, Joe Benton, Valentin De Bortoli, Thomas Rainforth, George Deligiannidis, and Arnaud Doucet. A continuous time framework for discrete denoising models. Advances in Neural Information Processing Systems, 35:28266–28279, 2022.
Bradley Efron. Exponential Families in Theory and Practice. Cambridge University Press, 2022.
Jerome Friedman, Trevor Hastie, and Rob Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1–22, 2010.
Andrew Gelman, John B Carlin, Hal S Stern, David B Dunson, Aki Vehtari, and Donald B Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, third edition, 2013.
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org.
Albert Gu and Tri Dao. Mamba: linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023.
Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
Peter Hoff. Modeling homophily and stochastic equivalence in symmetric relational data. Advances in Neural Information Processing Systems, 2007.
Peter D Hoff, Adrian E Raftery, and Mark S Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090–1098, 2002.
Douglas Hoover. Relations on probability spaces and arrays of random variables. Technical Report, Institute for Advanced Study, Princeton, NJ, 1979.
Diederik P Kingma and Max Welling. An introduction to variational autoencoders. Foundations and Trends® in Machine Learning, 12(4):307–392, 2019.
Jason D Lee, Yuekai Sun, and Michael A Saunders. Proximal Newton-type methods for minimizing composite functions. SIAM Journal on Optimization, 24(3):1420–1443, 2014.
Ann C McKee, Jesse Mez, Bobak Abdolmohammadi, Morgane Butler, Bertrand Russell Huber, Madeline Uretsky, Katharine Babcock, Jonathan D Cherry, Victor E Alvarez, Brett Martin, and others. Neuropathologic and clinical findings in young contact sport athletes exposed to repetitive head impacts. JAMA Neurology, 80(10):1037–1050, 2023.
Peter Orbanz and Daniel M Roy. Bayesian models of graphs, arrays and other exchangeable random structures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(2):437–461, 2015.
Nicholas G Polson, James G Scott, and Jesse Windle. Bayesian inference for logistic models using Pólya–gamma latent variables. Journal of the American Statistical Association, 108(504):1339–1349, 2013.
Jimmy T.H. Smith, Andrew Warrington, and Scott Linderman. Simplified state space layers for sequence modeling. In The Eleventh International Conference on Learning Representations, 2023.
Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, pages 2256–2265. PMLR, 2015.
Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in Neural Information Processing Systems, 2019.
Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020.
Tijmen Tieleman. Training restricted Boltzmann machines using approximations to the likelihood gradient. In Proceedings of the 25th International Conference on Machine Learning, pages 1064–1071, 2008.
Richard E Turner. An introduction to transformers. arXiv preprint arXiv:2304.10557, 2023.
Richard E Turner, Cristiana-Diana Diaconu, Stratis Markou, Aliaksandra Shysheya, Andrew YK Foong, and Bruno Mlodozeniec. Denoising diffusion probabilistic models in six simple steps. arXiv preprint arXiv:2402.04384, 2024.
Jesse Windle, Nicholas G Polson, and James G Scott. Sampling Pólya–gamma random variates: alternate and approximate techniques. arXiv preprint arXiv:1405.0506, 2014.