{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# HW2: Bayesian GLMs\n", "\n", "In this assignment, you will develop Bayesian inference algorithms for generalized linear (mixed) models (GL(M)Ms). We'll put football aside and focus on another timely subject: US presidential elections. We have downloaded data from the [MIT Election Data Science Lab](https://electionlab.mit.edu/data) consisting of presidential votes for each county in the US from 2000-2020. We have also downloaded demographic covariates from 2018 for each county. In this assignment, you will develop models to predict county-level votes given demographic data. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "!pip install jaxtyping\n", "!pip install kaleido" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/scott/anaconda3/lib/python3.10/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] } ], "source": [ "import json\n", "import torch\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import plotly.express as px\n", "\n", "from fastprogress import progress_bar\n", "from IPython.display import Image\n", "from jaxtyping import Float, Array\n", "from urllib.request import urlopen\n", "from torch.distributions import Binomial, Gamma, MultivariateNormal, Normal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load the vote data\n", "\n", "_Note: we did a little preprocessing of the raw data from the MIT Election Data Science Lab to extract counties that have both election and demographic data._\n", "\n", "The vote data consists of votes per candidate for each county in the US for presidential elections from 2000 to 2020. Each county is identified by its [FIPS code](https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt). The FIPS code is a five-digit code whose first two digits identify the state and whose last three digits identify the county within the state. We represent the FIPS code as a string since some codes start with zero.\n", "\n", "**Note**: Broomfield County, CO (FIPS code 08014) did not exist in 2000.\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | year | state | state_po | county_name | fips | candidate | party | totalvotes | candidatevotes |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000 | ALABAMA | AL | AUTAUGA | 01001 | AL GORE | DEMOCRAT | 17208 | 4942 |
| 1 | 2000 | ALABAMA | AL | AUTAUGA | 01001 | GEORGE W. BUSH | REPUBLICAN | 17208 | 11993 |
| 2 | 2000 | ALABAMA | AL | AUTAUGA | 01001 | OTHER | OTHER | 17208 | 113 |
| 3 | 2000 | ALABAMA | AL | AUTAUGA | 01001 | RALPH NADER | GREEN | 17208 | 160 |
| 4 | 2000 | ALABAMA | AL | BALDWIN | 01003 | AL GORE | DEMOCRAT | 56480 | 13997 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 63206 | 2020 | WYOMING | WY | WASHAKIE | 56043 | OTHER | OTHER | 4032 | 71 |
| 63207 | 2020 | WYOMING | WY | WESTON | 56045 | DONALD J TRUMP | REPUBLICAN | 3560 | 3107 |
| 63208 | 2020 | WYOMING | WY | WESTON | 56045 | JO JORGENSEN | LIBERTARIAN | 3560 | 46 |
| 63209 | 2020 | WYOMING | WY | WESTON | 56045 | JOSEPH R BIDEN JR | DEMOCRAT | 3560 | 360 |
| 63210 | 2020 | WYOMING | WY | WESTON | 56045 | OTHER | OTHER | 3560 | 47 |

63211 rows × 9 columns
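
The `fips` column above keeps its leading zeros (for example `01001`) only because the codes are stored as strings. The following is a minimal sketch of that convention, assuming a code may arrive either as an integer or as a string; the helper names `normalize_fips` and `split_fips` are hypothetical and are not part of the assignment code.

```python
def normalize_fips(code) -> str:
    """Return a county FIPS code as a zero-padded five-character string.

    Hypothetical helper for illustration: accepts an int or a string of digits.
    """
    text = str(code).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {code!r}")
    return text.zfill(5)


def split_fips(fips) -> tuple[str, str]:
    """Split a county FIPS code into (state part, county part)."""
    fips = normalize_fips(fips)
    # First two digits identify the state, last three the county.
    return fips[:2], fips[2:]


# Reading the codes as integers would drop the leading zero (01001 -> 1001);
# normalizing restores the five-character form.
codes = [1001, "01003", 8014, 56045]
normalized = [normalize_fips(c) for c in codes]
print(normalized)           # ['01001', '01003', '08014', '56045']
print(split_fips("08014"))  # ('08', '014')
```

When loading such data from a file, the same effect is usually achieved by asking the reader to parse the column as text rather than as a number, so that no padding step is needed afterwards.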

|  | state | county | trump16 | clinton16 | otherpres16 | romney12 | obama12 | otherpres12 | demsen16 | repsen16 | ... | age65andolder_pct | median_hh_inc | clf_unemploy_pct | lesshs_pct | lesscollege_pct | lesshs_whites_pct | lesscollege_whites_pct | rural_pct | ruralurban_cc | fips |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Alabama | Autauga | 18172 | 5936 | 865 | 17379 | 6363 | 190 | 6331.0 | 18220.0 | ... | 13.978456 | 53099.0 | 5.591657 | 12.417046 | 75.407229 | 10.002112 | 74.065601 | 42.002162 | 2.0 | 01001 |
| 1 | Alabama | Baldwin | 72883 | 18458 | 3874 | 66016 | 18424 | 898 | 19145.0 | 74021.0 | ... | 18.714851 | 51365.0 | 6.286843 | 9.972418 | 70.452889 | 7.842227 | 68.405607 | 42.279099 | 3.0 | 01003 |
| 2 | Alabama | Barbour | 5454 | 4871 | 144 | 5550 | 5912 | 47 | 4777.0 | 5436.0 | ... | 16.528895 | 33956.0 | 12.824738 | 26.235928 | 87.132213 | 19.579752 | 81.364746 | 67.789635 | 6.0 | 01005 |
| 3 | Alabama | Bibb | 6738 | 1874 | 207 | 6132 | 2202 | 86 | 2082.0 | 6612.0 | ... | 14.885699 | 39776.0 | 7.146827 | 19.301587 | 88.000000 | 15.020490 | 87.471774 | 68.352607 | 1.0 | 01007 |
| 4 | Alabama | Blount | 22859 | 2156 | 573 | 20757 | 2970 | 279 | 2980.0 | 22169.0 | ... | 17.192916 | 46212.0 | 5.953833 | 19.968585 | 86.950243 | 16.643368 | 86.163610 | 89.951502 | 1.0 | 01009 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3106 | Wyoming | Sweetwater | 12154 | 3231 | 1745 | 11428 | 4774 | 693 | NaN | NaN | ... | 9.417120 | 68233.0 | 5.072255 | 9.314606 | 78.628507 | 6.238463 | 76.606813 | 10.916313 | 5.0 | 56037 |
| 3107 | Wyoming | Teton | 3921 | 7314 | 1392 | 4858 | 6213 | 393 | NaN | NaN | ... | 11.837510 | 75594.0 | 2.123447 | 4.633570 | 46.211584 | 1.526877 | 41.769504 | 46.430920 | 7.0 | 56039 |
| 3108 | Wyoming | Uinta | 6154 | 1202 | 1114 | 6615 | 1628 | 296 | NaN | NaN | ... | 10.678218 | 53323.0 | 6.390755 | 10.361224 | 81.793082 | 8.806312 | 81.080852 | 43.095937 | 7.0 | 56041 |
| 3109 | Wyoming | Washakie | 2911 | 532 | 371 | 3014 | 794 | 136 | NaN | NaN | ... | 19.650341 | 46212.0 | 7.441860 | 12.577108 | 78.923920 | 10.299738 | 75.980688 | 35.954529 | 7.0 | 56043 |
| 3110 | Wyoming | Weston | 3033 | 299 | 194 | 2821 | 422 | 116 | NaN | NaN | ... | 18.355401 | 55640.0 | 3.610949 | 8.592392 | 81.193281 | 7.342144 | 81.141179 | 54.536626 | 7.0 | 56045 |

3111 rows × 39 columns
\n", "