Lab 0: PyTorch Primer#
We’ll use Python and PyTorch for the labs in this course. This lab will help you get up to speed. It will introduce:
Tensors: PyTorch’s equivalent of NumPy arrays, but with more bells and whistles for running on GPUs and supporting automatic differentiation.
Broadcasting and Fancy Indexing: If you’re coming from MATLAB or NumPy, you probably know that you can avoid costly for-loops by broadcasting computation over dimensions of an array (here, tensor) and using fancy indexing tricks.
Distributions: PyTorch has an excellent library of distributions for sampling, evaluating log probabilities, and much more.
We’ll introduce these concepts in the context of the Poisson mixture model from class (cf. Probabilistic Modeling).
import torch
import torch.distributions as dist
import matplotlib.pyplot as plt
1. Constructing Tensors#
Tensors are PyTorch’s equivalent of NumPy arrays. The PyTorch documentation already has a great tutorial on tensors. Rather than reinvent the wheel, please start by reading that.
Once you’ve read through that, try using torch functions like `arange`, `reshape`, etc. to construct the following tensors.
Problem 1.1#
Construct the following tensor:
tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])
Note: For this problem and the ones below, don’t literally construct the tensor from the specified list. Use torch functions.
# YOUR CODE HERE
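One possible solution, for reference (not the only correct answer):
# Build a flat range of nine integers, then reshape into a 3x3 matrix.
torch.arange(9).reshape(3, 3)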
Problem 1.2#
Construct the following tensor:
tensor([[0, 3, 6],
        [1, 4, 7],
        [2, 5, 8]])
# YOUR CODE HERE
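This is the transpose of the previous tensor, so one possible solution is:
# Reshape as before, then transpose.
torch.arange(9).reshape(3, 3).T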
Problem 1.3#
Construct the following tensor:
tensor([0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4])
Note: Here the sequence is repeated 3 times. Does your code support arbitrary numbers of repeats?
# YOUR CODE HERE
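One possible solution, for reference:
# `repeat` tiles the sequence; its argument controls the number of repeats.
torch.arange(5).repeat(3)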
Problem 1.4#
Construct the following tensor:
tensor([[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]])
# YOUR CODE HERE
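One possible solution, for reference:
# Repeat 3 times along a new leading dimension and once along the columns.
torch.arange(5).repeat(3, 1)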
Problem 1.5#
Construct the following tensor:
tensor([[ 1., -2.,  0.,  0.],
        [-2.,  1., -2.,  0.],
        [ 0., -2.,  1., -2.],
        [ 0.,  0., -2.,  1.]])
# YOUR CODE HERE
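One sketch using diagonal constructors, for reference:
# Identity for the main diagonal; ones shifted to the super- and
# sub-diagonals supply the -2 bands.
torch.eye(4) - 2 * (torch.diag(torch.ones(3), 1) + torch.diag(torch.ones(3), -1))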
Problem 1.6#
Construct the following tensor:
tensor([[[[0, 1, 2]]]])
# YOUR CODE HERE
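One possible solution, for reference:
# Reshape to add three leading singleton dimensions.
torch.arange(3).reshape(1, 1, 1, 3)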
2. Broadcasting and Fancy Indexing#
Your life will be much easier and your code will be much faster once you get the hang of broadcasting and indexing. Start by reading the PyTorch documentation.
Problem 2.1#
Construct a tensor `X` where `X[i,j] = i + j` by broadcasting a sum of two 1-dimensional tensors.
For example, broadcast a sum to construct the following tensor,
tensor([[0, 1, 2],
        [1, 2, 3],
        [2, 3, 4],
        [3, 4, 5]])
# YOUR CODE HERE
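One possible solution, for reference:
# A column vector plus a row vector broadcasts to a (4, 3) matrix of sums.
torch.arange(4)[:, None] + torch.arange(3)[None, :]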
Problem 2.2#
Compute a distance matrix D
where D[i,j]
is the Euclidean distance between X[i]
and X[j]
, with
X = torch.arange(10, dtype=float).reshape(5, 2)
Your answer should be,
tensor([[ 0.0000,  2.8284,  5.6569,  8.4853, 11.3137],
        [ 2.8284,  0.0000,  2.8284,  5.6569,  8.4853],
        [ 5.6569,  2.8284,  0.0000,  2.8284,  5.6569],
        [ 8.4853,  5.6569,  2.8284,  0.0000,  2.8284],
        [11.3137,  8.4853,  5.6569,  2.8284,  0.0000]])
X = torch.arange(10, dtype=float).reshape(5, 2)
# YOUR CODE HERE
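One broadcasting approach, for reference:
# Pairwise differences via singleton dimensions give a (5, 5, 2) tensor;
# square, sum over the coordinate axis, and take the square root.
D = torch.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))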
Problem 2.3#
Extract the submatrix of rows `[2,3]` and columns `[0,1,4]` of the tensor,
A = torch.arange(25).reshape(5, 5)
Your answer should be,
tensor([[10, 11, 14],
        [15, 16, 19]])
A = torch.arange(25).reshape(5, 5)
# YOUR CODE HERE
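One possible solution, for reference:
# Broadcast a column of row indices against the column indices.
rows = torch.tensor([2, 3])
cols = torch.tensor([0, 1, 4])
A[rows[:, None], cols]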
Problem 2.4#
Create a binary mask matrix `M` of the same shape as `A` where `M[i,j]` is True if and only if `A[i,j]` is divisible by 7. Let
A = torch.arange(25).reshape(5, 5)
Your answer should be
tensor([[ True, False, False, False, False],
        [False, False,  True, False, False],
        [False, False, False, False,  True],
        [False, False, False, False, False],
        [False,  True, False, False, False]])
A = torch.arange(25).reshape(5, 5)
# YOUR CODE HERE
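One possible solution, for reference:
# Elementwise modulo, compared against zero.
M = A % 7 == 0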
Problem 2.5#
Add one to the entries in `A` that are divisible by 7. After updating in place, `A` should be,
tensor([[ 1,  1,  2,  3,  4],
        [ 5,  6,  8,  8,  9],
        [10, 11, 12, 13, 15],
        [15, 16, 17, 18, 19],
        [20, 22, 22, 23, 24]])
# YOUR CODE HERE
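One possible solution, for reference:
# Boolean mask indexing supports in-place updates.
A[A % 7 == 0] += 1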
3. Distributions#
PyTorch has an excellent library of distributions in `torch.distributions`. Read the docs here.
We will use these distribution objects to construct and fit a Poisson mixture model.
Problem 3.1#
Draw 50 samples from a Poisson distribution with rate 10.
# YOUR CODE HERE
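One possible solution, for reference:
# Construct the distribution object, then draw a batch of samples.
dist.Poisson(rate=10.0).sample((50,))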
Problem 3.2#
One of the awesome things about PyTorch distributions is that they support broadcasting too.
Construct a matrix `P` where `P[i,j]` equals \(\mathrm{Pois}(x=j; \lambda=i)\) for \(i=0,\ldots,4\) and \(j=0,\ldots,4\).
Your answer should be,
tensor([[1.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.3679, 0.3679, 0.1839, 0.0613, 0.0153],
        [0.1353, 0.2707, 0.2707, 0.1804, 0.0902],
        [0.0498, 0.1494, 0.2240, 0.2240, 0.1680],
        [0.0183, 0.0733, 0.1465, 0.1954, 0.1954]])
# YOUR CODE HERE
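One sketch, for reference (recent PyTorch versions handle the rate-zero row gracefully via `xlogy`):
# Broadcast a column of rates against a row of values, then exponentiate.
lambdas = torch.arange(5.0)  # rates 0, ..., 4
xs = torch.arange(5.0)       # values 0, ..., 4
P = dist.Poisson(lambdas[:, None]).log_prob(xs[None, :]).exp()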
Problem 3.3#
Evaluate the log probability of the points `[1.5, 3., 4.2]` under a gamma distribution with shape (aka concentration) 2.0 and inverse scale (aka rate) 1.5.
Your answer should be,
tensor([-1.0336, -2.5905, -4.0540])
# YOUR CODE HERE
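One possible solution, for reference:
# Gamma takes a concentration (shape) and a rate (inverse scale).
dist.Gamma(concentration=2.0, rate=1.5).log_prob(torch.tensor([1.5, 3.0, 4.2]))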
Problem 3.4#
Draw 1000 samples from a Poisson mixture model. Use `matplotlib.pyplot.hist` to plot a normalized histogram of the samples.
# YOUR CODE HERE
# data = ...
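A sketch of ancestral sampling, for reference. Note that the mixture weights and rates below are placeholders, not the model from the problem; substitute the values from the mixture model specified above.
# Draw each point's cluster, then its count given the cluster.
# NOTE: these weights and rates are illustrative placeholders.
probs = torch.tensor([0.5, 0.5])
rates = torch.tensor([2.0, 10.0])
z = dist.Categorical(probs).sample((1000,))   # cluster assignments
data = dist.Poisson(rates[z]).sample()        # counts given assignments
plt.hist(data, bins=30, density=True)
plt.xlabel("x")
plt.ylabel("density")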
4. MAP Estimation#
Problem 4.1#
Let `data` be the samples from above. Assume there are \(K = 2\) clusters and that the prior cluster probabilities \([\tfrac{1}{2}, \tfrac{1}{2}]\) are known. Complete the code below to perform MAP estimation.
def update_assignments(data, rates, probs):
    """Update the cluster assignments ($z$) given the data, rates,
    and cluster probabilities.

    Args:
        data: shape `(N,)` tensor of counts
        rates: shape `(K,)` tensor of nonnegative rates for each cluster
        probs: shape `(K,)` tensor of cluster probabilities

    Returns:
        assignments: shape `(N,)` tensor of integer cluster assignments
    """
    ##
    # YOUR CODE HERE
    ##
    return assignments
def update_rates(data, assignments, shape=1.0, inv_scale=1.0):
    """Update the rates for each cluster under a gamma prior.

    Args:
        data: shape `(N,)` tensor of counts
        assignments: shape `(N,)` tensor of integer cluster assignments
        shape: shape (aka concentration) of the gamma prior. Defaults to 1.0.
        inv_scale: inverse scale (aka rate) of the gamma prior. Defaults to 1.0.

    Returns:
        rates: shape `(K,)` tensor of updated rates for each cluster
    """
    ##
    # YOUR CODE HERE
    ##
    return rates
def log_joint(data, assignments, rates, probs, shape=1.0, inv_scale=1.0):
    """Compute the log joint probability of the data, assignments,
    and rates under the mixture model.

    Args:
        data: shape `(N,)` tensor of counts
        assignments: shape `(N,)` tensor of integer cluster assignments
        rates: shape `(K,)` tensor of rates for each cluster
        probs: shape `(K,)` tensor of cluster probabilities
        shape: shape (aka concentration) of the gamma prior. Defaults to 1.0.
        inv_scale: inverse scale (aka rate) of the gamma prior. Defaults to 1.0.

    Returns:
        lp: scalar log joint probability under the mixture model
    """
    ##
    # YOUR CODE HERE
    ##
    return lp
# Run coordinate ascent for some number of iterations, starting
# with random cluster assignments
probs = torch.ones(2) / 2.0
assignments = torch.randint(0, 2, data.shape)
rates = 10 * torch.rand(2)

lps = []
for i in range(20):
    lps.append(log_joint(data, assignments, rates, probs))
    rates = update_rates(data, assignments)
    assignments = update_assignments(data, rates, probs)

plt.plot(lps)
plt.xlabel("iteration")
plt.ylabel("log joint probability")
print("estimated rates:", rates)
Problem 4.2 (Bonus)#
Now consider a more general model in which the prior cluster probabilities \(\boldsymbol{\pi}\) are unknown. (Above, we assumed they were known to be \(\boldsymbol{\pi} = [\tfrac{1}{2}, \tfrac{1}{2}]\).) Derive and implement a coordinate ascent algorithm for MAP estimation of \(\mathbf{z}_{\mathsf{MAP}}\), \(\boldsymbol{\lambda}_{\mathsf{MAP}}\), and \(\boldsymbol{\pi}_{\mathsf{MAP}}\).
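A hint for the extra coordinate update, as a sketch: if you place a symmetric Dirichlet prior on \(\boldsymbol{\pi}\) (an assumption, since the prior is not specified here), the update given the assignments is the mode of the Dirichlet posterior. With concentration alpha = 1 this reduces to the fraction of points in each cluster.
# A sketch of the additional update, assuming a symmetric
# Dirichlet(alpha) prior on the cluster probabilities.
def update_probs(assignments, alpha=1.0, K=2):
    # Mode of Dir(alpha + N_1, ..., alpha + N_K); with alpha = 1 this is N_k / N.
    Nk = torch.bincount(assignments, minlength=K)
    return (Nk + alpha - 1) / (Nk.sum() + K * (alpha - 1))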