This homework assignment is due on May 26th, 2017, before class. Submit your results in a zipped archive named BDA+CM_HW2_YOURLASTNAME.zip, which includes both the Rmarkdown source and a compiled version (preferably HTML). Use the same naming scheme for the *.Rmd and *.html files. All other files, if needed, should start with BDA+CM_HW2_YOURLASTNAME_ as well. Upload the archive to the Dropbox folder.
Keep your descriptions and answers as short and concise as possible, without resorting to bullet points. All of the exercises below are required and count equally toward the final score.
A prominent method of parameter inference is maximum likelihood estimation (MLE). MLE looks at the observed data \(D\) through a likelihood function \(P(D \mid \theta)\), which gives the probability of the data under each value of the parameter \(\theta\). We then estimate the best-fitting parameter value \(\theta^*\) as the one that maximizes the likelihood:
\[ \theta^* = \arg\max_\theta P(D \mid \theta) \]
MLE is used, for example, to obtain point estimates for regression coefficients and to compare models with measures such as AIC (more on this later in the course).
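To make this concrete, here is the standard derivation for the Binomial case with generic counts (not part of the exercise): for \(k\) successes in \(n\) flips, the likelihood is

\[ P(D \mid \theta) = \binom{n}{k}\, \theta^k (1-\theta)^{n-k} \]

Maximizing the log-likelihood by setting its derivative to zero gives

\[ \frac{d}{d\theta}\left( k \log \theta + (n-k)\log(1-\theta) \right) = \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0 \quad\Rightarrow\quad \theta^* = \frac{k}{n} \]

so the maximum likelihood estimate under a Binomial likelihood is simply the observed proportion of successes.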
Suppose Jones draws a coin from her pocket and flips it once. It’s a success. Let’s assume that the relevant likelihood function is the Binomial distribution.
The lectures used a homebrew (and hacky) version of the Metropolis-Hastings algorithm. Here is a version of the code with some changes.
library('coda')

# generate fake data
fakeData = rnorm(100, mean = 0.5, sd = 1)

# target function to approximate:
# non-normalized posterior: prior times likelihood
f = function(mu, sigma){
  if (sigma <= 0){
    return(0)
  }
  priorMu = dunif(mu, min = -3.5, max = 5)
  priorSigma = dunif(sigma, min = 0, max = 4)
  likelihood = prod(dnorm(fakeData, mean = mu, sd = sigma))
  return(priorMu * priorSigma * likelihood)
}
MH = function(f, iterations = 50, chains = 2, burnIn = 0){
  out = array(0, dim = c(chains, iterations - burnIn, 2))
  dimnames(out) = list("chain" = 1:chains,
                       "iteration" = 1:(iterations - burnIn),
                       "variable" = c("mu", "sigma"))
  for (c in 1:chains){
    mu = runif(1, min = -4, max = 4)
    sigma = runif(1, min = 0, max = 4)
    for (i in 1:iterations){
      muNext = mu + rnorm(1, mean = 0.5, sd = 1)
      sigmaNext = sigma + runif(1, min = -0.25, max = 0.25)
      rndm = runif(1, 0, 1)
      if (f(mu, sigma) < f(muNext, sigmaNext) | f(muNext, sigmaNext) >= f(mu, sigma) * rndm) {
        mu = muNext
        sigma = sigmaNext
      }
      if (i >= burnIn){
        out[c, i - burnIn, 1] = mu
        out[c, i - burnIn, 2] = sigma
      }
    }
  }
  return(mcmc.list(mcmc(out[1,,]), mcmc(out[2,,])))
}
Use set.seed(1789) (so that we all get the same samples and results; do so before generating your fakeData). Collect 2 chains of 50,000 samples each after a burn-in of 10,000 using the fixed code. (If you failed to solve the previous exercise, use the code from the lecture, provided in the Rmarkdown for the slides of lectures 6 and 7.)
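Assuming your fixed MH function keeps the signature above, the collection step and a few convergence checks from the coda package might look like this (the diagnostic calls are a suggestion, not prescribed by the exercise):

```r
# 60,000 iterations minus a burn-in of 10,000 leaves 50,000 samples per chain
samples = MH(f, iterations = 60000, chains = 2, burnIn = 10000)

# coda diagnostics on the resulting mcmc.list
summary(samples)        # posterior means, SDs, and quantiles for mu and sigma
gelman.diag(samples)    # R-hat: values close to 1 suggest the chains converged
effectiveSize(samples)  # effective sample size, accounting for autocorrelation
```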
mu and sigma.

Suppose we have observed a vector \(\vec{x}\) and a vector \(\vec{y}\) of data, which we are fairly certain come from normal distributions, but whose means and standard deviations we do not know. We are interested in whether the two means could be the same.
Here’s our data.
set.seed(2017)
x_obs = rnorm(50,0,1)
y_obs = rnorm(25,0.3,1.5)
A standard statistical test to assess whether means are equal would be:
t.test(x_obs,y_obs)
##
## Welch Two Sample t-test
##
## data: x_obs and y_obs
## t = -0.75412, df = 35.46, p-value = 0.4558
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.9679298 0.4434164
## sample estimates:
## mean of x mean of y
## 0.08248896 0.34474567