Homework 7: Parameter Estimation

Instructions

Exercise 1: MLE and MAP (10 points)

Fix a model \(M\) with a single parameter \(\theta\) and an uninformative prior over some finite interval \([a,b]\), so that \(\theta\) is uniformly distributed with \(\theta \sim \mathrm{Uniform}(a,b)\). Show that the frequentist MLE is identical to the Bayesian MAP estimate.
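As a starting point (this is a hint, not the full proof), write out the definition of the MAP estimate and expand the posterior with Bayes' rule:

```latex
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta \in [a,b]} P(\theta \mid D)
  = \arg\max_{\theta \in [a,b]} \frac{P(D \mid \theta)\, P(\theta)}{P(D)}
```

It then remains to argue why, under the uniform prior, both \(P(\theta)\) and \(P(D)\) can be dropped from the \(\arg\max\), leaving exactly the maximization problem that defines the MLE.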

Exercise 2: Linear regression with greta (40 points)

In this exercise we will test whether there is plausibly a negative effect of (the log of) total_amount_sold on average_price in the avocado data set. The model used is shown in the figure below, although we will add specific priors for the Bayesian analysis.

We will use MLEs and Bayesian estimation. To achieve the latter, we will implement the model in greta and look at the posterior estimates of the slope parameter \(\beta_1\), comparing it against the null value \(\beta_1=0\) (representing no linear relationship).

Load and preprocess the data (2 points)

Load the data into a variable avocado_data and preprocess it, using the code from the course notes (e.g., in Section 5.2.1).
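For orientation, a minimal sketch of what this step could look like; the file name and the exact preprocessing steps are assumptions here — follow the code in Section 5.2.1 of the course notes:

```r
library(tidyverse)

# file name is an assumption; see Section 5.2.1 for the actual preprocessing
avocado_data <- read_csv("avocado.csv") %>%
  # keep only the variables used in this exercise
  select(total_amount_sold, average_price)
```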

Plot the data (4 points)

Make a scatterplot with the logarithm of total_amount_sold on the \(x\)-axis and average_price on the \(y\)-axis. Add a linear regression line (without any grouping), using the function geom_smooth. Your output could look roughly like this:
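A sketch of the plotting code, assuming the preprocessed data lives in avocado_data:

```r
avocado_data %>%
  ggplot(aes(x = log(total_amount_sold), y = average_price)) +
  geom_point(alpha = 0.3) +
  # method = "lm" forces a single straight regression line
  geom_smooth(method = "lm")
```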

Find the MLE (6 points)

Find the best-fitting parameter values of the model, i.e., the best values for \(\beta_0\) (= intercept), \(\beta_1\) (= slope) and \(\sigma\) (= standard deviation), for the model’s likelihood function. Use the optim function for this purpose. The course notes have very relevant code in Section 9.5.1, but you need to make some adjustments to that code. Here, we want to regress average_price on the logarithm of total_amount_sold. Finish your calculation with a clear statement or code output that identifies the estimated value of each parameter. A (wrong!) example of how your answer should look is given below.

## # A tibble: 3 x 2
##   parameter          value
##   <chr>              <dbl>
## 1 intercept              0
## 2 slope                  8
## 3 standard deviation    15
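One way to set this up (a sketch; variable names like fit are our own choices, and the starting values are naive guesses):

```r
# negative log-likelihood; par = c(intercept, slope, sigma)
nll <- function(par, x, y) {
  if (par[3] <= 0) return(Inf)   # sigma must be positive
  -sum(dnorm(y, mean = par[1] + par[2] * x, sd = par[3], log = TRUE))
}

fit <- optim(
  par = c(0, 0, 1),              # naive starting values
  fn  = nll,
  x   = log(avocado_data$total_amount_sold),
  y   = avocado_data$average_price
)

tibble(
  parameter = c("intercept", "slope", "standard deviation"),
  value     = fit$par
)
```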

Plot a second regression line (6 points)

Use the plot from above, where you added a regression line with geom_smooth, but now add a second regression line (in a different color) as the final layer of the plot. Use geom_abline, which takes as argument an aesthetic aes(slope = ..., intercept = ...), and supply the MLEs for slope and intercept. (Hint: after doing this, you should, as was promised earlier in the course, understand exactly what geom_smooth is doing when called with method lm (= linear model).)
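A sketch of the final layer, assuming the optim result is stored in a variable fit (so fit$par holds the MLEs):

```r
avocado_data %>%
  ggplot(aes(x = log(total_amount_sold), y = average_price)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm") +
  # second regression line from the MLEs, in a different color
  geom_abline(
    aes(intercept = fit$par[1], slope = fit$par[2]),
    color = "firebrick"
  )
```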

Implement the model with greta (6 points)

Complete the code below to obtain an implementation in greta.

# select data to use
price     <- as_data(FILL_ME)
log_sold  <- as_data(FILL_ME)
# latent variables and priors
intercept <- student(df = 1, mu = 0, sigma = 10)
slope     <- student(df = 1, mu = 0, sigma = 10)
sigma     <- normal(0, 5, truncation = c(0, Inf))
# derived latent variable (linear model)
mean <- FILL_ME
# likelihood 
FILL_ME <- normal(mean, sigma)
# finalize model, register which parameters to monitor
m <- model(intercept, slope, sigma)

Obtain samples (4 points)

Draw 2000 samples from each of 4 chains, after a warmup of 1000 using greta::mcmc.
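A sketch of the sampling call, assuming the finalized model is stored in m as above:

```r
draws <- greta::mcmc(
  m,
  n_samples = 2000,   # samples per chain, after warmup
  warmup    = 1000,
  chains    = 4
)
```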

Wrangle, summarize and interpret (6 points)

Get a tidy version of your samples using the function ggmcmc::ggs and produce Bayesian point and interval estimates (95% credible intervals, a.k.a. HDIs) for all parameters. Interpret the results with respect to our research question of whether the log amount sold has a linear effect on the average price.
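A sketch of the wrangling step, assuming the samples are stored in draws; the HDI bounds below use the HDInterval package (quantile-based credible intervals would be a reasonable alternative):

```r
library(ggmcmc)

tidy_draws <- ggs(draws)

tidy_draws %>%
  group_by(Parameter) %>%
  summarise(
    lower = HDInterval::hdi(value)[1],   # lower bound of 95% HDI
    mean  = mean(value),
    upper = HDInterval::hdi(value)[2]    # upper bound of 95% HDI
  )
```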

Point estimates under improper priors in greta (6 points)

Rerun the greta model code above, but use improper priors, i.e., do not specify any prior distribution for the parameters; just declare that these are variables, like so:

# latent variables and improper priors
intercept <- variable()
slope     <- variable()
sigma     <- variable(lower = 0)

Define the rest of the model (and the data) as before. Instead of calling greta::mcmc, call the function greta::opt on model m. Inspect and explain the output. What are the values returned by this function? Have you seen them before?
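A sketch of the call; greta::opt returns (among other things) a list entry par with the optimized parameter values:

```r
fit_opt <- greta::opt(m)
fit_opt$par   # point estimates for intercept, slope and sigma
```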

Exercise 3: Analyzing the King of France (22 points)

Let’s find out whether the King of France data gives any reason to believe that sentences like “The King of France is bald.” (Condition 0) tend to be judged differently from sentences of the form “France has a King, and he is bald.” (Condition 1). Remember that this question is theoretically interesting because the first sentence presupposes that there is a King of France, while the second asserts this information explicitly. Does falsely presupposing cause different truth-value judgements than falsely asserting?

Get the data (6 points)

Load the cleaned KoF data and extract the relevant counts: the number \(k_0\) of “true” responses out of \(N_0\) total observations for Condition 0, and the same for Condition 1.
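A sketch of the extraction, assuming the cleaned data is in kof_data; the column names condition and response and the coding of “true” answers are assumptions — adjust them to the actual data set:

```r
k_0 <- kof_data %>% filter(condition == 0, response == "TRUE") %>% nrow()
N_0 <- kof_data %>% filter(condition == 0) %>% nrow()
k_1 <- kof_data %>% filter(condition == 1, response == "TRUE") %>% nrow()
N_1 <- kof_data %>% filter(condition == 1) %>% nrow()
```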

Define a greta model to compare latent biases (10 points)

Adapt the T-Test model given in the course notes to apply to this case. Instead of two means, represent two biases \(\theta_0\) and \(\theta_1\). Use the version of the T-Test model which does not define a prior on the difference. Rather, use the following priors: \(\theta_0, \theta_1 \sim \text{Beta}(1,1)\). Calculate \(\delta = \theta_0 - \theta_1\) inside the greta model.
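A sketch of the model, assuming the counts k_0, N_0, k_1, N_1 from the previous step:

```r
# observed counts as greta data
k0 <- as_data(k_0)
k1 <- as_data(k_1)

# latent biases with flat Beta(1, 1) priors
theta_0 <- beta(1, 1)
theta_1 <- beta(1, 1)

# derived quantity of interest
delta <- theta_0 - theta_1

# likelihood: binomial counts of "true" responses
distribution(k0) <- binomial(N_0, theta_0)
distribution(k1) <- binomial(N_1, theta_1)

m_kof <- model(theta_0, theta_1, delta)
```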

Sample and interpret (6 points)

Take 2000 samples after a warmup of 1000, from 4 chains. Look at Bayesian 95% credible intervals for the parameter \(\delta\) and interpret the result with respect to the research question of interest (see above).