Fifth Homework: Probability

Instructions

If you need help, consult the suggested readings from the lecture, the cheat sheets, and the built-in help in R.
Create an Rmd file with the matriculation number of each group member (matching your Stud.IP group) in the ‘author’ heading and answer the following questions.
When all answers are ready, ‘Knit’ the document to produce an HTML file.
Create a ZIP archive called “IDA_HW5-Group-XYZ.zip” (where ‘XYZ’ is your group) containing:
- an R Markdown file “IDA_HW5-Group-XYZ.Rmd”
- a knitted HTML document “IDA_HW5-Group-XYZ.html”
Upload the ZIP archive on Stud.IP in your group folder before the deadline. You may upload as many times as you like before the deadline; only your final submission will count.
Please do not suppress the code in the HTML output.
Include an R code chunk in your Rmarkdown file (the preamble) in which you set the following global options for the document, and set the options for this code chunk to echo = F (so as not to have it show up in your output):

Exercise 1: Another flip-and-draw scenario (14 points)

Consider a flip-and-draw scenario similar to one we discussed in class. First, we flip a biased coin that lands heads with probability 0.75. If we observe heads, we draw from urn 1, otherwise from urn 2. The contents of the urns are:

urn 1: 6 white, 3 black, 1 red
urn 2: 2 white, 5 black, 3 red

Joint probability table (4 points)

Calculate and write down (for example, as a table in R Markdown) the joint probability table for this scenario.
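
As a reminder of the mechanics, here is a minimal sketch of how a joint probability table can be built in R. The scenario is a deliberately different, hypothetical toy example (a fair coin and two-color urns), and all names such as `p_coin` and `p_color_given_coin` are illustrative assumptions, not part of the exercise.

```r
# Hypothetical toy scenario (NOT the numbers from this exercise):
# a fair coin selects urn A (heads) or urn B (tails).
p_coin <- c(heads = 0.5, tails = 0.5)

# Rows: coin outcome; columns: ball color. Each row is P(color | coin).
p_color_given_coin <- rbind(
  heads = c(white = 0.50, black = 0.50),
  tails = c(white = 0.75, black = 0.25)
)

# Joint probability: P(coin, color) = P(coin) * P(color | coin).
# Multiplying by p_coin scales row i by p_coin[i], because the vector
# is recycled down the columns and has one entry per row.
joint <- p_color_given_coin * p_coin

# Marginal color probabilities are the column sums of the joint table.
p_color <- colSums(joint)

print(joint)
print(p_color)
```

All cells of a joint table must sum to 1, which is a useful sanity check for your own table.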

Marginal probability (2 points)

Calculate the marginal probability of drawing a red ball.

Conditional probability (4 points)

Calculate the conditional probability of observing a red ball given that the ball we observed was not black. In symbols, calculate: \(P(\text{red} \mid \neg \text{black}) = P(\text{red} \mid \{ \text{red}, \text{white} \} )\).

Bayes’ rule (4 points)

Using Bayes’ rule, calculate the probability of a heads outcome given that we observed a draw of a red ball.
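
For reference, the general form of Bayes’ rule is given below. The symbols \(H\) (a hypothesis), \(D\) (an observation), and \(H_i\) (a set of mutually exclusive and exhaustive hypotheses) are generic placeholders and not tied to this exercise:

\[
P(H \mid D) = \frac{P(D \mid H) \, P(H)}{P(D)},
\qquad
P(D) = \sum_{i} P(D \mid H_i) \, P(H_i)
\]

The second equation is the law of total probability, which gives the marginal probability of the observation in the denominator.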

Exercise 2: Bayes’ rule for medical tests (5 points)

Here is a common mistake in probabilistic reasoning. Jones knows that a medical test has only a 0.5% chance of yielding a false alarm, i.e., of diagnosing a disease when in fact there is none. The test indicates that Jones has the disease. So, Jones thinks that the chance that they are affected is 99.5%. That’s not true. Let’s find out why.

Imagine the following for concreteness. A new blood glucose meter is meant to help patients recognize elevated blood glucose. If a certain threshold is exceeded, the device gives a warning signal. It is known that 50 of 1000 people have elevated blood sugar. If the threshold is exceeded, the device gives a warning signal with a probability of 99.5%. With a probability of 2%, the device also gives a warning signal although the threshold is not exceeded.

Consider a person X in whom elevated blood sugar has not previously been detected. How certain can person X be that they have elevated blood sugar if the device gives a warning signal?

Exercise 3: Bayes’ Rule and Bertrand’s Box Paradox (5 points)

Suppose you are presented with three desks, each with two drawers containing one coin each. There are two kinds of coins: silver (S) and gold (G). One desk has gold coins in both drawers (GG), one has silver coins in both (SS), and the third has a gold coin in one drawer and a silver coin in the other (GS).

Now, suppose you are free to choose ONE of the three desks at will. After you choose a desk, one of the two drawers is opened at random, and you find a gold coin inside it. What is the chance that the other drawer also contains a gold coin?

Note: You may initially be tempted to conclude that the probability of the second drawer also containing a gold coin is 1/2, since the choice seems equally likely between the GS desk and the GG desk (the SS desk having been eliminated). But there is a flaw in this reasoning!

Use Bayes’ rule to find out the answer. The observation here is ‘sighting of a gold coin’ and you are looking for the conditional probability \(P(\text{GG} \mid \textit{sight gold})\).

Exercise 4: Normal distribution from a random walk process (24 points)

Let’s practice creating and plotting samples. Let’s assume that there are n_critters <- 10000 critters. These critters are initially aligned vertically at the same horizontal zero position critter_positions <- rep(0,n_critters). Now each critter will perform n_steps <- 10000 random steps. Each step moves the critter left or right along the \(x\) axis by a random amount between -1 and 1. We can draw such a number using the function runif(n = XXX, min = -1, max = 1), which returns a vector of XXX numbers between -1 and 1, each sampled uniformly at random.
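
To illustrate how runif behaves, here is a minimal sketch on a tiny scale. The names `toy_positions` and `toy_step`, the size of 5, and the seed are illustrative assumptions; the sketch shows a single vectorized step only, not the full procedure asked for below.

```r
# Fix the random seed so the sketch is reproducible (assumed value).
set.seed(1)

# Five hypothetical critters, all starting at position 0.
toy_positions <- rep(0, 5)

# One uniformly random displacement per critter, each within [-1, 1].
toy_step <- runif(n = 5, min = -1, max = 1)

# Vectorized update: every critter moves by its own displacement.
toy_positions <- toy_positions + toy_step

print(toy_positions)
```

Because R arithmetic is vectorized, one call to runif can move all critters at once, so no inner loop over individual critters is needed.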

Let the critters roam (6 points)

Update the vector critter_positions a number of n_steps times, by the random procedure described above. The result will be a vector of where each of the critters is located along the \(x\) axis after these steps (having wiggled around with great enthusiasm).

Get summary statistics (2 points)

Calculate the mean and the standard deviation of the critter positions after the wiggling.

Plot the critter positions (7 points)

Draw a density plot of two vectors (overlaid, with alpha = 0.5). The first vector is the critter positions you calculated. The second vector is a vector of the same length with samples from a normal distribution whose mean is the mean of the critter positions and whose standard deviation is the standard deviation of the critter positions.
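
One possible way to overlay two densities is sketched below using ggplot2, which is an assumption about the plotting package; base R graphics would work as well. The two vectors `sample_a` and `sample_b` are hypothetical stand-ins, not the critter data.

```r
library(ggplot2)

# Fix the random seed so the sketch is reproducible (assumed value).
set.seed(1)

# Two hypothetical vectors of equal length to compare.
sample_a <- rnorm(1000, mean = 0, sd = 1)
sample_b <- rnorm(1000, mean = 1, sd = 2)

# Long format: one row per value, with a label for the source vector.
plot_data <- data.frame(
  value  = c(sample_a, sample_b),
  source = rep(c("sample_a", "sample_b"), each = 1000)
)

# Overlaid density curves; alpha = 0.5 makes both fills semi-transparent.
density_plot <- ggplot(plot_data, aes(x = value, fill = source)) +
  geom_density(alpha = 0.5)

print(density_plot)
```

Putting both vectors into one long data frame lets a single geom_density call draw both curves and produce the legend automatically.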

The resulting plot should look (roughly) like this:

Repeat for fewer samples (5 points)

Now do the same thing (initializing, wiggling, and plotting), but for only n_critters <- 50 critters, while still using the full 10000 samples for the normal distribution. Your result might look like the plot below.

Interpret the critter walks (4 points)

Name two or three points that are noteworthy about these two simulations with respect to sampling, probability and/or the normal distribution. Be very brief and to-the-point in your answer.

Reminder

Please submit exercise 1 of the last homework with this homework set as well, even if you have already done so.

HW 5: Probability

Due: Friday, December 13 by 11:59 CET