Fifth Homework: Probability

Instructions

Exercise 1: Another flip-and-draw scenario (14 points)

Consider a flip-and-draw scenario similar to one we discussed in class. First, we flip a biased coin that lands heads with probability .75. If we observe heads, we draw from urn 1, otherwise from urn 2. The content of the urns is:

Joint probability table (4 points)

Calculate and write down (e.g., using R Markdown) the joint probability table for this scenario.
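As a rough sketch of how such a table can be assembled in R: each cell of the joint table is the probability of a coin outcome times the probability of a ball color in the corresponding urn. The urn proportions below are hypothetical placeholders, not the contents given for this exercise, and the names p_coin, p_color_given_coin and joint are chosen only for illustration.

```r
# Coin bias from the exercise: P(heads) = 0.75
p_coin <- c(heads = 0.75, tails = 0.25)

# HYPOTHETICAL urn contents (placeholders, not the exercise's values):
# each row gives P(color | coin outcome) and must sum to 1
p_color_given_coin <- rbind(
  heads = c(black = 0.2, red = 0.5, white = 0.3),  # urn 1
  tails = c(black = 0.6, red = 0.1, white = 0.3)   # urn 2
)

# Joint table: multiply each row by the probability of its coin outcome
# (R recycles p_coin down the columns, so row i is scaled by p_coin[i])
joint <- p_color_given_coin * p_coin

joint
sum(joint)       # the cells of a joint table sum to 1
colSums(joint)   # marginal probabilities of the colors
```

Checking that all cells sum to 1 is a cheap sanity test for any joint table written down by hand.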

Marginal probability (2 points)

Calculate the marginal probability of drawing a red ball.

Conditional probability (4 points)

Calculate the conditional probability of observing a red ball given that the ball we observed was not black. In symbols, calculate: \(P(\text{red} \mid \neg \text{black}) = P(\text{red} \mid \{ \text{red}, \text{white} \} )\).
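Since the event "red" is contained in the event \(\{ \text{red}, \text{white} \}\), the definition of conditional probability reduces to a ratio of marginal probabilities. Written out as a general identity, with the numerical values left to be read off the joint probability table:

\[
P(\text{red} \mid \{ \text{red}, \text{white} \})
= \frac{P(\text{red} \cap \{ \text{red}, \text{white} \})}{P(\{ \text{red}, \text{white} \})}
= \frac{P(\text{red})}{P(\text{red}) + P(\text{white})}
\]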

Bayes rule (4 points)

Using Bayes’ rule, calculate the probability of a heads outcome given that we observed a draw of a red ball.
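For this scenario, Bayes’ rule with the denominator expanded by the law of total probability takes the following form. Here \(P(\text{heads}) = .75\) and \(P(\text{tails}) = .25\) come from the coin bias, while \(P(\text{red} \mid \text{heads})\) and \(P(\text{red} \mid \text{tails})\) are the proportions of red balls in urn 1 and urn 2:

\[
P(\text{heads} \mid \text{red})
= \frac{P(\text{red} \mid \text{heads}) \, P(\text{heads})}
       {P(\text{red} \mid \text{heads}) \, P(\text{heads}) + P(\text{red} \mid \text{tails}) \, P(\text{tails})}
\]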

Exercise 2: Bayes rule for medical tests (5 points)

Here is a common mistake in probabilistic reasoning. Jones knows that a medical test has only a 0.5% chance of yielding a false alarm, i.e., of diagnosing a disease when in fact there is none. The test indicates that Jones has the disease. So Jones thinks that the chance that they are affected is 99.5%. That’s not true. Let’s find out why.

Imagine the following for concreteness. A new blood glucose meter should enable patients to recognize elevated blood glucose. If a certain threshold is exceeded, the device gives a warning signal. It is known that 50 of 1000 people have elevated blood sugar. If the threshold is exceeded, the device gives a warning signal with 99.5% probability. With a probability of 2%, the device also gives a warning signal although the threshold was not exceeded.

Consider a person X in whom an elevated blood sugar value has not yet been detected. How certain can person X be that they have an elevated blood sugar value if the device gives a warning signal?
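The general shape of this kind of calculation can be sketched as a small R function. The function name posterior_given_positive and its argument names are made up for illustration, and the numbers in the example call are hypothetical and deliberately not those of this exercise.

```r
# Posterior probability of a condition given a positive test (Bayes' rule).
#   prior            : P(condition)
#   hit_rate         : P(positive | condition)
#   false_alarm_rate : P(positive | no condition)
posterior_given_positive <- function(prior, hit_rate, false_alarm_rate) {
  true_positives  <- hit_rate * prior
  false_positives <- false_alarm_rate * (1 - prior)
  true_positives / (true_positives + false_positives)
}

# HYPOTHETICAL numbers, deliberately different from the exercise
posterior_given_positive(prior = 0.01, hit_rate = 0.90, false_alarm_rate = 0.05)
```

With these made-up numbers the posterior is only about 0.15, even though the test detects 90% of true cases: when the condition is rare, most warnings come from the many unaffected people.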

Exercise 3: Bayes’ Rule and Bertrand’s Box Paradox (5 points)

Suppose you are presented with three desks, each with two drawers containing one coin each. There are two kinds of coins: silver (S) and gold (G). One desk has gold coins in both drawers (GG), one has silver coins in both (SS), and the third has a gold coin in one drawer and a silver coin in the other (GS).

Now, suppose you are free to choose ONE of the three desks at will. After you choose a desk, one of the two drawers is opened at random, and you find a gold coin inside it. What is the chance that the other drawer also contains a gold coin?

Note: You may initially be tempted to conclude that the probability of the second drawer also containing a gold coin is 1/2, since the choice is equally likely between the GS desk and the GG desk (the SS desk now being eliminated from the scenario). But there is a flaw in this reasoning!

Use Bayes’ rule to find the answer. The observation here is ‘sighting of a gold coin’ and you are looking for the conditional probability \(P(\text{GG} \mid \text{sight gold})\).
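An analytic answer can be cross-checked by simulation. The following is a minimal sketch, assuming each desk is chosen with equal probability; the number of trials, the seed, and all variable names are arbitrary choices for illustration.

```r
set.seed(123)          # arbitrary seed, for reproducibility only
n_trials <- 100000     # hypothetical number of simulated trials

# Each desk is a column holding the coins in its two drawers
desks <- cbind(GG = c("G", "G"), SS = c("S", "S"), GS = c("G", "S"))

# Assumption: every desk is equally likely to be chosen
desk_chosen   <- sample(1:3, n_trials, replace = TRUE)
drawer_opened <- sample(1:2, n_trials, replace = TRUE)  # random drawer

# Look up the coin that is seen and the coin in the other drawer
# (indexing a matrix with a two-column matrix of row/column pairs)
coin_seen  <- desks[cbind(drawer_opened, desk_chosen)]
coin_other <- desks[cbind(3 - drawer_opened, desk_chosen)]

# Relative frequency of "other coin is gold" among trials where gold was seen
estimate <- mean(coin_other[coin_seen == "G"] == "G")
estimate
```

Restricting attention to the trials in which a gold coin was seen is the simulation counterpart of conditioning on the observation.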

Exercise 4: Normal distribution from a random walk process (24 points)

Let’s practice creating and plotting samples. Let’s assume that there are n_critters <- 10000 critters. These critters are initially aligned vertically at the same horizontal zero position critter_positions <- rep(0,n_critters). Now each critter will perform n_steps <- 10000 random steps. Each step moves the critter left or right along the \(x\) axis by a random amount between -1 and 1. We can draw such a number using the function runif(n = XXX, min = -1, max = 1) which will return a vector of XXX samples of numbers between -1 and 1, each sampled uniformly at random.
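To see what a single step looks like, here is a toy sketch with deliberately tiny, hypothetical sizes. The names n_critters_toy and positions_toy are chosen for illustration so that they do not clash with the variables of the exercise.

```r
set.seed(2024)   # arbitrary seed so that the toy run is reproducible

# Deliberately tiny, hypothetical size (not the exercise's value)
n_critters_toy <- 5
positions_toy  <- rep(0, n_critters_toy)

# One random step per critter, each uniform between -1 and 1
step <- runif(n = n_critters_toy, min = -1, max = 1)
step

# A single step moves every critter at once, because `+` is vectorized
positions_toy <- positions_toy + step
positions_toy
```

Because the addition is vectorized, no loop over individual critters is needed for one step.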

Let the critters roam (6 points)

Update the vector critter_positions n_steps times, using the random procedure described above. The result will be a vector giving the position of each critter along the \(x\) axis after these steps (having wiggled around with great enthusiasm).

Get summary statistics (2 points)

Calculate the mean and the standard deviation of the critter positions after the wiggling.

Plot the critter positions (7 points)

Draw a density plot of two vectors (overlaid, with alpha = 0.5). The first vector is the critter positions you calculated. The second vector is a vector of the same length with samples from a normal distribution, whose mean is the mean of the critter positions, and whose standard deviation is the standard deviation of the critter positions.
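One possible way to overlay two half-transparent densities, sketched here in base R on stand-in data. The vector positions_demo is a hypothetical placeholder (positions after only two steps), and the colors and labels are arbitrary. If the course uses ggplot2, geom_density(alpha = 0.5) plays an analogous role.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# HYPOTHETICAL stand-in for the critter positions: two uniform steps each
positions_demo <- runif(2000, min = -1, max = 1) + runif(2000, min = -1, max = 1)

# Normal samples with matching mean and standard deviation, same length
normal_demo <- rnorm(
  n    = length(positions_demo),
  mean = mean(positions_demo),
  sd   = sd(positions_demo)
)

d_pos  <- density(positions_demo)
d_norm <- density(normal_demo)

# Draw the frame from one density, then fill both curves half-transparently
plot(d_pos, main = "", xlab = "position",
     xlim = range(d_pos$x, d_norm$x), ylim = c(0, max(d_pos$y, d_norm$y)))
polygon(d_pos,  col = adjustcolor("steelblue", alpha.f = 0.5), border = NA)
polygon(d_norm, col = adjustcolor("orange",    alpha.f = 0.5), border = NA)
legend("topright", legend = c("positions", "normal samples"),
       fill = adjustcolor(c("steelblue", "orange"), alpha.f = 0.5))
```

Setting the axis limits from both densities keeps either curve from being cut off.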

The resulting plot should look (roughly) like this:

Repeat for fewer samples (5 points)

Now do the same thing (initializing, wiggling, and plotting), but for only n_critters <- 50 critters, while still using the full 10000 samples for the normal distribution. Your result might look like the plot below.

Interpret the critter walks (4 points)

Name two or three points that are noteworthy about these two simulations with respect to sampling, probability and/or the normal distribution. Be very brief and to-the-point in your answer.

Reminder

Please submit exercise 1 of the last homework with this homework set as well, even if you have already done so.