
Probability for Statistics / Elements of Probability: Lab Assignment 2022
Refer to Canvas for the instructions for this lab assignment.

Name:
Student ID:

Question 1
Your lecturer loves chocolate and has two boxes of chocolates in her office, one in the upper drawer and one in the lower drawer of her desk. Whenever she craves a chocolate, she selects a drawer at random and takes a chocolate from the box in that drawer. We assume that each of the boxes originally contained 20 chocolates. Suppose your lecturer opens a drawer and discovers for the first time that the box in that drawer is empty. We let X denote the number of chocolates left in the other box.
It can be shown that the pmf of X is
$$P(X = x) = \binom{40 - x}{20}\, 2^{x - 40}, \quad x = 0, 1, 2, \ldots, 20.$$
(a)
Define the pmf of X as a function in R, then plot this pmf over its range.
(b)
Find the probability P(X = 5) using the function you defined in (a).
(c)
Use the sample(...) function to generate 10000 observations from the pmf of X. Assign the results to a variable.
(d)
Use the observations generated in (c) to obtain an estimate of P(X = 5), and compare your answer with what you found in (b).
My answer:
(e)
Find the mean of X, E(X), using the function you defined in (a).
(f)
Use the observations generated in (c) to obtain an estimate of E(X), and compare your answer with what you found in (e).
My answer:
(g)
Find $E[(X + 1)^{-2}]$ both using the function you defined in (a) and using the observations generated in (c), and compare your two answers.
My answer:
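The workflow in parts (a) to (f) can be sketched roughly as follows. This is only an illustration, assuming the pmf is $P(X = x) = \binom{40 - x}{20} 2^{x - 40}$ on 0, 1, ..., 20. The names `pmf_x`, `support`, `probs` and `obs`, and the seed, are hypothetical choices, not part of the assignment.

```r
# pmf of X, assuming P(X = x) = choose(40 - x, 20) * 2^(x - 40) for x = 0, ..., 20
pmf_x <- function(x) choose(40 - x, 20) * 2^(x - 40)

support <- 0:20
probs <- pmf_x(support)

# Sanity check: a valid pmf sums to 1 over its range
total_prob <- sum(probs)

# Exact quantities computed from the pmf
p5_exact <- pmf_x(5)
mean_exact <- sum(support * probs)

# Simulated quantities; the seed is arbitrary and only makes the run reproducible
set.seed(2022)
obs <- sample(support, size = 10000, replace = TRUE, prob = probs)
p5_sim <- mean(obs == 5)   # relative frequency of the event {X = 5}
mean_sim <- mean(obs)      # sample mean as an estimate of E(X)
```

With 10000 observations the simulated values should land close to the exact ones, though not on them; the gap shrinks as the number of observations grows.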

Question 2
For this question, we require the mosaicCalc package.
# This code installs the package only if it is not already installed, and loads it.
if(!("mosaicCalc" %in% installed.packages()[,"Package"])){
  install.packages("mosaicCalc", repos = "https://cran.ms.unimelb.edu.au/")
}
library("mosaicCalc")
Let a continuous random variable X have the pdf
$$f(x) = \begin{cases} \frac{2}{9}(x + 1)(2 - x), & -1 < x < 2, \\ 0, & \text{elsewhere.} \end{cases}$$
(a)
Define the pdf of X as a function in R, plot that function over the support of X, and verify that $\int_{-1}^{2} f(x)\,dx = 1$.
(b)
Plot the cdf of X over its support.
Hint: Use the imported antiD(...) function from the package mosaicCalc.
(c)
Find the mean and variance of X.
Hint: If you would like the integrate(...) function to return only the numerical approximation of the integral (without the absolute error), you need to use integrate(...)$value.
(d)
Find $P(-2 \le X \le 1)$ using the pdf.
(e)
Find $P(-2 \le X \le 1)$ using the cdf.
(f)
Let $Y = e^{-X}$. Plot the cdf of Y on its support, using the distribution function technique covered in the lectures.
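For parts (a) and (c) to (e) of this question, a base-R sketch could look like the following. It assumes the pdf is $f(x) = \frac{2}{9}(x + 1)(2 - x)$ on (-1, 2) and that the probability of interest is $P(-2 \le X \le 1)$. It uses integrate(...) in place of the antiD(...) function hinted at in part (b), and the names `pdf_x` and `cdf_x` are hypothetical.

```r
# pdf, assuming f(x) = (2/9)(x + 1)(2 - x) on (-1, 2) and 0 elsewhere
pdf_x <- function(x) ifelse(x > -1 & x < 2, (2 / 9) * (x + 1) * (2 - x), 0)

# integrate() returns a list; $value keeps only the numerical approximation
total_area <- integrate(pdf_x, lower = -1, upper = 2)$value

# Mean and variance via E(X) and E(X^2); each integrand is its own function
mean_x <- integrate(function(x) x * pdf_x(x), lower = -1, upper = 2)$value
second_moment <- integrate(function(x) x^2 * pdf_x(x), lower = -1, upper = 2)$value
var_x <- second_moment - mean_x^2

# Probability from the pdf: the density is zero below -1, so start at -1
prob_pdf <- integrate(pdf_x, lower = -1, upper = 1)$value

# cdf by numerical integration, clamped to 0 and 1 outside the support
cdf_x <- function(q) sapply(q, function(qi) {
  if (qi <= -1) 0 else if (qi >= 2) 1 else integrate(pdf_x, lower = -1, upper = qi)$value
})

# Same probability from the cdf
prob_cdf <- cdf_x(1) - cdf_x(-2)
```

Both routes should give the same probability up to numerical error, which is a useful cross-check on the cdf definition.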

Question 3
Consider the continuous random variables X and Y which have the following joint pdf
?
?,
f(x,y) =
? 0, elsewhere.
(a)
Find the marginal pdf $f_1(x)$ and cdf $F_X(x)$ of X analytically.
My answer:
(b)
Use Theorem 3.5-1 in Module 3 to simulate 10000 observations from the marginal distribution of X. Assign the results to a variable. Plot your observations in a histogram. Does the shape of the histogram mimic the shape of the pdf $f_1(x)$?
My answer:
(c)
Repeat (a) for Y.
My answer:
(d)
Plot the cdf of Y.
(e)
Repeat (b) for Y.
My answer:
(f)
Using the results of the X and Y simulations, calculate the proportion of times where, for a given observation i, X 2Y .
(g)
Using the results of the X and Y simulations, calculate an estimate of E(X + Y).
(h)
Compute the theoretical values of P(X 2Y) and E(X + Y), and compare them with what you obtained in (f) and (g). Do these values match the estimates you obtained by simulation? Why/why not?
My answer:
(i)
BONUS QUESTION (optional: you will not lose marks if you do not answer this question, but will receive extra mark(s) if you answer correctly): How would you generate observations of the pair (X, Y)?
My answer:

Question 4
Let X have the normal distribution N(0,1), and let $Y = e^{X}$.
(a)
Find the support of Y.
My answer:
(b)
Find the pdf of Y using the change of variable technique. You should define four functions throughout the process: f(x), v(y), v'(y), and the pdf of Y. Then plot the pdf of Y over (0,5].
(c)
Find the third moment of Y, $E(Y^3)$, in two different ways: (i) using the pdf of Y, and (ii) using the known form of the mgf of a standard normal random variable.
Note: You cannot simply write xf(x) as the first argument of the integrate(...) function; instead you need to define a new function as xf(x).
(d)
Compute $P(Y \le 1)$. What do you conclude about the median of Y?
My answer:
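The change of variable technique in part (b) can be sketched as below, assuming $Y = e^{X}$ with X standard normal, so that the inverse transformation is v(y) = log(y). The function names `f`, `v`, `v_prime` and `g` are hypothetical, and the comparison with R's built-in dlnorm(...) is an added cross-check rather than something the assignment asks for.

```r
f <- function(x) dnorm(x)                    # pdf of X ~ N(0, 1)
v <- function(y) log(y)                      # inverse transformation x = v(y), for y > 0
v_prime <- function(y) 1 / y                 # derivative of v
g <- function(y) f(v(y)) * abs(v_prime(y))   # pdf of Y by change of variable

# Cross-check against the built-in standard log-normal density
y_grid <- c(0.1, 0.5, 1, 2, 5)
max_diff <- max(abs(g(y_grid) - dlnorm(y_grid)))

# P(Y <= 1) by integrating the pdf of Y over (0, 1]
prob_le_1 <- integrate(g, lower = 0, upper = 1)$value

# Route (ii) for the third moment: E(Y^3) = E(exp(3X)) = M(3),
# where M(t) = exp(t^2 / 2) is the mgf of N(0, 1)
third_moment_mgf <- exp(3^2 / 2)
```

For route (i), the integrand $y^3 g(y)$ has to be wrapped in its own function before it is passed to integrate(...), as the note in part (c) points out.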

Question 5
Let X be a continuous random variable with the density function
$$f(x) = \begin{cases} \dfrac{2}{x^2}, & 1 < x < 2, \\ 0, & \text{elsewhere.} \end{cases}$$
(a)
Define a function that computes the kth moment of X for any $k \ge 1$.
(b)
Use the function in (a) to obtain the variance of X.
(c)
Find the theoretical cdf of X, and explain how you could use it to simulate a realisation/observation from X.
My answer:
(d)
Generate a sample of 10000 observations from the distribution of X, and plot the corresponding histogram. Compare the shape of the histogram with that of the density of X (try to superimpose them on the same figure).
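One possible sketch for this question, assuming the density is $f(x) = 2/x^2$ on (1, 2): the cdf on that interval is then $F(x) = 2 - 2/x$, and solving $u = F(x)$ gives $x = 2/(2 - u)$, so applying that map to uniform draws yields observations from X. The names `pdf_x`, `moment_k` and `inv_cdf`, and the seed, are hypothetical.

```r
# Assumed density: f(x) = 2 / x^2 on (1, 2), 0 elsewhere
pdf_x <- function(x) ifelse(x > 1 & x < 2, 2 / x^2, 0)

# kth moment by numerical integration; the integrand is built as its own function
moment_k <- function(k) {
  integrate(function(x) x^k * pdf_x(x), lower = 1, upper = 2)$value
}

# Variance from the first two moments
var_x <- moment_k(2) - moment_k(1)^2

# Inverse of the cdf F(x) = 2 - 2/x on (1, 2)
inv_cdf <- function(u) 2 / (2 - u)

# Inverse transform sampling; the seed is arbitrary
set.seed(2022)
obs <- inv_cdf(runif(10000))
```

Calling hist(obs, freq = FALSE) and then curve(pdf_x, from = 1, to = 2, add = TRUE) superimposes the density on the histogram, since freq = FALSE puts the histogram on the density scale.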

Question 6
Let X have a Gamma distribution with mean 8 and variance 16.
Let $X_1, X_2, \ldots, X_n$ be n independent random variables with the same distribution as X. Let $Y_n = \sum_{i=1}^{n} X_i / n$ be the sample mean.
(a)
Define a function which generates 10000 observations from $Y_n$ for any value of n.
My answer:
(b)
Plot the histogram of the generated observations from $Y_n$ for n = 1, n = 5, n = 25. What do you observe? Can you compare $Y_n$ to a known distribution when n is large? Elaborate on your answer.
My answer:
(c)
Find an appropriate normal random variable which approximates $Y_{25}$, and plot the histogram of the generated observations from $Y_{25}$ on the same graph as the density of that normal random variable for comparison.
My answer:
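A rough sketch of the simulation in this question follows. It assumes the shape/scale parameterisation of the Gamma distribution, where mean = shape × scale and variance = shape × scale², so mean 8 and variance 16 give scale 2 and shape 4. The function name `sim_sample_mean`, its `n_sim` argument and the seed are hypothetical.

```r
# Solve shape * scale = 8 and shape * scale^2 = 16
gamma_scale <- 16 / 8            # variance / mean
gamma_shape <- 8 / gamma_scale   # mean / scale

# Generate n_sim observations of the sample mean of n iid Gamma draws
sim_sample_mean <- function(n, n_sim = 10000) {
  draws <- matrix(rgamma(n_sim * n, shape = gamma_shape, scale = gamma_scale),
                  nrow = n_sim, ncol = n)
  rowMeans(draws)                # one sample mean per row
}

set.seed(2022)                   # arbitrary seed for reproducibility
y25 <- sim_sample_mean(25)

# Central limit theorem: the sample mean is approximately N(8, 16 / n)
clt_mean <- 8
clt_sd <- sqrt(16 / 25)
```

To compare the two on one graph, hist(y25, freq = FALSE) followed by curve(dnorm(x, mean = clt_mean, sd = clt_sd), add = TRUE) overlays the approximating normal density.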

Question 7
Let N be the number of offspring of a female Seychelles warbler (a species of birds) during a one-year period. One may assume that N has a Poisson distribution with mean 4.
Seychelles warblers are known to have an adaptive sex ratio bias: on high quality territories, females produce 90% daughters. Let X be the number of daughters of one female bird during a one-year period (on a high quality territory).
(a)
What are the theoretical mean and variance of X?
My answer:
(b)
Conduct a two-step simulation analysis to verify these theoretical values by replacing the ‘#?’ in the code chunk with the appropriate commands (and remove the argument eval=FALSE before running the code or knitting the file).
n.daughter <- c() # initialises the vector which will contain simulated values of X
for (i in 1:10000){
  n.offspring <- #? # ith simulation of N
  n.daughter <- #? # updates the vector n.daughter with the ith simulation of X
}
mean(n.daughter)
var(n.daughter)
(c)
What is the distribution of X? Compare the results of your simulations with the true probabilities using a histogram.
My answer:
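The two-step idea in part (b) can also be written without a loop. The sketch below is a vectorised variant, not a completion of the '#?' placeholders, and it assumes each offspring is independently a daughter with probability 0.9. The names `n_sim`, `n_offspring` and `n_daughter`, and the seed, are hypothetical.

```r
set.seed(2022)   # arbitrary seed for reproducibility
n_sim <- 10000

# Step 1: simulate the number of offspring N ~ Poisson(4) for each female
n_offspring <- rpois(n_sim, lambda = 4)

# Step 2: given N, the number of daughters is Binomial(N, 0.9)
n_daughter <- rbinom(n_sim, size = n_offspring, prob = 0.9)

sim_mean <- mean(n_daughter)
sim_var <- var(n_daughter)

# Thinning a Poisson(4) count with probability 0.9 gives a Poisson(3.6) count,
# whose mean and variance are both 3.6
theory_mean <- 4 * 0.9
theory_var <- 4 * 0.9
```

The simulated mean and variance should both sit near 3.6, which matches a Poisson distribution with that mean.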
