Dublin Business School
Module Title: Statistics for Data Analytics
Module Code: B9DA101
Module Leader: Dr Shahram Azizi
Stage (if relevant):
Assessment Title: CA two
Assessment Number (if relevant):
Restrictions on Time/Length: Submission before deadline
Hand In Date: Before final exam
Planned Feedback Date:
Mode of Submission: Online
This CA assesses students on core concepts in hypothesis testing, GLM analytics, and Bayesian analytics.
All questions are mandatory.
Use R/RStudio to solve the questions and perform the analyses.
Submissions received after the deadline will not be considered or scored.
Consider a relational dataset and specify your input and output variables, then:
(a) Using 80% of this dataset as the training set, suggest an appropriate GLM relating the output variable to the input variables and train it.
(b) Identify the input variables that have a significant effect on the output variable at the α = 0.05 level and state the related hypothesis tests. Estimate the parameters of your model.
(c) Predict the output for the test dataset (the remaining 20%) using the trained model. Provide the functional form of the optimal predictive model.
(d) Provide the confusion matrix and compute the proportion of correct predictions (accuracy).
(Total: 35 Marks)
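The workflow in this question (80/20 split, GLM fit, significance check, prediction, confusion matrix) can be sketched in R roughly as follows. This is a minimal illustration only, not a model answer: the data are simulated, the variable names `x1`, `x2`, `y` are hypothetical, and a binary output with a logistic (binomial) GLM and a 0.5 classification threshold are assumed. A different dataset may call for a different family and link.

```r
# Hypothetical simulated data; replace with the dataset chosen for the assignment.
set.seed(42)
n  <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * df$x1))

# 80/20 train/test split
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- df[train_idx, ]
test  <- df[-train_idx, ]

# Assumed model: logistic regression (binomial GLM with logit link)
fit <- glm(y ~ x1 + x2, data = train, family = binomial(link = "logit"))

# Wald z-tests of H0: beta_j = 0 against H1: beta_j != 0;
# a coefficient is flagged as significant when its p-value is below alpha = 0.05
coefs       <- summary(fit)$coefficients
significant <- rownames(coefs)[coefs[, "Pr(>|z|)"] < 0.05]
print(coefs)
print(significant)

# Predicted probabilities on the test set, classified at an assumed 0.5 threshold
p_hat <- predict(fit, newdata = test, type = "response")
y_hat <- factor(as.integer(p_hat > 0.5), levels = c(0, 1))

# Confusion matrix and proportion of correct predictions
conf_mat <- table(Predicted = y_hat, Actual = factor(test$y, levels = c(0, 1)))
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

The fitted coefficients in `coefs` give the functional form of the predictive model on the logit scale, that is, logit(P(y = 1)) as the intercept plus each estimate multiplied by its input variable.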
Let x_1, …, x_10 be independent and identically distributed (iid) Poisson(λ) random variables.
(a) Compute the likelihood function (LF). (10 Marks)
(b) Adopt an appropriate conjugate prior for the parameter λ (Hint: choose the hyperparameter values freely, within their admissible range). (10 Marks)
(c) Using (a) and (b), find the posterior distribution of λ. (10 Marks)
(d) Compute the minimum Bayes risk estimator of λ. (5 Marks)
(Total: 35 Marks)
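As a hedged outline of the Poisson question above, the likelihood, conjugate prior, posterior, and estimator can be written as follows. The Gamma prior with shape a and rate b is the standard conjugate choice for a Poisson mean, but the symbols a and b are introduced here for illustration. The final line assumes squared error loss, which the question does not state; under a different loss function the minimum Bayes risk estimator would change.

```latex
\begin{align*}
L(\lambda \mid x_1,\dots,x_{10})
  &= \prod_{i=1}^{10} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!}
   = \frac{e^{-10\lambda}\,\lambda^{\sum_{i=1}^{10} x_i}}{\prod_{i=1}^{10} x_i!}, \\
\pi(\lambda)
  &= \frac{b^{a}}{\Gamma(a)}\,\lambda^{a-1} e^{-b\lambda},
  \qquad \lambda > 0,\ a > 0,\ b > 0, \\
\pi(\lambda \mid x_1,\dots,x_{10})
  &\propto L(\lambda \mid x_1,\dots,x_{10})\,\pi(\lambda)
   \propto \lambda^{\,a + \sum_{i=1}^{10} x_i - 1}\, e^{-(b+10)\lambda}, \\
\lambda \mid x_1,\dots,x_{10}
  &\sim \mathrm{Gamma}\!\left(a + \sum_{i=1}^{10} x_i,\; b + 10\right), \\
\hat{\lambda}_{\text{Bayes}}
  &= E[\lambda \mid x_1,\dots,x_{10}]
   = \frac{a + \sum_{i=1}^{10} x_i}{b + 10}
  \quad \text{(assuming squared error loss).}
\end{align*}
```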
An opinion poll surveyed a simple random sample of 1000 students. Respondents were classified by gender (male or female) and by opinion (Reservation for women, No Reservation, or No Opinion). Results are shown in the observed contingency table below.
Are gender and opinion on reservation for women independent? Use a 0.05 level of significance. To do so:
(a) State the hypotheses. (5 Marks)
(b) Find the test statistic and the critical value. (10 Marks)
(c) Explain your decision and interpret the results. (15 Marks)
(Total: 30 Marks)
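For the independence question above, a sketch of the usual approach is the Pearson chi-square test of independence, assumed here because the question does not name a specific test. With 2 gender categories and 3 opinion categories, O_ij denotes the observed count in row i and column j, R_i and C_j the row and column totals, and n = 1000 the sample size. These symbols are introduced for illustration, and no counts are filled in because the observed table is not reproduced in the text.

```latex
\begin{align*}
H_0 &: \text{gender and opinion are independent}, \qquad
H_1 : \text{gender and opinion are not independent}, \\
E_{ij} &= \frac{R_i \, C_j}{n}, \qquad i = 1,2,\quad j = 1,2,3, \\
\chi^2 &= \sum_{i=1}^{2} \sum_{j=1}^{3} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \\
\mathrm{df} &= (2-1)(3-1) = 2, \qquad
\chi^2_{0.05,\,2} \approx 5.991, \\
\text{Reject } H_0 &\text{ if } \chi^2 > 5.991.
\end{align*}
```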