### Recent Question/Assignment

Use the following as the cover page for the Word file:
“Title: semester 2, 2020 BUS105 computing assignment”
“Name:”
“Student number:”
“Sample: ”
“I am using the sample that is allocated to me based on my student number”
“Instructions for the computing assignment worth 20% of your final grade
Overview
Materials that must be used in the assignment (these are provided on Moodle):
*An Excel file with the datasets for all students. Each student must follow the instructions to get 3 datasets using their student number; each student will have different datasets.
*An automatic dataset summarizer.
*Instructions for checking that you have properly found your sample. Students must use their own sample.
Students must submit a Word file AND an Excel file to Moodle:
*The Word file needs to be submitted to the Turnitin link. It needs a cover page and the answers to the 10 questions given in full detail later in this document (pages 2 to 6). A vital part of answering the questions is using the dataset and the dataset summarizer.
*The Excel file needs to be submitted to the assignment dropbox. It should contain the student's 3 datasets and summaries NOT made by the automatic dataset summarizer: students need to summarize the datasets using PivotTables and the scatterplot. Instructions for submitting the Excel file are given on page 7.”

“Instructions for the major part of the assignment: the Word file, worth 18% of your final grade, which you submit to Turnitin.
Overview
You need to submit a Word file with the answers to 10 questions: the first 8 are about the datasets, and the last question is a paraphrasing task (refer to pages 3 to 6).
You will use your datasets and the automatic dataset summarizer to get the descriptive statistics that are used in questions 1 to 5 and the inferential statistics that are used in questions 6 to 8.
To check that you have correctly obtained your datasets, check that both p-values are correct when you investigate both categorical variables (questions 6 to 8).
The word count can be less than 1500 words if your answers demonstrate that you have understood the material.
Summary of the datasets (questions 1 to 8, given on pages 3 to 6, are about the datasets)
Dataset 1
University XYZ gives out a survey to students in a statistics course
The survey questions were:
Do you think the course is useful and do you understand why?
How many videos have you watched?
The questions and the students’ answers are a dataset
Dataset 2
University XYZ gives out a survey to students in a statistics course
The survey questions were:
What style of YouTube video do you prefer, chatty or direct?
Are you scared of maths?
How many videos did you watch?
The questions and the students’ answers are a dataset
Dataset 3
Business XYZ is using videos to replace meetings to maintain social distancing.
The duration of the video (in seconds) and the engagement score are recorded for many videos.
The engagement score is low if people only watch the first part of the video.”

Question 1
Paste dataset 1 into the dataset summarizer
a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “course useful?” and “number of videos watched?” using the sample.
b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics (choose one):
Difference between sample means $\bar{x}_1 - \bar{x}_2$
Difference between sample proportions $\hat{p}_1 - \hat{p}_2$
Correlation coefficient $r$
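As an illustration only, here is a minimal Python sketch of how a difference between sample means is calculated from one categorical and one numeric variable. The rows, the category labels and the function name `group_means` are made up for this sketch; they are not taken from any student's dataset or from the dataset summarizer.

```python
from statistics import mean

# Hypothetical survey rows: (course useful?, number of videos watched).
# These values are invented for illustration and are not a real sample.
rows = [
    ("yes", 8), ("yes", 6), ("yes", 7),
    ("no", 3), ("no", 5), ("no", 4),
]

def group_means(rows):
    """Return the mean of the numeric variable for each category."""
    groups = {}
    for category, value in rows:
        groups.setdefault(category, []).append(value)
    return {category: mean(values) for category, values in groups.items()}

means = group_means(rows)
# Difference between sample means: x-bar_1 minus x-bar_2
# (here the "yes" group minus the "no" group).
diff_means = means["yes"] - means["no"]
print(means, diff_means)
```

This mirrors what a PivotTable showing the average for each group gives you: one mean per category, which you then subtract.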
Question 2
Paste dataset 2 into the dataset summarizer
a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Preferred style?” and “Scared of maths?” using the sample.
b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics (choose one):
Difference between sample means $\bar{x}_1 - \bar{x}_2$
Difference between sample proportions $\hat{p}_1 - \hat{p}_2$
Correlation coefficient $r$
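As an illustration only, a difference between sample proportions for two categorical variables could be calculated as in the Python sketch below. The rows and the helper `proportion_scared` are hypothetical and are not part of the assignment materials.

```python
# Hypothetical survey rows: (preferred style?, scared of maths?).
# These values are invented for illustration and are not a real sample.
rows = [
    ("chatty", "yes"), ("chatty", "yes"), ("chatty", "no"), ("chatty", "no"),
    ("direct", "yes"), ("direct", "no"), ("direct", "no"), ("direct", "no"),
]

def proportion_scared(rows, style):
    """Proportion of respondents with the given style who answered 'yes'."""
    answers = [scared for s, scared in rows if s == style]
    return sum(a == "yes" for a in answers) / len(answers)

p1 = proportion_scared(rows, "chatty")  # p-hat_1
p2 = proportion_scared(rows, "direct")  # p-hat_2
diff_props = p1 - p2                    # p-hat_1 minus p-hat_2
print(p1, p2, diff_props)
```

The same numbers come from a two-way table of counts: divide the "yes" count in each row by that row's total, then subtract.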

Question 3
Paste dataset 3 into the dataset summarizer
a) Paste the descriptive sample statistics and the scatterplot into the Word file. The descriptive statistics let you investigate the relationship between the variables “Duration?” and “Engagement score?” using the sample.
b) Use the output in part (a) to describe the relationship between the two variables. Your discussion must use one of the following sample statistics:
Difference between sample means $\bar{x}_1 - \bar{x}_2$
Difference between sample proportions $\hat{p}_1 - \hat{p}_2$
Correlation coefficient $r$
c) Predict the engagement score of a video with a duration of 600 seconds.
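As an illustration only, the Python sketch below shows one common way to get a correlation coefficient and a prediction from two numeric variables, assuming a least-squares straight line is the intended prediction method (the assignment does not say which method the dataset summarizer uses). The data and the function name `fit_line` are made up for this sketch.

```python
from math import sqrt

# Hypothetical (duration in seconds, engagement score) pairs.
# These values are invented for illustration and are not a real sample.
durations = [200, 400, 600, 800, 1000]
scores = [88, 82, 69, 61, 50]

def fit_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)          # sample correlation coefficient
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

r, slope, intercept = fit_line(durations, scores)
predicted = intercept + slope * 600    # predicted engagement score at 600 seconds
print(r, slope, intercept, predicted)
```

A negative `r` here would mean that longer videos tend to have lower engagement scores in the sample.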
Question 4
Use the output from question 1a.
Considering only the people who do not find the course useful, find the z-score of the sample mean if you assume the population mean is µ = 5 and the population standard deviation is σ = 3.
Considering only the people who do find the course useful, find the z-score of the sample mean if you assume the population mean is µ = 5 and the population standard deviation is σ = 3.
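As an illustration only, and assuming the intended calculation is the usual z-score of a sample mean, $z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$, the arithmetic could be sketched in Python as follows. The values µ = 5 and a population standard deviation of 3 come from the question; the sample mean 6.5 and sample size 36 are hypothetical, so use the values from your own output.

```python
from math import sqrt

def z_score_of_sample_mean(sample_mean, n, mu=5.0, sigma=3.0):
    """z-score of a sample mean, using the standard error sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu) / standard_error

# Hypothetical group: sample mean 6.5 from 36 people (not real data).
z = z_score_of_sample_mean(sample_mean=6.5, n=36)
print(z)
```

Note that the denominator is the standard error of the mean, not the standard deviation itself, so the group's sample size matters.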
Question 5
Considering only the people who prefer the chatty style of video, find a 90% confidence interval for the proportion of people who are scared of maths.
Considering only the people who prefer the direct style of video, find a 90% confidence interval for the proportion of people who are scared of maths.
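As an illustration only, the Python sketch below computes a 90% confidence interval for a proportion, assuming the standard large-sample interval $\hat{p} \pm z^{*}\sqrt{\hat{p}(1-\hat{p})/n}$ is the one intended (the assignment does not name a method). The counts and the function name `proportion_ci` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.90):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # Critical value z*: about 1.645 for 90% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical group: 30 of 50 people are scared of maths (not real data).
low, high = proportion_ci(successes=30, n=50)
print(low, high)
```

Each style group gets its own interval, using only the people in that group for `successes` and `n`.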
Question 6
Paste dataset 1 into the dataset summarizer
a) Paste in inferential statistics that measure evidence for the claim that there is a relationship between the variables “course useful?” and “number of videos watched?” if you consider the whole population.
b) Make suitable comments about the output in part (a).
c) Go back to the dataset summarizer and scroll down. Paste in the output for question 6c, given below the inferential statistics, and fill in the blank: replace the blank with a number that would make the p-value lower than the p-value in question 6a.

Question 7
Paste dataset 2 into the dataset summarizer
a) Paste in computer output that measures evidence for the claim that there is a relationship between the variables “preferred style?” and “scared of maths?” if you consider the whole population.
Hint: inferential statistics measure evidence for a claim.
b) Make suitable comments about the output in part (a).
c) Go back to the dataset summarizer and scroll down. Paste in the output for question 7c, given below the inferential statistics, and fill in the blanks: you have to replace the blanks with numbers that give a smaller p-value than the p-value in question 7a. Note that the total of the blanks must also agree with the existing total.
Question 8
Paste dataset 3 into the dataset summarizer
a) Paste in computer output that measures evidence for the claim that there is a relationship between the variables “Duration?” and “Engagement score?” if you consider the whole population.
Hint: inferential statistics measure evidence for a claim.
b) Make suitable comments about the output in part (a).
c) If another sample had a higher correlation, would you expect the p-value to be lower or higher?
Question 9
Briefly discuss the sample report given in the link below (300 words is enough). In particular, discuss the dataset, how the data was analysed, the main message of the report and how it is communicated.
https://app.box.com/s/lr3nmaozfgaxf7r4mq1quw5m69v20s3m
(You need to click download; logging in will not work.) Do not cut and paste text and use a computer to randomly change the words.
Question 10
Give a quick comment about the discussion of p-values given in the link below, 300 words is enough.
https://app.box.com/s/mnbbg5gn4e10ysetgg0t94psw1nzo141
For each case, discuss the relationship between the p-value and the percentile of the p-value (in other words, discuss the distribution of the p-value). Note that in the first case discussed there is a large difference in population means, so there is a strong relationship in the population. How would you describe the distribution of the p-value? In the second case there is almost no difference between the means, so there is almost no relationship. How would you describe the distribution of the p-value?
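As an illustration only of what "the distribution of the p-value" means, the Python sketch below draws many samples, computes a p-value for each one, and reports what fraction of those p-values fall below a cutoff. It assumes a two-sided z-test for a difference in means with a known standard deviation, which is a choice made for this sketch and not necessarily the test used in the linked discussion. The function names, sample size, effect size and seed are all hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def simulate_p_values(true_difference, n=30, sigma=1.0, trials=2000, seed=1):
    """Simulate p-values from repeated two-sample z-tests (known sigma)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(trials):
        group1 = [rng.gauss(true_difference, sigma) for _ in range(n)]
        group2 = [rng.gauss(0.0, sigma) for _ in range(n)]
        # z statistic for the difference between the two sample means.
        z = (mean(group1) - mean(group2)) / (sigma * sqrt(2 / n))
        p_values.append(2 * (1 - std_normal.cdf(abs(z))))  # two-sided p-value
    return p_values

def fraction_below(p_values, cutoff):
    """Fraction of simulated p-values that are smaller than the cutoff."""
    return sum(p < cutoff for p in p_values) / len(p_values)

large_difference = simulate_p_values(true_difference=1.0)
no_difference = simulate_p_values(true_difference=0.0)
for cutoff in (0.05, 0.25, 0.50):
    print(cutoff,
          fraction_below(large_difference, cutoff),
          fraction_below(no_difference, cutoff))
```

Comparing the two columns of fractions at each cutoff gives a rough picture of how the percentiles of the p-value behave in each case.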

Upload the Word file to the Turnitin link on Moodle.

Instructions for the Excel file
This is worth 2% of your final grade.
You have to use the Excel commands discussed below and not the dataset summarizer.
However, you should check that your summaries are the same as the output from the dataset summarizer you used in the Word file.
If the information differs, you will get at most 1 out of 2.
You need to cut and paste just your datasets into a new Excel file and follow the 4 instructions below. DO NOT use a cover page for the Excel file. You must check that you have the correct sample.
Note that you can still do this at home even if you do not have Excel: just use Google Sheets.
Select all of dataset 1 and use Excel PivotTable commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “course useful?” and “number of videos watched?”.
Select all of dataset 2 and use Excel PivotTable commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “preferred style?” and “scared of maths?”.
Select all of dataset 3 and use Excel commands to make a graph that lets you investigate the relationship between the fields (variables) “duration?” and “engagement score?”.
Upload the Excel file with the pivot tables and scatterplot to the assignment dropbox.