Please read the marking rubric carefully while doing the assignment
BUS708 Statistics and Data Analysis
Trimester 1, 2021
1 OVERVIEW OF THE ASSIGNMENT
This assignment will test your skills in collecting, summarising and presenting data using Microsoft Excel and/or other approved tools. It will also test your ability to interpret the output produced by these tools to solve business problems.
You will need to use the dataset allocated to you, collect your own data, and produce numerical and graphical summaries. You will need to submit an Excel file that follows the requirements explained below.
2 TASK DESCRIPTION
There are two datasets involved in this assignment: Dataset 1 and Dataset 2, detailed below.
Dataset 1: This dataset will be sent to your KOI email address by the end of Week 3. Please email the lecturer if you have not received the dataset. This dataset is a subset of the Google Play Store Apps dataset that includes only apps with more than 1 million installs. The original dataset can be obtained from https://www.kaggle.com/gauthamp10/google-playstore-apps. Note that you will need to use the dataset sent to your email, not the original dataset.
Dataset 2: You will need to collect a dataset to answer your own research question. The details are given in Section 6 below. You will need to collect data from at least 30 international students. The data should have 2 categorical variables that will answer your research question (see Section 6).
Both datasets should be saved in an Excel file (see Submission Requirement below). All data processing should be performed in Excel or other approved tools (this will be communicated during tutorials). Failure to use Excel or approved tools will result in a mark deduction.
Your tasks are described below.
1. Section 1: Description about the Datasets
a. Dataset 1: Give a short but clear description (2-3 sentences) about this dataset (e.g. what the dataset is about, where it comes from, is this primary or secondary data).
b. Dataset 2:
i. State your research question.
ii. Give a short but clear description (2-3 sentences) about this dataset (e.g. what the dataset is about, how you collect the data, is this primary or secondary data, is it biased).
2. Section 2: Do you believe that 99.8% of Google Play Apps (with more than 1 million installs) are free?
Using Dataset 1, provide the frequency and the proportion (either as a decimal or a percentage) for each category for the variable Free. You also need to provide a graphical display that easily shows the proportion of each category. Finally, write a comment about your findings and answer the question.
3. Section 3: Is the average rating of Google Play Apps (with more than 1 million installs) less than 4.1?
Using Dataset 1, describe the Rating distribution of Google Play Store apps. You need to provide a numerical summary (sample size, mean, standard deviation and median) as well as a graphical display that shows the outliers, if any. Finally, write a comment about your findings and answer the question.
4. Section 4: Is there a difference in the rating of Google Play Store Apps of different categories?
Using Dataset 1, first filter the variable Category to include only Entertainment, Tools, and Shopping. Then provide the numerical summary for the Rating grouped by the three categories. You also need to provide a graphical display that shows any outliers. Finally, write a comment about your findings and answer the question.
5. Section 5: Is there a relationship between the rating of a paid app and its price?
First, filter Dataset 1 to include only paid apps, then describe the relationship between the Rating of the apps and the Price. You need to provide both a numerical summary and a graphical display. Finally, write a comment about your findings and answer the question.
6. Section 6: Exploration of Two Categorical Variables
Suggest a research question on the topic of mobile phones or mobile apps that involves only 2 categorical variables and involves international students. You need to design a survey that consists of only two questions, resulting in two categorical variables.
After collecting the data and storing it in Dataset 2, describe the relationship between the two variables. You need to provide both a numerical summary and a graphical display. Finally, write a comment about your findings and answer the research question.
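The quantities requested in Sections 2 to 6 can be illustrated with a minimal Python sketch. This is for understanding only: the submission itself must still be produced in Excel or another approved tool. The toy rows, the tuple layout and all function names below are hypothetical and are not taken from Dataset 1.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical toy rows (NOT from Dataset 1): (category, is_free, rating, price)
apps = [
    ("Tools", True, 4.2, 0.0),
    ("Tools", True, 3.9, 0.0),
    ("Shopping", True, 4.5, 0.0),
    ("Entertainment", False, 4.0, 2.99),
    ("Entertainment", False, 4.4, 4.99),
    ("Shopping", False, 3.6, 0.99),
]

def proportions(values):
    """Frequency and proportion of each category (Section 2 style)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: (n, n / total) for k, n in counts.items()}

def summary(xs):
    """Sample size, mean, sample standard deviation and median (Section 3 style)."""
    return {"n": len(xs), "mean": mean(xs), "sd": stdev(xs), "median": median(xs)}

def grouped_summary(pairs):
    """Numerical summary of a value within each group (Section 4 style)."""
    groups = {}
    for group, x in pairs:
        groups.setdefault(group, []).append(x)
    return {g: summary(xs) for g, xs in groups.items()}

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric variables (Section 5 style)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def two_way_table(pairs):
    """Counts for each combination of two categorical variables (Section 6 style)."""
    return Counter(pairs)

if __name__ == "__main__":
    print(proportions([a[1] for a in apps]))
    print(summary([a[2] for a in apps]))
    print(grouped_summary([(a[0], a[2]) for a in apps]))
    paid = [a for a in apps if not a[1]]
    print(pearson_r([a[3] for a in paid], [a[2] for a in paid]))
    print(two_way_table([(a[0], a[1]) for a in apps]))
```

The sample standard deviation (dividing by n - 1) is used here as an assumption; whichever tool you use, check which version of the standard deviation it reports.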
3 SUBMISSION REQUIREMENT
Deadline to submit the report: end of Week 7, Sunday 25th April 2021, 23:59 AEDT
You need to submit an Excel file to Moodle with the following requirements:
1. Your Excel file should have at least 8 worksheets with the following names and order: Sec 1, Sec 2, …, Sec 6, Dataset 1, Dataset 2. Each worksheet should address the tasks in the corresponding section. Additional worksheets are allowed if named appropriately; these should be placed at the end, after Dataset 2.
2. In the first worksheet (Sec 1), you should write your student number in the top left corner, either in cell A1 or in a text box.
3. Failure to follow these requirements will result in a mark deduction.
You should submit the correct file. If you submit an incorrect file, resubmission may be approved for a valid reason, but this may attract a mark deduction.
4 MARKING CRITERIA
Students are advised to read the marking rubric provided on Moodle, as well as detailed marking criteria based on this rubric.
5 DEDUCTION, LATE SUBMISSION AND EXTENSION
Late submission penalty: 5% of the total available marks per calendar day, unless an extension is approved. This equals 0.75 marks (out of 15 marks) per day.
For the extension application procedure, please refer to Section 3.3 of the Subject Outline. Please do NOT email the lecturer or tutor to seek an extension; you need to follow the procedure described in the Subject Outline.
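As a worked illustration of the penalty arithmetic, here is a minimal Python sketch. The function name `late_penalty` is hypothetical, and the sketch assumes whole calendar days and no cap on the deduction, neither of which is stated in this brief.

```python
TOTAL_MARKS = 15      # total available marks, as stated above
PENALTY_RATE = 0.05   # 5% of the total marks per calendar day, as stated above

def late_penalty(days_late: int) -> float:
    """Marks deducted for a submission that is `days_late` calendar days late.

    Assumes no approved extension, whole days only, and no cap (assumptions).
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    return days_late * PENALTY_RATE * TOTAL_MARKS

if __name__ == "__main__":
    for days in (1, 2, 3):
        print(days, "day(s) late:", late_penalty(days), "marks deducted")
```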
Please read Section 3.4, Plagiarism and Referencing, of the Subject Outline. Part of the statement is reproduced below:
“Students plagiarising run the risk of severe penalties ranging from a reduction through to 0 marks for a first offence for a single assessment task, to exclusion from KOI in the most serious repeat cases. Exclusion has serious visa implications.”
“Authorship is also an issue under Plagiarism – KOI expects students to submit their own original work in both assessment and exams, or the original work of their group in the case of a group project. All students agree to a statement of authorship when submitting assessments online via Moodle, stating that the work submitted is their own original work.
The following are examples of academic misconduct and can attract severe penalties:
• Handing in work created by someone else (without acknowledgement), whether copied from another student, written by someone else, or from any published or electronic source, is fraud, and falls under the general Plagiarism guidelines.
• Students who willingly allow another student to copy their work in any assessment may be considered to be assisting in copying/cheating, and similar penalties may be applied.”