Describing and presenting statistical information
Weighting: Worth 10% of final mark
Due date for submission: 30th August 2019
Place of submission: MyLO assignment dropbox
This assignment links to Learning Outcome 2 (demonstrate sound practices in sampling, data description and presentation in a business environment) and to Learning Outcome 5 (quantify an association between two variables, explain variation and improve prediction).
PROJECT: YOUR ROLE
Assessment of your work will be based on two reports that analyse two different scenarios.
Important: In each part there are four different data sets labelled Blue, Green, Red and Yellow. To find which version you have been allocated, go to MyLO and click on “Groups”. You must analyse the data set for the group you have been assigned to. If you attempt a version other than your allocated one, you will receive no marks for this assignment.
Readership of your reports. Pretend that you are a consultant and that you are preparing the reports in a professional context. Your boss hates typos and you want to keep your job, so make sure everything is well formatted and that the spelling and grammar have been checked.
PART ONE (EXPLORING DATA WITH PIVOT TABLES)
Tools required. You should use the PivotTable tools that you learned about in computer lab 2.
Scenario. UniInc has been running a course in Data Visualisation. They had 200 enrolments of students from different degrees and at different stages in their degrees. Each student was assigned a tutor. The manager of the unit is interested in comparing the performance of the two tutors hired for the course. Data has been collected on the following attributes of students in the course (see the file TutorPerformance_[group].xls).
STUDENT An identification number for the student
TUTOR The tutor that the student had (A or B)
DEGREE The overall degree the student is doing (BBus or BSc)
YEAR Whether the student is in 1st or 2nd year
MARK The student’s mark in the course
The manager of the course is concerned that the students who have tutor A are not doing as well as the students who have tutor B. You are not convinced that a difference in tutoring provides a complete explanation and want to investigate the data further.
Questions to explore (not all need to be reported).
1. What is the difference in average mark for tutor A and tutor B?
2. Does average mark differ by degree? Does it differ by year?
3. How many students in each year are allocated to tutor A and tutor B? Similarly, how many students in each degree are allocated to tutor A and tutor B?
4. Reflecting on the questions above, create a pivot table that compares the average mark across two categorical variables. Taking this into account, is there evidence that tutor A’s performance is worse than tutor B’s?
5. What are the limitations of this sort of data for trying to establish cause and effect?
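As a rough illustration of questions 3 and 4, the sketch below builds a two-way table of counts and averages in plain Python rather than with Excel's PivotTable tools. The records are invented for illustration only and are not taken from any TutorPerformance file; the function names `two_way_pivot` and `one_way_average` are hypothetical. The invented marks are deliberately arranged so that the overall averages and the within-year averages point in opposite directions, which is the kind of pattern a two-way table can reveal. It is not a claim about what your allocated data set shows.

```python
from collections import defaultdict

# Invented (TUTOR, YEAR, MARK) records for illustration; NOT the assignment data.
records = [
    ("A", 1, 55), ("A", 1, 60), ("A", 1, 50), ("A", 2, 75),
    ("B", 1, 52), ("B", 2, 70), ("B", 2, 74), ("B", 2, 72),
]


def one_way_average(rows):
    """Average mark per tutor, ignoring year (a one-way summary)."""
    groups = defaultdict(list)
    for tutor, _year, mark in rows:
        groups[tutor].append(mark)
    return {tutor: sum(marks) / len(marks) for tutor, marks in groups.items()}


def two_way_pivot(rows):
    """Return {(tutor, year): (count, average mark)} for each combination."""
    groups = defaultdict(list)
    for tutor, year, mark in rows:
        groups[(tutor, year)].append(mark)
    return {key: (len(marks), sum(marks) / len(marks))
            for key, marks in groups.items()}


if __name__ == "__main__":
    print("Average mark by tutor:", one_way_average(records))
    for (tutor, year), (count, average) in sorted(two_way_pivot(records).items()):
        print(f"Tutor {tutor}, year {year}: n = {count}, average = {average:.1f}")
```

With these invented records, tutor B has the higher overall average, yet tutor A has the higher average within each year, because the two tutors were allocated very different mixes of 1st and 2nd year students. Reading the counts alongside the averages is what exposes this.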
Prepare a report for the manager of the course. Your report should be structured to have
• a title,
• an aim,
• your main findings,
• supporting evidence for your findings, in particular include at least:
  o a two-way pivot table of averages
  o a two-way pivot table of counts
• a discussion section that explains:
  o if you think one tutor is performing better than another,
  o why you think that,
  o any caveats or concerns that remain.
The report for Part One should be no more than 2 pages.
PART TWO (LINEAR REGRESSION)
Tools required. You have a choice of either completing the calculations for the regression analysis manually or using built-in Excel functions (as covered in the lectures and in computer lab 3).
Scenario. The data set Accidents_[group].xls contains data on the number of traffic accidents over the last 30 years in the small island country of Beafoo. Last year, the Minister in Charge of Road Safety introduced harsher fines for anyone travelling even slightly over the speed limit. She wants to know whether this policy has had an effect on the number of accidents.
Your task is to develop a linear model that can predict accidents based on year and to use it to decide if the number of accidents this year is meaningfully lower than in previous years. Base your linear regression model on the data from years 1 to 30 (i.e. do not include year 31).
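For anyone choosing the manual route, the following is a minimal sketch of a least-squares fit written in plain Python instead of Excel. It assumes the usual formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The five data points are invented for illustration and are not from any Accidents file, and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted value of y at a given x."""
    return intercept + slope * x


# Invented example: five "years" of accident counts, NOT the assignment data.
years = [1, 2, 3, 4, 5]
accidents = [12, 15, 14, 18, 21]

if __name__ == "__main__":
    b0, b1 = fit_line(years, accidents)
    print(f"accidents = {b0:.2f} + {b1:.2f} * year")
    # The model is fitted on years 1 to 5 only, then used to predict the next year.
    print(f"predicted for year 6: {predict(b0, b1, 6):.2f}")
```

The same pattern applies to the real task: fit on years 1 to 30 only, then compare the model's prediction for year 31 with the observed year 31 value.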
Prepare a report for the Minister in Charge of Road Safety. Your report should be structured to have
• a title,
• an aim,
• a main finding that gives the linear regression model you suggest using for prediction of accidents based on year
• a supporting graph (e.g. a scatterplot) that displays the data and model.
• a discussion of how accurate you expect the model to be at making predictions (this should include common summary statistics such as the standard error, the r² value (coefficient of determination) and the correlation coefficient). Within this context, do you think the new fines policy has made a difference?
• Are there any points of concern that should be taken into account when using the model?
The report for Part Two should be no more than 2 pages.
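The summary statistics requested for Part Two can be sketched in plain Python as below. This assumes the standard error means the standard error of the estimate, sqrt(SSE / (n - 2)); check this against the definition used in your lectures. The data are invented and the helper names `regression_summary` and `residual_in_se_units` are hypothetical. Expressing the new observation's residual in standard-error units is only one informal way to judge "meaningfully lower", not a formal test.

```python
import math


def regression_summary(xs, ys):
    """Fit y = intercept + slope * x and return common summary statistics."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)  # total sum of squares
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Sum of squared residuals about the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return {
        "intercept": intercept,
        "slope": slope,
        "r": sxy / math.sqrt(sxx * syy),   # correlation coefficient
        "r2": 1 - sse / syy,               # coefficient of determination
        "se": math.sqrt(sse / (n - 2)),    # standard error of the estimate
    }


def residual_in_se_units(summary, x_new, y_new):
    """How far a new observation sits from the line, in standard errors."""
    predicted = summary["intercept"] + summary["slope"] * x_new
    return (y_new - predicted) / summary["se"]


# Invented example data, NOT the assignment data.
years = [1, 2, 3, 4, 5]
accidents = [12, 15, 14, 18, 21]
summary = regression_summary(years, accidents)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
    # Suppose the (invented) observation for the following year is 19 accidents.
    print(f"residual / se: {residual_in_se_units(summary, 6, 19):.2f}")
```

A large negative value of the residual relative to the standard error suggests the new year sits below what the trend predicts, but a single year cannot by itself show that the policy caused the change.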
BEA140 Quantitative Methods Assignment 2
BEA140: Quantitative Methods 2019 Semester 2
Assignment 2: Describing and presenting statistical information weighting 10%
In your submission, you:

Communicate your findings
Good:
• Structure your reports in an appropriate format that includes title, aims, findings, supporting evidence and discussion.
• Ensure that your reports have been thoroughly spell-checked and are grammatically correct.
• Clearly label your tables and figures, include captions that are descriptive enough to stand alone, ensure that figure axes and number formats are appropriate.
• Include a well-rounded discussion of what your results mean.
Satisfactory:
• Structure your reports in a mostly appropriate format that includes most of the following elements: title, aims, findings, supporting evidence and discussion.
• Reports contain some spelling or grammatical errors.
• Clearly label any tables and figures. Include captions.
• Include some interpretation of what your results mean.
Poor:
• Reports are poorly structured and difficult to follow.
• Reports contain many spelling or grammatical errors.
• Tables and figures do not have captions and lack informative axes labels.
• Interpretation of results is missing.

Present data summaries that support your findings
Good:
• Correctly produce PivotTables and use them to address the scenario in Part 1.
• Correctly produce a scatterplot and use it to support your answer for the scenario in Part 2.
Poor:
• Fail to create a PivotTable or use it inappropriately.
• Fail to create a scatterplot or use the incorrect data to create it.

Perform a regression analysis and associated calculations
Good:
• Correctly perform the required regression analysis and report the least-squares line of best fit.
• Correctly calculate and report the standard error, r² and the correlation coefficient.
• Give supporting detail of your calculations.
Satisfactory:
• As for good, however, calculations may have some minor errors but supporting detail reveals that you are on the right track.
Poor:
• Calculations contain serious errors.
• Some relevant calculations are missing.
• Insufficient detail given of how calculations were carried out.