Mini-Project 1 - STAT490
The ability to predict the selling price of a house, and to identify which features of a house affect that price, is important for property valuations, real-estate sales and municipal levies. To investigate this, you will use a dataset of 300 houses, each described by a number of features. You are to use a regression model to predict the sale prices of the houses and to determine which features matter most when predicting the price. You are also expected to present a full report in support of your findings, including an introduction, a literature review, a methodology, all analyses and discussions, and a conclusion. The guidelines for what is expected are provided in the sections that follow.
You will generate your own dataset, which is linked to your student number. The data will be sampled from the Housing dataset in the Ecdat package. Investigate this dataset using the R help functions; you will need to understand what each of its variables represents. To obtain the dataset that you will use for analysis, run the commands below in R. The resulting dataset will be called mydata.
# Install and load the "Ecdat" package
install.packages("Ecdat")
library(Ecdat)

# Load the data set into your environment
data(Housing)

# NB!! Replace this with your student number
set.seed(201234567)

# Select 300 random observations from the dataset
mydata <- Housing[sample(x = 1:546, size = 300), ]
Please ensure that you replace 201234567 with your student number in the line set.seed(201234567). Also, ensure that you are using the latest version of R, which is version 4.0.2 (“Taking Off Again”).
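The role of the seed can be checked with a short sketch in base R. It assumes nothing beyond the commands above: resetting the seed before each call to sample() should reproduce the same 300 row indices, which is why each student number yields its own fixed sample. The names idx1 and idx2 are illustrative only. Note that R 3.6.0 changed the default sampling algorithm, so older versions of R may select different rows for the same seed.

```r
# Draw the row indices twice, resetting the seed each time.
# 201234567 is the placeholder student number from the brief.
set.seed(201234567)
idx1 <- sample(x = 1:546, size = 300)

set.seed(201234567)
idx2 <- sample(x = 1:546, size = 300)

# Same seed, same call: the two draws should match exactly.
print(identical(idx1, idx2))

# sample() draws without replacement by default, so no row is repeated.
print(length(unique(idx1)))
```

If the first value printed is FALSE, the seed was not reset immediately before the call to sample().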
Structure and Guidelines
Your report is limited to a maximum of 8 pages, excluding references. Aim for approximately 2000-2500 words. You may use any font, but the text must be size 12 with 1.5 line spacing and justified alignment. You may use any word processor that you choose.
The structure of your report should follow the structure of a research article and include the following:
• An abstract (summarising the content and findings in roughly 120 words).
• An introduction (this will outline the problem statement and the purpose of the study).
• A brief review of the related literature. You should cite approximately 3 articles that report similar analyses and indicate the findings of each.
• The methodology you have used for the analysis. This should include the modelling and estimation procedure with a full motivation.
• A full set of analyses (including exploratory data analysis, EDA) and results, using software of your choice. Please note that the data must be generated in R, but the analysis may be performed elsewhere. This section should include:
– no more than 3 graphical plots and 3 tabulated descriptive summaries, with all inferential results presented in tabulated form, and
– a discussion and interpretation of the results as they pertain to the stated objective.
• A conclusion in which you identify any limitations of, or concerns about, the analysis and the implications these have for your results. The conclusion should also make it clear how your findings may be used or implemented by a practitioner in the field.
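As a hedged illustration of the modelling step, the sketch below fits an ordinary least squares regression with lm() and prints the coefficient table. It is one possible starting point, not the required method. The helper fit_price_model and the data frame toy are hypothetical; toy is simulated so that the block runs on its own, and it only borrows the column names price, lotsize and bedrooms from the Housing documentation.

```r
# Hypothetical helper: regress price on every other column of df.
fit_price_model <- function(df) {
  lm(price ~ ., data = df)
}

# Simulated stand-in data so the sketch is self-contained.
# These values are NOT taken from the Housing data set.
set.seed(1)
toy <- data.frame(
  lotsize  = runif(300, min = 1500, max = 16000),
  bedrooms = sample(1:6, size = 300, replace = TRUE)
)
toy$price <- 20000 + 5 * toy$lotsize + 4000 * toy$bedrooms +
  rnorm(300, mean = 0, sd = 10000)

fit <- fit_price_model(toy)

# Coefficient table: estimates, standard errors, t values and p-values.
coef_table <- summary(fit)$coefficients
print(coef_table)
```

In the project itself the same helper could be called as fit_price_model(mydata). The choice of model, any transformations and the diagnostic checks still need to be motivated in your methodology.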
While completing this report you should adhere to the following:
• Label graphs and tables clearly and carefully – do not assume readers are familiar with the study.
• Descriptive statistics are useful to familiarise yourself with the results – however it is usually only necessary to report a selection of the descriptive statistics rather than all of them. Select those you want to include wisely.
• Be neat: the layout of a project makes an impression on anyone who reads the final report and will count towards the mark.
• Include a reference list and any acknowledgements at the end of the write-up.
• Proofread your project write-up SEVERAL times before submitting it for assessment. It can be helpful to ask a friend or colleague to critique the write-up. Ask them to be very critical rather than friendly, as this will improve the quality of your report.
Instructions and Marking Rubric
You are allowed one review submission for lecturer commentary and feedback, so use it wisely. Once this review is used, it will not be available for any subsequent submissions. Allow 7 days for your lecturer to review the submission and provide feedback. Adhere to the deadlines below.
Activity                               Date (by 15h00)      Person responsible
Receive project information and data   31 August 2020       Lecturer
Submit draft for review                14 September 2020    Student
Receive feedback from review           21 September 2020    Lecturer
Submit final report                    30 September 2020    Student
Late submission deadline               9 October 2020       Student