BUS 105, Semester 2, 2017

Instructions for the computing assignment, worth 20% of your final grade and due in week 10

*Last semester many students received 0 out of 20 because they had a friend do the assignment using advanced methods from other courses and could not reproduce the results themselves. Be aware that if you use any advanced methods, you will HAVE to show your tutor in the tutorial that you can reproduce the results.

*There are 6 sections. You must put the answers to all sections into a single document and submit it to Turnitin. Your document must be a Microsoft Word document or a PDF. If you use an Apple computer, DO NOT SUBMIT A “.PAGES” document; save it as a PDF before submitting.

*For sections 1-4 you must use your allocated sample number, available from

https://app.box.com/s/qb5gwy7z0k2acvt6so9uuh0q9whv07i8

*Note that there are 5 preparation quizzes on Moodle that are also due in week 10. These will help prepare you for the assignment, so you should finish them before finishing the assignment.

*Unlike your other assignments, you do not need references. Just use the examples given in lectures to do sections 1 to 4. For section 5 you should make up the example yourself without using a reference, and in section 6 just use the references provided. If you want to use other sources or advanced software, email the lecturer at matthew.maccallum@koi.edu.au

The exact instructions for each of the 6 sections are given below

Section 0 (Cover page)

“Title: BUS 105 computing assignment, Semester 2, 2017”

“Name:”

“Student number:”

“Allocated sample:”

Section 1

Use the dataset given below. You must use the sample allocated to you based on your student number.

https://app.box.com/s/56pb6hqu0ypcg0f3lhy6cl5szt1jgdla

Note that for section 1 the answers are provided so you can check your work. The answers will not be provided for the other sections.

A) Paste the scatterplot for your sample into your Word document and give a simple comment about the relationship between the variables. (You do not need to submit the Excel file.)

B) Estimate the annual contribution if the income is $200,000, using the regression line from part (A).

C) Find the z-score of the estimate in part (B). Note that the average of the estimates is $27,000 with standard deviation $2,100. Remember to show your work.

D) Using the z-score from part (C), find P(Z < zscore). You can find the answer using www.wolframalpha.com

For example, if the z-score is 1.5, type in

P(Z < 1.5)

into wolframalpha.com

E) If there was a list of 10,000 estimates ranked from lowest to highest, what rank do you think your estimate would be close to?

Hint: just use the formula

expected rank = P(Z < zscore)*10000. Remember to show your work.
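
As a check on the arithmetic in parts (C) to (E), the steps can be sketched in Python. This is only an illustration under stated assumptions: the estimate of $30,150 is a hypothetical value chosen so that the z-score comes out at the 1.5 used in the example above, and the function name z_score is made up for this sketch. Your own estimate comes from your regression line, and you still need to show your work.

```python
from statistics import NormalDist


def z_score(estimate, mean, sd):
    """How many standard deviations the estimate lies from the mean."""
    return (estimate - mean) / sd


# Values given in part (C)
mean_of_estimates = 27000
sd_of_estimates = 2100

# Hypothetical estimate (NOT from any real sample); replace with your own
my_estimate = 30150

z = z_score(my_estimate, mean_of_estimates, sd_of_estimates)
prob_below = NormalDist().cdf(z)     # P(Z < z) for a standard normal Z
expected_rank = prob_below * 10000   # rank in a list sorted lowest to highest

print(f"z-score       = {z:.2f}")
print(f"P(Z < z)      = {prob_below:.4f}")
print(f"expected rank = {expected_rank:.0f}")
```

With these hypothetical numbers the z-score is 1.5, the probability is about 0.933, and the expected rank is close to 9,332 out of 10,000.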

Section 2

Use the dataset given below. You must use the sample allocated to you based on your student number.

https://app.box.com/s/yvhk3e3oymbs3toy6j5xetid82dsjyz4

A) Use the PivotTable feature in Excel to find appropriate summary statistics for your sample. This will probably require two PivotTables. Paste both into Word; you do not need to submit the Excel file.

Make sure the PivotTable (or PivotTables) include the following statistics

*Just considering the high risk (riskier type) investments, what is the sample size n1 and what is the proportion of high risk investments that made a loss, p̂1?

*Just considering the low risk (safer type) investments, what is the sample size n2 and what is the proportion of low risk investments that made a loss, p̂2?

B) Use Excel to make an appropriate graph that lets you compare the proportions found in part (A), and paste it into your Word document.

C) Looking at your answers to parts (A) and (B), make a simple comment about the relationship between the variables

investment type (risky or safe) and

made a profit (made a profit/made a loss)

D) i) Using your sample, what is the estimate for p1 - p2? In other words, what is the difference between the sample proportions, p̂1 - p̂2?

ii) Find the z-score of the estimate in part (i). Note that the average of the estimates is 0.1 with standard deviation 0.0743.

iii) Using part (ii), find P(Z < zscore) using www.wolframalpha.com

For example, if the z-score is 0.5, type in

P(Z < 0.5)

into wolframalpha.com

iv) If there was a list of 4000 estimates ranked from lowest to highest, roughly what rank do you expect your estimate to have?

Hint: just use the formula

expected rank = P(Z < zscore)*4000

E) Test the claim that there is a difference in the proportions. Use a 5% level of significance.

i) State an appropriate H0 and H1

ii) Find the p-value using only the answers to part (A) and the webpage

http://epitools.ausvet.com.au/content.php?page=z-test-2

Do NOT use any other method to find the p-value

Do NOT use any other software package such as SPSS or the Analysis ToolPak

iii) State whether or not you reject H0

iv) Give a conclusion in plain English
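
The steps in part (D) above, going from PivotTable counts to an expected rank, can be sketched as follows. The counts are hypothetical (they are not taken from any allocated sample) and the variable names are made up for this sketch. It covers part (D) only; the p-value in part (E) must still come from the webpage named above.

```python
from statistics import NormalDist

# Hypothetical PivotTable counts (NOT real data); replace with your own
n1, high_risk_losses = 50, 20   # high risk investments, and how many made a loss
n2, low_risk_losses = 60, 15    # low risk investments, and how many made a loss

p_hat_1 = high_risk_losses / n1   # sample proportion of losses, high risk
p_hat_2 = low_risk_losses / n2    # sample proportion of losses, low risk
difference = p_hat_1 - p_hat_2    # estimate of p1 - p2

# Mean and standard deviation of the estimates, as given in part (D)(ii)
z = (difference - 0.1) / 0.0743
prob_below = NormalDist().cdf(z)  # P(Z < z)
expected_rank = prob_below * 4000

print(f"p-hat 1 = {p_hat_1:.3f}, p-hat 2 = {p_hat_2:.3f}")
print(f"difference = {difference:.3f}, z-score = {z:.3f}")
print(f"expected rank = {expected_rank:.0f}")
```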

Section 3

Use the dataset given below. You must use your own sample.

https://app.box.com/s/z0mbtcfsdqxz1rm7rhw3p9sb75aq7174

A) Use the pivot table feature in Excel to find appropriate summary statistics for your sample. The following sample statistics must be found

Just considering the low risk investments, what is the sample size n1, the sample average return of low risk investments x̄1, and the sample standard deviation s1?

Just considering the high risk investments, what is the sample size n2, the sample average return of high risk investments x̄2, and the sample standard deviation s2?

Paste the pivot table into the Word document. You do not need to submit the Excel file.

B) Give an appropriate graph that shows the relationship between the variables. Note that the information in part (A) is NOT suitable for a graph; you have to get different information.

C) Make a simple comment about the relationship between the variables using the answers to (A) and (B).

D)

i) Using your sample, what is the estimate for µ1 - µ2? In other words, what is the difference between the sample means, x̄1 - x̄2?

ii) Find the z-score of the estimate in part (i). Note that the average of the estimates is -0.0256 with standard deviation 0.0173.

iii) Using part (ii), what is P(Z < zscore)? You can find the answer using www.wolframalpha.com

For example, if the z-score is -1, type in

P(Z < -1)

into wolframalpha.com

iv) If there was a list of 2000 estimates ranked from lowest to highest, what rank do you think your estimate would be close to? Hint: just use the formula

expected rank = P(Z < zscore)*2000

E) Test the claim that there is a difference between the means, using a 5% level of significance.

i) State an appropriate H0 and H1

ii) Find the p-value using the answers to part (A) and the webpage

https://www.medcalc.org/calc/comparison_of_means.php

Do NOT find the p-value using any other method.

Do NOT use any other software package such as SPSS or the Analysis ToolPak

iii) State whether or not you reject H0

iv) Give a conclusion in plain English
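
To see what the summary statistics in part (A) and the estimate in part (D) mean, here is a small sketch. The six returns are hypothetical (they are not from the assignment dataset) and the function name summarize is made up. It assumes the sample standard deviation with the n - 1 divisor, which is what Python's statistics.stdev computes. The p-value in part (E) must still come from the webpage named above.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical returns (NOT real data); use your allocated sample instead
low_risk_returns = [0.02, 0.03, 0.04]
high_risk_returns = [0.01, 0.05, 0.09]


def summarize(returns):
    """Return the sample size, sample average and sample standard deviation."""
    return len(returns), mean(returns), stdev(returns)


n1, xbar1, s1 = summarize(low_risk_returns)
n2, xbar2, s2 = summarize(high_risk_returns)

difference = xbar1 - xbar2                 # estimate of mu1 - mu2

# Mean and standard deviation of the estimates, as given in part (D)(ii)
z = (difference - (-0.0256)) / 0.0173
prob_below = NormalDist().cdf(z)           # P(Z < z)
expected_rank = prob_below * 2000

print(f"low risk : n1 = {n1}, average = {xbar1:.4f}, s1 = {s1:.4f}")
print(f"high risk: n2 = {n2}, average = {xbar2:.4f}, s2 = {s2:.4f}")
print(f"difference = {difference:.4f}, z-score = {z:.3f}")
print(f"expected rank = {expected_rank:.0f}")
```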

Section 4

Use the dataset given below. You must use your own sample.

https://app.box.com/s/kzc6ivy10gvy4vz6d0pgy0lzh929ivx9

Suppose a business has conducted an opinion poll to find out if its customers support a change to the business.

Use the PivotTable feature in Excel to find appropriate summary statistics for your sample. Paste the PivotTable into Word; you do not need to submit the Excel file.

This PivotTable must show the number of people who answered yes and the number of people who answered no.

What is the sample size and the sample proportion p̂ of people that support the change? Note that p̂ is the estimate for the population proportion p.

i) Find the z-score of the estimate in part (a). Note that the average of the estimates is 0.6 with standard deviation 0.0357.

ii) Using part (i), what is P(Z < zscore)? You can find the answer using www.wolframalpha.com

For example, if the z-score is 2, enter

P(Z < 2)

into www.wolframalpha.com

iv) If there was a list of 1000 estimates ranked from lowest to highest, what rank do you think your estimate would be close to? Hint: just use the formula

expected rank = P(Z < zscore)*1000

Find a 95% confidence interval for the proportion of people that support the change.
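
One common way to build this interval is p̂ ± 1.96 × sqrt(p̂(1 - p̂)/n). The sketch below assumes that formula; check it against the method shown in lectures, which may differ. The yes/no counts are hypothetical and are not from any allocated sample.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical PivotTable counts (NOT real data); replace with your own
yes_count = 126
no_count = 74

n = yes_count + no_count
p_hat = yes_count / n                          # sample proportion supporting the change

z_star = NormalDist().inv_cdf(0.975)           # about 1.96 for 95% confidence
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z_star * standard_error

lower, upper = p_hat - margin, p_hat + margin
print(f"n = {n}, p-hat = {p_hat:.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

With these hypothetical counts the interval runs from roughly 0.563 to 0.697.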

Section 5

a) You have to obtain your own dataset.

Your dataset must have the following properties.

It must have at least 5 rows (observations)

It must have at least 2 variables (note that the name of each thing in the dataset is NOT a variable)

At least one of the variables must be categorical

There are 3 options for getting the dataset

Option 1

*Make up your own dataset. This can be about anything you find interesting, so it could be about businesses, customers, students, athletes, cats, monkeys, ANYTHING AT ALL.

If you make up your own dataset, there is no way it will be the same as another student's.

Option 2

*Find an existing dataset and email it to the lecturer at matthew.maccallum@koi.edu.au

The lecturer will email you a sample of the dataset. Use that sample; this will make sure there is no way your sample is the same as other students'.

Option 3

*Find an existing dataset, make up an extra variable, and email the dataset to the lecturer at matthew.maccallum@koi.edu.au

The lecturer will email you a sample of the dataset. Use that sample; this will make sure there is no way your sample is the same as other students'.

b) Pick two of the variables (make sure one of them is categorical) and summarize them with a suitable pivot table.

c) Paste the dataset and your summary into the Word file. You do not need to submit the Excel file.

Add a very brief comment.
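
To illustrate what such a dataset and summary look like, here is a sketch with a made-up dataset of 5 cats. Everything in it is hypothetical and you must still make up or obtain your own data. The cat names only label the rows and are not a variable; "coat" is the categorical variable and "weight_kg" is the numerical one. The summary mimics what a pivot table would give: a count and an average for each category.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical dataset (made up for illustration): 5 rows, 2 variables
dataset = [
    {"name": "Milo",  "coat": "short", "weight_kg": 4.0},
    {"name": "Luna",  "coat": "long",  "weight_kg": 5.0},
    {"name": "Tiger", "coat": "short", "weight_kg": 5.0},
    {"name": "Coco",  "coat": "long",  "weight_kg": 6.0},
    {"name": "Bella", "coat": "short", "weight_kg": 3.0},
]

# Group the numerical variable by the categorical variable
weights_by_coat = defaultdict(list)
for row in dataset:
    weights_by_coat[row["coat"]].append(row["weight_kg"])

# One summary row per category, like the rows of a pivot table
summary = {
    coat: {"count": len(weights), "average_weight_kg": mean(weights)}
    for coat, weights in weights_by_coat.items()
}

for coat, stats in sorted(summary.items()):
    print(f"{coat:5s}  count = {stats['count']}  average = {stats['average_weight_kg']:.2f} kg")
```

A very brief comment on this made-up summary could be that the long-haired cats are heavier on average (5.5 kg) than the short-haired cats (4.0 kg).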

Section 6

Give a brief discussion (300 words in total) of one, two, or all 3 of the resources below. Just make sure the total number of words in section 6 is 300 or less. It is strongly suggested that you discuss the examples given in the resources.

1) Guide to summarizing datasets

https://app.box.com/s/jxuqhpzjrfj14xiq28x1bnywjv1iayr4

2) A student's assignment from 2015

https://app.box.com/s/2a72e7i9lduyy3wp8nyd0uogsyvvnzrz

3) Discussion of how mean and standard deviation are used in finance

https://www.youtube.com/watch?v=UwO4JvB9OpE

General tips

The assignment (and all of BUS 105) is about datasets.

If you are not sure what a dataset is, you can look at the pivot table example taken from Wikipedia. (Note that you have to make up your own example.)

https://app.box.com/s/ab7sbnjocfdspgs40xtzqsvoj5tvyoz3

The Wikipedia entry is

https://en.wikipedia.org/wiki/Pivot_table

If you are unfamiliar with datasets, you should try section 6 of the assignment first.

