PHE5HDD: Health Data for Decision Making: Epidemiology and Biostatistics

Assignment 2: Biostatistics

Instructions: This assignment consists of 10 multiple-choice questions worth 2 marks each and 4 multi-part questions with marks as indicated. Submit this assignment via the PHE5HDD LMS site.

Question 1: Which one of the following is an example of nominal data?

a. Weight of patients

b. Blood group: A+, B+, O- etc.

c. Number of times somebody presented at an emergency department

d. Perceived health condition as good, mediocre and bad

e. None of the above

Question 2: A patient is recorded as having 3rd degree burns. What type of data is this?

a. Continuous

b. Discrete

c. Ordinal

d. Nominal

e. None of the above

Question 3: Which of the following should be used to measure the spread of a negatively skewed distribution?

a. Interquartile range

b. Mean

c. Median

d. Mode

Question 4: The weights of individuals in a population are normally distributed with a mean of 45 kg and a standard deviation of 5 kg. What percentage of this population has a weight within the range 35-55 kg?

a. 68%

b. 99.7%

c. 95%

d. 24.5%
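
For a question of this kind, the share of a normal population lying between two values is the difference of two cumulative probabilities. A minimal sketch using Python's standard library, with the mean and standard deviation taken from Question 4 (the variable names are illustrative only):

```python
from statistics import NormalDist

# Parameters stated in Question 4: mean 45 kg, standard deviation 5 kg
weights = NormalDist(mu=45, sigma=5)

# P(35 <= X <= 55) = F(55) - F(35), where F is the cumulative distribution
share_within = weights.cdf(55) - weights.cdf(35)

print(f"Share of the population between 35 and 55 kg: {share_within:.1%}")
```

The same two-line pattern works for any interval; only the bounds passed to `cdf` change.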

Question 5: Which of the following is NOT a valid proportion?

a. 12.5%

b. 6/80

c. 1.09

d. 10:65

e. None of the above

Question 6: Which of the following statements regarding sex differences in exposure to a particular chemical can be said to be TRUE based on the box plots below?

a. The women have a lower median exposure than men

b. The distribution of exposure is normally distributed in men and women

c. The men in this sample have a larger range of exposure values than the women

d. There is a significant difference in exposure between men and women

e. None of the above

Question 7: The respiratory health of a sample of 25 men exposed to fumes in a factory was assessed by measuring the forced expiratory volume (FEV). The sample mean was 3.20 litres. From previous work it is known that the standard deviation of FEV is 0.5 litres. The 95% confidence interval for the mean is:

a. 2.94 to 3.46 litres

b. 2.25 to 4.15 litres

c. 3.00 to 3.40 litres

d. 3.032 to 3.368 litres

e. None of the above
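
When the population standard deviation is treated as known, as in Question 7, an interval for the mean is commonly built from the standard error and a normal critical value. The sketch below assumes that z-based approach; the function name `z_interval` is hypothetical and not part of the course materials:

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, level=0.95):
    """Confidence interval for a mean when the population SD is known."""
    # Two-sided critical value, about 1.96 for a 95% level
    z = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = sigma / sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Values stated in Question 7
lower, upper = z_interval(sample_mean=3.20, sigma=0.5, n=25)
print(f"{lower:.3f} to {upper:.3f} litres")
```

If the standard deviation were instead estimated from the sample, a t-based critical value would usually replace `z`.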

Question 8: Consider a random sample of 200 females and 200 males. Suppose 20 of the females are diabetic and 15 of the males are diabetic. What is the estimated difference between sample proportions of females and males who are diabetic?

a. 2.5

b. 0.25

c. 0.025

d. 0.0025

e. None of the above

Question 9: The mean resting heart rate of a random sample of 17 women is 82 beats per minute, and the sample standard deviation is 3.5. The sample came from an infinite population. The standard error of the sample mean is:

a. 0.720

b. 1.299

c. 0.848

d. 16.88

e. None of the above

Question 10: A survey regarding awareness of the Ovarian Reserve test was conducted among 600 women in a rural town which has around 13,000 people. Only 10 percent (10% or 0.10) of the women said that they were aware of this test. Which one of the following statements about this 10% awareness is correct?

a. It is a population proportion

b. It is a margin of error

c. It is the standard error of the mean

d. It is a sample proportion

e. None of the above

Question 11:

a. The following table represents data of admission to various wards of a hospital. Draw a pie chart using the following data (4 marks):

Ward A   Ward B   Ward C   Ward D   Ward E   Ward F   Total
6        10       15       16       13       28       88

What percentage of patients were admitted to Wards C and D? (1 mark)

b. Plot a histogram for each of the following two datasets, using sensible bins, and paste each histogram into your submission. (6 marks)

Comment on whether a normal distribution is valid for each dataset. (4 marks)

i) A sample consisting of the following 24 observations:

15 65 41 65 5 85

90 32 37 101 109 45

56 85 29 78 91 90

68 69 65 62 85 25

ii) A sample consisting of the following 36 observations:

45 38 46 34 36 45

23 27 35 29 35 41

48 41 50 37 38 32

35 39 47 46 45 40

40 37 41 34 43 20

29 39 31 49 33 26
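
Before drawing the charts for Question 11, it can help to tabulate the numbers behind them. The sketch below computes each ward's share of admissions (and the matching pie-slice angle) and counts the observations per histogram bin. The bin starts and widths (0 and 20 for sample i, 20 and 5 for sample ii) are assumptions chosen for illustration, not bins prescribed by the assignment, and `bin_counts` is a hypothetical helper:

```python
from collections import Counter

# Part (a): ward admissions from the table
admissions = {"Ward A": 6, "Ward B": 10, "Ward C": 15,
              "Ward D": 16, "Ward E": 13, "Ward F": 28}
total_admissions = sum(admissions.values())
shares = {ward: count / total_admissions for ward, count in admissions.items()}
for ward, share in shares.items():
    # A pie slice's angle is its share of the full 360 degrees
    print(f"{ward}: {share:.1%} of admissions, {share * 360:.1f} degrees")

def bin_counts(values, start, width):
    """Count values in consecutive bins [start, start + width), and so on.

    Assumes every value is at least `start`.
    """
    tally = Counter((v - start) // width for v in values)
    last_bin = max(tally)
    return [tally.get(i, 0) for i in range(last_bin + 1)]

# Part (b): the two samples as listed in the question
sample_i = [15, 65, 41, 65, 5, 85, 90, 32, 37, 101, 109, 45,
            56, 85, 29, 78, 91, 90, 68, 69, 65, 62, 85, 25]
sample_ii = [45, 38, 46, 34, 36, 45, 23, 27, 35, 29, 35, 41,
             48, 41, 50, 37, 38, 32, 35, 39, 47, 46, 45, 40,
             40, 37, 41, 34, 43, 20, 29, 39, 31, 49, 33, 26]

# Assumed bins: width 20 from 0 for sample i, width 5 from 20 for sample ii
counts_i = bin_counts(sample_i, start=0, width=20)
counts_ii = bin_counts(sample_ii, start=20, width=5)

for label, counts, start, width in (("Sample i", counts_i, 0, 20),
                                    ("Sample ii", counts_ii, 20, 5)):
    print(label)
    for i, count in enumerate(counts):
        low = start + i * width
        print(f"{low:>3}-{low + width - 1:<3} {'#' * count}")
```

The text bars give a quick preview of each distribution's shape; the submitted charts would still be produced in whatever plotting tool the course expects.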

Question 12: Using appropriate statistical language, describe the distributions of the following histograms. (15 marks)

a)

b)

c)

Question 13: The following dataset of a sample of 30 patients was collected by an investigator.

ID   Sex   Weight (kg)   RN Score (%)
1    M     60            28
2    F     51            10
3    M     95            31
4    F     49            15
5    F     53            16
6    F     62            22
7    M     92            34
8    F     58            21
9    F     55            20
10   F     62            24
11   M     53            20
12   F     51            11
13   F     55            17
14   M     68            26
15   M     105           33
16   F     88            28
17   M     65            25
18   F     72            27
19   F     55            21
20   M     64            24
21   M     71            26
22   F     54            16
23   F     54            17
24   F     92            30
25   M     72            27
26   M     92            30
27   M     92            29
28   M     95            31
29   M     65            25
30   F     61            21

Note: M represents male and F represents female

a. Create a plot that allows you to comment on a trend between weight and RN score. Paste your plot into your submission. (5 marks)

b. Comment on the direction and strength of the trend between the variables in both statistical and plain language. (5 marks)

c. Create boxplots to show the differences in the distributions of weight and RN score by sex. Paste your boxplots into your submission. (5 marks)

d. Describe the differences and similarities in the distributions of weight and RN score by sex. (5 marks)

e. Would sex be considered to confound the association between weight and RN score? Explain your reasoning. (5 marks)
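
One common way to quantify the direction and strength of a linear trend, and to see whether it persists within each sex, is Pearson's correlation coefficient. The sketch below is only one possible approach under that assumption; `pearson_r` is a hypothetical helper, and the records are transcribed from the Question 13 table as (sex, weight in kg, RN score in %):

```python
from math import sqrt

patients = [
    ("M", 60, 28), ("F", 51, 10), ("M", 95, 31), ("F", 49, 15), ("F", 53, 16),
    ("F", 62, 22), ("M", 92, 34), ("F", 58, 21), ("F", 55, 20), ("F", 62, 24),
    ("M", 53, 20), ("F", 51, 11), ("F", 55, 17), ("M", 68, 26), ("M", 105, 33),
    ("F", 88, 28), ("M", 65, 25), ("F", 72, 27), ("F", 55, 21), ("M", 64, 24),
    ("M", 71, 26), ("F", 54, 16), ("F", 54, 17), ("F", 92, 30), ("M", 72, 27),
    ("M", 92, 30), ("M", 92, 29), ("M", 95, 31), ("M", 65, 25), ("F", 61, 21),
]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Overall correlation between weight and RN score
r_all = pearson_r([w for _, w, _ in patients], [s for _, _, s in patients])
print(f"All patients (n={len(patients)}): r = {r_all:.2f}")

# Correlation within each sex, as one input to the confounding discussion
r_by_sex = {}
for sex in ("M", "F"):
    group = [(w, s) for g, w, s in patients if g == sex]
    r_by_sex[sex] = pearson_r([w for w, _ in group], [s for _, s in group])
    print(f"Sex {sex} (n={len(group)}): r = {r_by_sex[sex]:.2f}")
```

A coefficient summarises only linear association, so it complements rather than replaces the scatter plot and boxplots the question asks for.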

Question 14: Below are estimates of the daily intake of calcium (in milligrams) for 38 women between the ages of 51 and 80 years who participated in a study of women’s bone health. Use this data to answer the following questions.

1155 882 1062 970 909 802 374 416

651 716 1769 1420 1525 1200 1050 976

671 774 1253 549 1325 446 465 1933

802 684 437 748 1203 2433 1355 948

784 997 570 403 625 696

a. Draw a histogram of the above dataset in Excel. Put frequency on the Y-axis and daily intake of calcium on the X-axis, using the following ordered groups: 251-500, 501-750, 751-1000 and so on. (5 marks)

b. Calculate the mean, median, range, standard deviation, variance and the standard error. (5 marks)

c. Suppose that the recommended daily allowance (RDA) of calcium for women in this age group is 1300 milligrams (this value is changed from time to time based on the statistical analysis of new data). What proportion of these women meets or exceeds this intake? (5 marks)

d. By dichotomising the daily intake of calcium by RDA, you have created a binomial distribution. What are the mean, standard deviation and 95% CI of this binomial distribution? (5 marks)

e. An aged care home has 1250 residents, 897 women and 353 men. How many female residents would you expect to be meeting their RDA of calcium? (5 marks)
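
The calculations in parts (b) to (e) can be cross-checked outside Excel. The sketch below is a rough outline under stated assumptions: it uses the sample (n - 1) formulas for the standard deviation and variance, reads part (d) as asking about the proportion meeting the RDA with a normal-approximation (Wald) interval, and in part (e) assumes the sample proportion carries over to the care home's female residents. Other readings of parts (d) and (e) are possible.

```python
from math import sqrt
from statistics import mean, median, stdev, variance

calcium_mg = [
    1155, 882, 1062, 970, 909, 802, 374, 416,
    651, 716, 1769, 1420, 1525, 1200, 1050, 976,
    671, 774, 1253, 549, 1325, 446, 465, 1933,
    802, 684, 437, 748, 1203, 2433, 1355, 948,
    784, 997, 570, 403, 625, 696,
]
n = len(calcium_mg)

# Part (b): descriptive statistics (sample SD and variance, n - 1 denominator)
avg = mean(calcium_mg)
med = median(calcium_mg)
value_range = max(calcium_mg) - min(calcium_mg)
sd = stdev(calcium_mg)
var = variance(calcium_mg)
se = sd / sqrt(n)
print(f"mean={avg:.1f} median={med} range={value_range} "
      f"sd={sd:.1f} variance={var:.1f} se={se:.1f}")

# Part (c): proportion meeting or exceeding the RDA stated in the question
RDA_MG = 1300
n_meeting = sum(1 for x in calcium_mg if x >= RDA_MG)
p_hat = n_meeting / n
print(f"{n_meeting} of {n} women meet the RDA (p = {p_hat:.3f})")

# Part (d): assumption - treat each woman as a 0/1 outcome with mean p_hat
sd_binary = sqrt(p_hat * (1 - p_hat))
se_p = sd_binary / sqrt(n)
ci_low, ci_high = p_hat - 1.96 * se_p, p_hat + 1.96 * se_p
print(f"sd={sd_binary:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")

# Part (e): assumption - the sample proportion applies to the 897 women
expected_women = 897 * p_hat
print(f"Expected women meeting the RDA: {expected_women:.0f}")
```

With a sample this small, an exact or Wilson interval may be preferred over the Wald interval; the approximation is used here only because it is the simplest to verify by hand.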

