Test Development & Evaluation Report Instructions
FORMAT: Written Report
TIMING: Week 6 (Test due); Week 7 (complete test survey); Week 8 (Report due)
WORD LENGTH: PSYC371: 1,750 word report; PSYC471: 2,000 word report
For this assessment task you are required to conceive of, design, develop, and evaluate a psychological test. You will then write a report detailing this process, including the evaluation results. The basic process will involve four stages:
1. Test conceptualisation & development. At the end of this stage you will have developed a 15-item self-report test, a scoring system, and interpretation guidelines. More information about what is involved in this stage is provided on p.3. A copy of your test (including scoring/interpretation info) must be submitted through the relevant submission link on Moodle by 11:59 pm on Sunday 11th August.
2. Data collection & dissemination. Students’ tests will be combined into a number of online surveys to facilitate the data collection and feedback required for test evaluation. These surveys will each contain a number of student tests for you to complete, along with a peer review feedback section for each, and also a Big 5 personality test (for validation purposes).
Completion will be anonymous and steps will be taken to ensure that no identifying information is collected. You will be sent a link to one of these surveys in Week 6, which will need to be completed by 11:59 pm on Wednesday 21st August. It is anticipated that it will take you approximately 2 hrs to complete this activity. Your data will then be available for you to download from Moodle by the end of Week 7.
3. Test evaluation & revision. During this stage you will use the data and peer review feedback gathered on your test to evaluate and revise it, finishing with a 10-item test. More information about what is involved in this stage is provided on p.4.
4. Writing/finalising your report. From a time management perspective, it will be best if you actually work on this as you go through each of the stages above, drafting each section of the report that is relevant to the stage you are up to. The requirements for this report are explained in detail on p.6. Your report must be submitted through the relevant submission link on Moodle by 11:59 pm on Sunday 15th September.
Please note the following constraints and guidelines:
• Your test must measure a psychological construct that is appropriate for assessment via a self-report test. Typically, students find that aspects of personality, attitudes/beliefs or behaviours, with measurement based on a 5-point Likert scale (i.e., strongly agree to strongly disagree; very much like me to not at all like me; never to always or almost always, etc.), work best for this assignment.
• Your test must be appropriate for a general adult population. This is simply down to the constraints of the assignment – your test will be tried out by other PSYC371/471 students, with the resulting data being used for your test’s evaluation.
• You can develop a test to measure a new construct or develop a new test relating to a pre-existing construct. Whichever way you go, you will need to keep in mind the end uses for your test: Where, when, how, and why will the test be used? Who will use the test? Who will be tested with it? What information will it provide? How will this information be used? How will your test contribute to the research field, clinical practice, and/or society more broadly?
• You must be able to develop meaningful (i.e. evidence-based) hypotheses relating your construct to at least one of the Big 5 personality traits: Extroversion, Neuroticism, Conscientiousness, Agreeableness, and Openness. This is because you will be validating your test against an existing Big 5 personality test – a short version of the Big Five Inventory, the BFI-2-S (Soto & John, 2017; see the assignment section of Moodle for a copy). Your hypotheses can be for either positive or negative relationships with the Big 5 traits. So, for example, if your review of the literature suggests that high scores on the construct measured by your test would be more common for introverts than for extroverts, you would hypothesise a negative association with Extroversion.
• Your test and its components must not be offensive, sexist, racist, or derogatory in any other way towards any people or groups of people. You must also avoid asking test-takers to provide information that: is overly sensitive or intrusive; is likely to lead to feelings of discomfort or distress; relates to criminal activities; or, is potentially identifying. Any test breaching these conditions will be returned for adjustment prior to data collection commencing. If you are unsure whether your test is appropriate or not, please contact the Unit Coordinator.
Test Conceptualisation & Development
Week 2 teaching and learning content (i.e., lectures, activity, prac and textbook readings) will cover content relevant to test conceptualisation and development. Hence, by the time you have covered this content you will have knowledge and skills relating to conceptualising and developing a test, including how to write different types of test items and develop a scoring system. You will use this knowledge to conceive and develop your test, via the following steps.
1. Determine the construct your test will measure.
2. Search the literature on this construct (and related constructs) to determine the content for your test items.
• For example, if your test was measuring a specific phobia, you would need to find out what sorts of behaviours and/or thoughts are usually exhibited by affected individuals.
• Similarly, if your test was measuring environmentally-friendly behaviours, attitudes towards societal inequalities, conspiracy beliefs, etc., you would need to investigate what factors (thoughts, behaviours, etc.) are common to groups that are high/low on this construct to assist you in developing good test items.
• While in literature search mode, look for studies investigating associations between your chosen construct (and/or similar constructs) and Big 5 personality traits (domains & facets). This info will help you decide which BFI-2-S domain to use for the validation of your test.
3. Determine the most appropriate format for your test items and response options (i.e., what sort of rating scale, what response options, etc.).
• Please note: all of your items will need to have the same rating/response scale.
• Consideration should also be given to how the test will be scored, and what these scores will mean in relation to the construct you are measuring.
4. Create a pool of approximately 20-25 draft items, of which at least 5 should be reverse/negatively worded.
• Keep in mind here that you need items that will distinguish between people with high and low levels of the construct that your test is measuring (i.e., items should tap thoughts/behaviours/etc. for people who have lots as well as little of the construct your test is measuring).
• At this point, you may like to ask someone you know to read through your items – are there any that they find confusing or hard to understand?
5. Next, choose the best 15 items to use for your test, ensuring you include at least 3 reverse/negatively worded items.
• Finalise the overall layout/format of your test, adding any instructions that test-takers will need (e.g., ‘Please select the option that best represents your experiences during the past week’, ‘Indicate how strongly you agree or disagree with the following statements’, etc.).
• Don’t forget to add a name for your test at the top of the page.
• On the following page, place the scoring instructions and guidelines for interpretation of scores.
Submit your test and scoring document through the relevant submission link on Moodle by 11:59 pm on Sunday 11th August.
Test Evaluation & Revision
Week 3 and 4 teaching and learning content (i.e., lectures, activities, prac and textbook readings) will cover content relevant to test evaluation and revision. Hence, by the time you have covered this content you will have knowledge and skills relating to evaluating the psychometric properties of tests, including how to conduct item analysis and determine good test items.
As noted above, test data and peer review feedback will be made available by the end of Week 7.
This data will be in the form of an SPSS data file, which will contain basic demographic items, the BFI-2-S test items, your test items and peer review feedback, and the test items and feedback for some other students.
Once you have downloaded this SPSS file, you will need to work through the following steps:
1. Clean the data set & prepare your data for analysis.
• First, delete the test items and peer review feedback of the other students whose tests were included in the same test survey as yours. Take care to ensure that you don’t accidentally delete the demographic variables or BFI-2-S items during this cleaning process – you’ll need them later.
• Next, you will need to assign item/variable names – this will help you remember what is what as you go through the rest of this process.
• The next step involves checking all of your variables to ensure the coding of data is correct (e.g., if your test’s rating scale is 0-4, but the dataset has it as 1-5, you will need to recode your data). Don’t forget that this will include reverse scoring all of the reverse/negatively worded items in your test as well as the relevant BFI-2-S domain items.
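The recoding and reverse-scoring logic in this step can be sketched in code. The following is a minimal Python illustration only, not the SPSS procedure itself; the function names, the 1-5 scale, the item names, and the sample responses are all assumptions made for the example.

```python
def shift_scale(value, old_min, new_min):
    """Move a response from one scale origin to another (e.g., 1-5 to 0-4)."""
    return value - old_min + new_min


def reverse_score(value, scale_min, scale_max):
    """Reverse a response so the lowest option becomes the highest and vice versa."""
    return scale_min + scale_max - value


# Hypothetical responses from one participant, stored on a 1-5 scale.
responses = {"item01": 4, "item02": 2, "item03": 5}
reverse_worded = {"item02"}  # assumed to be the only negatively worded item

# Reverse score the negatively worded items, leaving the others untouched.
scored = {
    name: reverse_score(value, 1, 5) if name in reverse_worded else value
    for name, value in responses.items()
}

# Recode from the dataset's 1-5 coding to a 0-4 rating scale.
recoded = {name: shift_scale(value, 1, 0) for name, value in scored.items()}
```

Reverse scoring uses the scale's own minimum and maximum, so the same function works whether the scale runs 1-5 or 0-4.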
2. Calculate total scores & run descriptive analyses.
• First, calculate the total score for your test.
• Then run descriptive analyses (i.e., range, median, mode, mean, SD, skew, kurtosis) on all 15 of your test items as well as on your test’s total score.
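As a rough illustration of this step, the sketch below computes total scores and basic descriptives using only Python's standard `statistics` module. The data, the 3-item test, and the 1-5 scale are hypothetical; skew and kurtosis are left out because the standard library has no function for them, so those would still come from SPSS.

```python
import statistics

# Hypothetical data: each row is one participant's (already reverse-scored)
# responses to a 3-item test on a 1-5 scale. A real test would have 15 items.
rows = [
    [4, 5, 3],
    [2, 1, 2],
    [3, 3, 4],
    [5, 4, 5],
]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mode": statistics.multimode(values),  # a list, because ties are possible
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
    }


# Total score per participant, plus the lowest and highest possible totals.
totals = [sum(row) for row in rows]
n_items = len(rows[0])
possible_range = (n_items * 1, n_items * 5)

total_summary = describe(totals)
item_summaries = [describe([row[i] for row in rows]) for i in range(n_items)]
```

The possible scoring range is worth computing explicitly, since it is also needed for the scoring information reported in the Method section.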
3. Check internal consistency reliability (Cronbach’s α).
• Here you will also need to look at the inter-item correlations between your 15 test items, the α if item deleted for each of them, and also the correlations between each of the test items and your test’s total score.
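To make the quantities in this step concrete, here is a minimal Python sketch of Cronbach's α, α if item deleted, and an item-total correlation. It uses the standard variance-based formula for α; the function names and the small 3-item dataset are hypothetical, and the `corrected` option (correlating an item with the total of the remaining items) is an assumption about which variant is wanted, so check it against your SPSS output.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha from a list of participants' item responses."""
    k = len(rows[0])
    item_variances = [statistics.variance([row[i] for row in rows]) for i in range(k)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def alpha_if_deleted(rows, index):
    """Alpha for the test with the item at `index` removed."""
    reduced = [[v for i, v in enumerate(row) if i != index] for row in rows]
    return cronbach_alpha(reduced)


def item_total_r(rows, index, corrected=True):
    """Correlate one item with the total (excluding that item if corrected)."""
    item = [row[index] for row in rows]
    totals = [sum(row) - (row[index] if corrected else 0) for row in rows]
    return pearson_r(item, totals)


# Hypothetical responses: 5 participants x 3 items on a 1-5 scale.
rows = [[4, 5, 4], [2, 1, 2], [3, 3, 4], [5, 4, 5], [1, 2, 1]]
alpha = cronbach_alpha(rows)
```

An inter-item correlation is simply `pearson_r` applied to two item columns. An item whose removal raises α, or whose item-total correlation is low, is a candidate for culling.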
4. Select the 10 best items to form your revised test.
• Use the quantitative results from the above steps along with the qualitative data from the peer review feedback on your test (and your experience of reviewing your peers’ tests) to select the 10 best items for the final version of your test.
• Create a new document containing your revised (10-item) test, making sure you adjust the scoring system and interpretation guidelines as appropriate.
5. Evaluate your new 10-item test’s reliability and validity.
• Calculate the total score, internal consistency reliability, and run basic descriptive statistics (M & SD) for your 10-item test.
• Repeat this process for the BFI-2-S domain scale that you are using to validate your test, and also for the domain’s three facet scales (i.e., calculate total score, check internal reliability, calculate M & SD).
• Next, test the validity of your test via a correlation analysis between your test’s total score and that of the relevant BFI-2-S domain and facet scores.
• Create a graph (scatter plot with fit line) to illustrate the relationship between your test’s total score and the BFI-2-S domain score – you will need this for your Results section (if you want to, you can also create graphs comparing your test’s total score to the three BFI-2-S facet scores, but this is not required for your report).
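The validity check in this step boils down to a Pearson correlation and, for the scatter plot, a least-squares fit line. The sketch below shows both in plain Python under stated assumptions: the participant totals are invented, the function names are illustrative, and the significance test (the p value) is not computed here and would still come from SPSS.

```python
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical total scores for six participants.
test_totals = [22, 35, 28, 40, 31, 25]    # revised 10-item test
domain_totals = [12, 20, 15, 24, 19, 13]  # chosen BFI-2-S domain scale

r = pearson_r(test_totals, domain_totals)
slope, intercept = fit_line(test_totals, domain_totals)
```

The sign of `r` gives the direction of the association to compare against your hypothesis, and the slope and intercept define the fit line drawn through the scatter plot.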
6. Don’t forget to also run descriptive analyses for the demographic items – you will need to report age (M, SD & range) and sex (N & % female/male) in the Participants section of your report.
The Test Development & Evaluation Report
This report will be in the format of a typical research report, containing the following sections: Title page, Abstract, Introduction, Method, Results, Discussion, Reference list, and Appendices.
The Abstract should provide a succinct summary of your report, consisting of the following content:
1. Introduce the topic: 1 sentence that says something relevant about the construct your test is measuring (this can often be taken from the beginning of your intro).
2. State the rationale for your test and the validation hypothesis: 1-2 sentences; can be a shortened version of the summary/hypothesis paragraph at the end of your Introduction (see below).
3. Describe the study methodology: 1-2 sentences; e.g., ‘Participants (N, % female/male, age range and M & SD) completed an anonymous online questionnaire consisting of …’
4. Report key findings: 1-2 sentences outlining whether or not your test was found to be reliable and valid, reporting the key statistical results to demonstrate this.
5. State conclusions: 1 sentence that says what these findings mean in relation to the future utility of the test you have developed.
The main purpose of the introduction in this report is to provide a rationale for the development of your new test and justification for your validation hypothesis. In doing this, it must also do the usual work of an Introduction – introduce the topic, define/describe key constructs, review relevant literature on your variables, provide clear justification for hypotheses, etc.
For example, if your 10-item test measures test anxiety, you would need to…
• Introduce the topic of test anxiety.
• Define anxiety and test anxiety.
• Provide a brief literature review (at least 4 journal articles) of the information/evidence that informed the content of your test anxiety test.
o So, if test anxiety contains five elements/sub-constructs that you have incorporated into your test (e.g., fear, negative thoughts, physiological changes etc.), then you would have to describe them here.
• Introduce and define the Big 5 personality domain and facets that you are using for validation, while providing a brief literature review linking it to your test construct (at least 3 journal articles).
• End with a summary of key points including the rationale for your test (i.e., why do we need this new test?) and your validation-related hypothesis.
o For example: “Existing test anxiety tests have been designed solely to measure anxiety associated with academic tests (Smith, 2018; Jones, 2015). However, test anxiety is also known to be experienced in many non-academic testing situations, such as driver’s tests, vision tests, and paternity testing (Lee & Hassan, 2019; York et al., 2008). As such, there is a need for a generalised test anxiety test. Given the demonstrated strong associations between academic test anxiety and neuroticism (Gill et al., 2013), and between anxiety and neuroticism (Shaw, 2018), it is hypothesised that the generalised test anxiety test will be strongly and positively correlated with neuroticism.”
The Method section must contain the following sub-sections (Participants, Materials, and Procedure):
• In the Participants sub-section, report the overall number of participants, the number of female and male participants, and the percentage of participants in these groups (e.g., “In total, there were 15 participants, of which 10 (66.7%) were female and 4 (26.7%) were male, with one participant (6.7%) indicating that they do not identify as either”).
• When reporting the age of the participants, include the age range as well as the mean and standard deviation (e.g., “Participants ranged from 18 to 63 years of age (M = 32.1, SD = 11.42)”), and don’t forget to italicise M and SD.
• There should also be a mention of inclusion/exclusion criteria (i.e., over 18 years of age) and, if you exclude any participants’ data from analysis, note the number of affected participants and reason/s you took this action (e.g., incomplete dataset).
• While your test will be included in a survey along with other students’ tests, there is no need to note that in the Materials sub-section. Do, however, remember to note that demographic info was collected (i.e., age & sex).
• Provide adequate info on each of the measures – i.e., your test and the BFI-2-S domain scale that you are using. For each, you will need to report the following info (though, not necessarily in this order):
o that they are self-report measures
o the construct that they are measuring
o the number of items
o the type of response format used and the range of response options available (e.g., “participants responded on a 5-point Likert scale, where 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree”)
o scoring info, including the maximum scoring range and how these scores should be interpreted (e.g., “the extraversion domain scale of the BFI-2-S consists of 5 items, with a scoring range of x-xx, where higher scores indicate….”)
o the Cronbach’s α from your sample
o an example item
• For the BFI-2-S, also report this information for the facet scales, include published reliability and validity info, and don’t forget to include any relevant citations.
• For your test, include a brief summary of the process followed when revising your test – this should not be a reiteration of the instructions provided to you for how to complete this process. Rather, it should describe how (and on what basis) decisions were made regarding what was/wasn’t a good item, when you culled the initial 15 items down to the final 10.
o For example, you might like to discuss each of the 5 culled items in turn, detailing the reasons for their exclusion – such reasons may have been due to poor performance indices (e.g., low correlation with total score, increased α if item deleted, poor spread of responses, etc.) and/or based on the peer feedback (Was the item confusing? Was it asking more than one thing? Did its answers not fit well with the available response options? etc.)
• The Procedure sub-section will be short, and appropriately so – don’t feel the need to repeat info from the Participants and Materials sub-sections here to make it bigger.
• Things that should be noted here:
o recruitment information (i.e., how did participants find out about the study?)
o how the questionnaire was accessed by participants
o how long participating in the study took on average (use your experience to guide this)
Things to include in the Results section:
• Use a table to report the M, SD, N & α for the total scores for: 1) your original test; 2) your revised test; and, 3) the BFI-2-S domain and facet scales that you are using for validation.
o You must say something in the text about the contents of the table to direct the reader to look at it (e.g., “As can be seen in Table 1, ….”), but do not report the stats in the text as well as in the table (there’s no need to double up).
o This text should come before the table (i.e., don’t start your Results section with a table, introduce it first).
• For your correlation, report the direction (i.e., negative or positive) and the strength of the association – as a guide:
o < .10 is very small or very weak
o .10-.30 is small or weak
o .30-.50 is medium or moderate
o .50-.70 is large or strong
o .70-.90 is very large or very strong
o > .90 is nearly/almost/practically perfect
• Include a scatter plot with regression line to illustrate the association between your test’s total score and that of the relevant BFI-2-S domain (you can also include figures to illustrate associations between your test and the facet scales, but this is not required).
• A few other points:
o A table’s label/title goes above the table, but a figure’s label/description goes below the figure
o Both tables and figures are numbered (e.g., Table 1; Figure 3) even if there is only one of them included in a report
o N, M, SD, p, r, etc. must be italicised
o M, SD, r should generally be reported to two decimal places, but p to three places
o Exact p values are best, unless the output reads .000, in which case you should report p < .001
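The correlation strength guide and the p value reporting rule from this section can be expressed as two small helper functions. This is a hedged Python sketch: the function names are invented, the short labels are one choice from the alternatives listed in the guide, and treating each band's lower bound as inclusive (so .30 counts as moderate) is an assumption, since the guide's bands share their end points.

```python
def describe_strength(r):
    """Label the size of a correlation, ignoring its sign."""
    size = abs(r)
    if size < 0.10:
        return "very weak"
    if size < 0.30:
        return "weak"
    if size < 0.50:
        return "moderate"
    if size < 0.70:
        return "strong"
    if size < 0.90:
        return "very strong"
    return "nearly perfect"


def describe_direction(r):
    """Report whether an association is positive or negative."""
    return "positive" if r > 0 else "negative" if r < 0 else "none"


def format_p(p):
    """Format a p value to three places, with no leading zero."""
    if p < 0.001:
        return "p < .001"  # never report p = .000
    return "p = " + f"{p:.3f}".lstrip("0")


# Hypothetical result: r = -.45 with p = .032.
summary = f"{describe_strength(-0.45)}, {describe_direction(-0.45)}, {format_p(0.032)}"
```

For the hypothetical values shown, `summary` reads "moderate, negative, p = .032".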
The purpose of the Discussion section is to:
• Report the outcomes of your test’s evaluation and what these mean in terms of the psychometric soundness of your test (i.e., was it found to be reliable and valid?).
• Discuss any issues/problems that are evident with your test and/or its scoring, and what could be done to improve it. For example:
o Should additional items have been culled? Are additional items required?
o Is the scoring system appropriate? What about the interpretation guidelines?
o Is there a need for further evaluation to ensure your test is reliable, valid, and user friendly for both test users and test takers?
• Note any limitations that you think may have impacted on the results obtained, and explain how these issues could be addressed in future research/studies.
• Discuss how your new test could be used within applied and/or research contexts. For example:
o What do you see as being the potential uses of your test and the information it provides?
o Who is the target population? (think test users as well as test takers)
o Are there any groups for whom it would not be appropriate? Any contexts where it shouldn’t be used?
• End with a brief summary of key points, leading to a statement of your conclusions.
The Reference list must include all cited sources and nothing that wasn’t cited.
Your Appendices should contain the following:
• Your original 15-item test, scoring instructions & interpretation guidelines
• Your revised 10-item test, scoring instructions & interpretation guidelines
• A copy of the peer review feedback on your test – this could be presented in table form
• A table of data from analysis of your original 15-item test, including: the M, SD, α if item deleted, and correlation coefficient (r) with the total scores, for all of the 15 items. In the table notes, please indicate the 5 items that were culled.
Submit your report through the relevant submission link on Moodle by 11:59 pm on Sunday 15th September.
The word count includes everything from the first word of the Abstract to the last word of the Discussion (i.e., title page, reference list and appendices are not included). The word limits are 1,750 words for PSYC371 students, and 2,000 words for PSYC471 students. A leeway of +/- 10% is allowable.
The various aspects of the test development and evaluation report will be marked according to the expected content provided above, with the following weighting:
Abstract 5 %
Introduction 20 %
Method 20 %
Results 20 %
Discussion 25 %
Writing quality/style 10 %
TOTAL 100 % (max. deduction for errors in formatting & referencing = 10%)
Assessment Related Policies & Procedures
Please refer to the School of Psychology’s assignment page for information on assessment related policies, procedures, and guidelines, including extensions, late penalties, and plagiarism:
Other Queries, Concerns, Problems?
Not sure what to do after reading this document? Check the Assignment forum on Moodle to see if your query has already been addressed or asked by another student. If it hasn’t been, post your query on the forum and wait for the post to be answered (typically within 1-2 work days).
Academic, learning, or study skills related difficulties or support needs
There is a large range of learning support services and resources available at UNE – please take the time to check out these services so that you are aware of what assistance is available:
• Academic Resources: https://www.une.edu.au/current-students/resources/academic-resources
• Academic Skills: https://www.une.edu.au/current-students/resources/academic-skills
• Library Support Services: https://www.une.edu.au/library/services/support
• Student Success: https://www.une.edu.au/current-students/support/student-central
• IT Support: https://www.une.edu.au/current-students/support/it-services
• Oorala Aboriginal Centre: https://www.une.edu.au/info-for/indigenousmatters/oorala
• Student Access & Inclusion: https://www.une.edu.au/current-students/support/student-support/student-access-and-inclusion
Extension requests
Email the Unit Coordinator as soon as you are aware that you may want to request an extension. Within the email, briefly explain the basis of your extension request (i.e., why you need more time) and state how many days’ extension you are seeking. Typically, you will have a response within 1-2 work days.
Please note: if you require a long extension or one that goes beyond the end of the trimester, you will need to complete a Special Extension of Time application and submit it, with supporting documentation, through AskUNE.
• Applying for a Special Extension of Time: http://askune.custhelp.com/app/answers/detail/a_id/131/kw/Applying%20for%20a%20Special%20Extension%20of%20Time
Other personal or confidential issues
Email the Unit Coordinator directly in relation to any other personal or confidential issues.
Please also be aware that a range of support services, including counselling, are provided free of charge to UNE students. Information on these services can be found via the following link:
• Student Support: https://www.une.edu.au/current-students/support/student-support
Want to talk about it in person?
If you would like to discuss your personal or confidential issues with the Unit Coordinator in person or on the phone, please email to book a mutually suitable time.