Score Equivalency Questions
Q: We accept a minimum TOEFL score of 100. The scale shows a range of 95-106 as equal to a score of 5 under the new scale, but I don't want to accept students who would have scored a 95, so I'd prefer to set a threshold of 5.5. Is this the right interpretation?
A: Setting the threshold at 5.5 might reduce the chance of accepting applicants whose equivalent score is below 100, but it will increase the chance of rejecting qualified candidates whose equivalent score is 100 or higher.
Score comparison tables are useful as general guiding tools, but they cannot predict the exact score each individual test taker would receive on two different language tests.
The two score scales also differ in the range of reported total scores (1-6 with half-band increments, as opposed to 0-120). As a result, multiple scores on the 0-120 scale correspond to the same half band on the 1-6 scale.
Based on score concordance analysis, a band score of 5 corresponds to a range of scores, specifically 95-106. A score closer to the middle of the range of scores for the same half band is a more reliable indicator of the same level of ability.
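As a rough illustration of the trade-off described above, the following Python sketch splits the 95-106 range listed for band 5 into scores below and at or above a previous minimum of 100. The function and constant names are hypothetical, and the sketch works only at the level of the comparison table; it does not predict what any individual test taker would score.

```python
# Legacy 0-120 totals that the comparison table lists for band 5 (95-106 inclusive).
BAND_5_LEGACY_RANGE = range(95, 107)

# Example institutional minimum on the 0-120 scale, taken from the question above.
LEGACY_MINIMUM = 100


def split_band(legacy_range, legacy_minimum):
    """Split a band's legacy scores into those below and at/above the old minimum."""
    below = [s for s in legacy_range if s < legacy_minimum]
    at_or_above = [s for s in legacy_range if s >= legacy_minimum]
    return below, at_or_above


below, at_or_above = split_band(BAND_5_LEGACY_RANGE, LEGACY_MINIMUM)

# A threshold of 5 admits the whole band, including scores below the old minimum.
print("Admitted at 5 but below 100:", below)
# A threshold of 5.5 excludes the whole band, including scores that met the old minimum.
print("Excluded at 5.5 but at or above 100:", at_or_above)
```

In this table-level view, a threshold of 5.5 would exclude more legacy score points that met the old minimum (100-106) than a threshold of 5 would admit below it (95-99).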
Recommendation: Choosing a band score of 5 should work well for institutions that previously required a total score of 100. We also recommend applying the same requirement to the section scores, not just the total score, to help ensure that it is met across all language skills.
Note: The same analysis applies to institutions that accept a TOEFL score of 90 (now equivalent to a 4.5 on the updated scale), as well as those that accept a TOEFL score of 80 (now equivalent to a 4 on the updated scale).
Technical Questions
Q: How did you determine which 1-6 scores map to each CEFR level?
A: The mapping of TOEFL section test scores was established by combining information from several separate steps.
For Reading and Listening: We conducted a vertical linking study, whereby test takers participating in a field test responded to questions from both the updated TOEFL iBT and the previous version of the test, whose scores had already been mapped to CEFR levels. Through this vertical linking, we established the CEFR score mapping for the Reading and Listening sections of the updated TOEFL iBT.
For Speaking and Writing:
We first compared task requirements and scoring rubrics to CEFR subscales and level descriptors for different aspects of language to confirm that the content of the test was relevant to language ability as described in the CEFR, and therefore that alignment of test scores to CEFR levels was justified (Davis et al., 2023).
This was followed by an ETS-internal standard setting study that identified minimum scores for each CEFR level, using the performance profile method (Fleckenstein et al., 2020). In this exercise, we selected test takers representing different levels of performance and then created a portfolio for each individual which contained the responses they produced in the test.
ETS language experts then compared the portfolios to performance descriptors from relevant CEFR scales to establish the minimum score for each CEFR level (Davis et al., 2023).
Finally, the score profiles of the test takers in the field test were examined statistically to establish the relationship between the CEFR levels of the students across the selected-response sections (Reading and Listening) and the CEFR levels of the same students across the constructed response sections (Speaking and Writing).
Q: How did you determine which 1-6 scores aligned with the equivalent 0-120 scores?
A: The TOEFL iBT 1-6 scores were aligned to the 0-120 score scale through various statistical methods.
For Reading and Listening: We conducted a vertical linking study, whereby test takers participating in a field test responded to questions from the updated TOEFL iBT and the previous version of the test. This combination of questions allowed for direct linking of the reading and listening score scales.
For Speaking and Writing: We examined the score profiles of test takers in the field test to inform the relationship between the selected-response section scores (Reading and Listening) and the constructed-response section scores (Speaking and Writing). These statistical analyses also considered previous research on mapping TOEFL iBT scores to the CEFR levels.
Important Notes:
Non-equal intervals: The statistical methods do not assume equal interval score points. This means that two half bands on the 1-6 score scale can contain different numbers of score points on the 0-120 score scale.
Measurement error: All assessments carry an inherent degree of measurement error. On the 0-120 scale, this margin is approximately four points, so small score differences fall within expected variation and should not be overinterpreted.
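One way to apply this note when comparing two totals is sketched below in Python. The helper name and the choice to treat a difference of exactly four points as within the margin are assumptions made for illustration; the figure of four points is the approximate margin stated above, not an exact statistic.

```python
# Approximate margin on the 0-120 scale, per the note above.
APPROX_MARGIN = 4


def within_margin(score_a, score_b, margin=APPROX_MARGIN):
    """Return True when two 0-120 totals differ by no more than the margin."""
    return abs(score_a - score_b) <= margin


# A 2-point gap falls inside the approximate margin.
print(within_margin(98, 100))
# An 8-point gap falls outside it.
print(within_margin(92, 100))
```

A check like this is only a reminder not to overinterpret small gaps; it is not a substitute for an institution's own admissions policy.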
Additional Resources
Q: Where can I get more details on the updates to the TOEFL exam?
A: Our technical manual provides an in-depth analysis of all TOEFL test content. This includes the addition of multistage adaptivity to the Reading and Listening sections, which will enhance the accuracy of the exam and ensure that test takers receive questions at a suitable level for their English proficiency.
