This brief article discusses several oral assessments used for
ITA screening. Although they have some similarities, they each differ in
important respects. Table 1 gives some information on the assessments
mentioned in this article.
Table 1: Some Oral Assessments Used for ITA Certification

| Name | Primary Purpose | Publisher | Format | Score range |
| --- | --- | --- | --- | --- |
| TOEFL iBT Speaking | University admissions | Educational Testing Service | Test taker (TT) responds to computer prompts; responses recorded | Speaking scores from 0–30 |
| SPEAK | Mostly for health care and ITA certification | Retired (formerly Educational Testing Service) | TT responds to paper / audio prompts; responses recorded | 0–60 in 5-point increments |
| Versant | Mostly for initial employment certification | Knowledge Technologies Pearson, Inc. | TT responds to computer prompts; responses scored immediately | Total scores 20–80; Sentence Mastery, Vocabulary, Fluency, Pronunciation subscores |

Four studies have examined the TOEFL iBT Speaking test with
respect to ITA screening. Wylie and Tannenbaum (2006) conducted
standard-setting sessions to establish minimum recommended TOEFL
Speaking cut scores for ITA screening, and to establish a TOEFL Speaking
equivalent to the Test of Spoken English (TSE) score of 50, from a
possible range of 20 to 60 on the test. Xi (2007) compared TOEFL
Speaking scores with scores on locally administered ITA exams at four
universities and investigated the potential for various cut scores to
minimize false negative and false positive results. She found a wide
range of correlations between TOEFL Speaking and locally administered
exams and concluded that the degree of correlation between TOEFL
Speaking and the locally administered tests was partially a function of
whether the local exams attempted to measure aspects of teacher
competence. Farnsworth (2013) examined the construct validity of TOEFL
Speaking for this purpose by comparing TOEFL Speaking scores with scores
on an in-house teaching performance test, the Test of Oral Proficiency
(TOP) at the University of California, Los Angeles, and found that the
two tests measured the same speaking factor to a great extent. Finally,
Lim et al. (2012) looked at the validity of using TOEFL Speaking scores
for ITAs. Their criterion measure was the SPEAK test. They found
moderate correlations between the two tests, but they did not find a cut
score at the high end accurate enough to exempt candidates from the
SPEAK requirement.
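The trade-off between false positives and false negatives that these studies examine can be made concrete with a small sketch. It assumes each candidate has a TOEFL Speaking score and a pass/fail result on a local criterion measure; the function name `classification_errors` and the sample records are hypothetical and invented purely for illustration, not drawn from any of the studies cited.

```python
from typing import Iterable, Tuple


def classification_errors(
    records: Iterable[Tuple[int, bool]], cut_score: int
) -> Tuple[int, int]:
    """Count (false_positives, false_negatives) at a given cut score.

    Each record is (toefl_speaking_score, passed_local_exam).
    A false positive is a candidate at or above the cut score who
    did not pass the local exam; a false negative is a candidate
    below the cut score who did pass it.
    """
    false_positives = 0
    false_negatives = 0
    for score, passed_local in records:
        exempted = score >= cut_score
        if exempted and not passed_local:
            false_positives += 1
        elif not exempted and passed_local:
            false_negatives += 1
    return false_positives, false_negatives


# Invented records for illustration only (not real test data).
SAMPLE_RECORDS = [
    (28, True), (27, True), (26, False), (24, True),
    (23, False), (22, True), (20, False),
]

for cut in (23, 26, 28):
    fp, fn = classification_errors(SAMPLE_RECORDS, cut)
    print(f"cut score {cut}: {fp} false positive(s), {fn} false negative(s)")
```

With these made-up records, raising the cut score reduces false positives while increasing false negatives, which is the pattern a program would weigh when choosing its own threshold.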
In order to investigate the current state of practice regarding
TOEFL iBT Speaking use in ITA programs, Farnsworth (2012) conducted an
online survey upon which this article is based. Coordinators of ITA
assessment programs were asked about the makeup of their institutions
and the size of their ITA population. They were then asked to describe
their ITA certification policies and specifically their policies
regarding the TOEFL Speaking test. Seventeen participants responded to
the survey. Respondents came overwhelmingly from research-oriented
institutions, most of them large, with only one participant describing
his or her institution as more teaching oriented.
Many institutions used the TOEFL Speaking score as a
prescreening measure to exempt high-scoring students from an in-house
performance test. Nine of the seventeen institutions used the TOEFL
Speaking test in this way, with cut scores ranging from 23 to 28
points. The policy reported for Purdue University is typical: Purdue
accepts scores of 27 or higher on TOEFL
Speaking as evidence of ITA language competence. Students with lower
scores must take an in-house teaching performance exam to be certified.
Oklahoma State University, another large public research university,
reports the same policy (a high TOEFL Speaking score exempts students
from the local performance test) but with a cut-off score of 26 instead
of 27. Both participants reported that the cut-off score was “working
well” as an initial measure. Cornell University follows the same approach but
requires a 28, based on a perception that lower scores are “all over the
place” but that a very high TOEFL score is a reliable indicator of ITA
proficiency.
Only five survey participants reported SPEAK test use, and only
two of these relied exclusively on SPEAK to make ITA certification
decisions; the other three SPEAK users also accepted TOEFL Speaking
scores. Since the SPEAK test has been the primary tool used for ITA
assessment over the past two decades, this may represent a fairly major
change. One
institution reported interest in moving from SPEAK to the Versant
English Test, a fully automated computer-scored oral assessment that has
been the subject of much debate in the testing literature during the
past decade.
Respondents to the survey reported mixed impressions of TOEFL
Speaking use in practice. Some respondents who used the scores
reported that “it seems to work” or “as an initial measure, it works
well,” whereas others reported that the TOEFL did not measure the
appropriate skills. For example, one participant reported:
The iBT cannot replace our test since the iBT does not look at
teaching skills, awareness of U.S. classroom, ability to use their
language skills to successfully convey information to learners. However,
based on our own analysis of past ITA tests, we now allow students with
iBT ≥ 26 to be tested by just one rater rather than 4.
Other respondents reported varying degrees of satisfaction with
and confidence in TOEFL Speaking scores, saying that scores in the
middle range are less useful as predictors of ITA communicative
success. For example, one participant said, “We see that there are
correlations at the higher levels of the iBT with oral proficiency tools
such as the OPI. Anything below a 24 is all over the place.” Overall,
the survey results indicate that TOEFL Speaking is in fact widely used
to make ITA certification decisions.
Of course, the practical advantages of using TOEFL iBT Speaking
scores for ITA certification will be obvious to any ESL or testing
program coordinator; TOEFL scores will in most cases be already
available from the institution’s admissions department, and elimination
of in-house performance testing could save substantial resources. One
major advantage of using TOEFL iBT Speaking would be that incoming
students and departments could make ITA decisions in advance of student
intake. Clearly, though, not enough is known about how using these
scores instead of an in-house measure may impact programs, students, and
departments. All but one (Lim et al., 2012) of the TOEFL-specific
studies described in this paper rely on experimental testing, and
operational TOEFL Speaking scores may well have properties quite
different from those derived from experimental studies, due to practice
effects or other issues.
In terms of practical recommendations to ITA program
coordinators, the following may be tentatively concluded. The TOEFL
Speaking test seems to measure the language needed by ITAs to a certain
extent, enough so that very high iBT scores may be useful for ITA
certification or that very low scores might be sufficient evidence to
prevent candidates from teaching. There is no definitive answer as to
the ideal cut score, however, and different institutions may decide on
more lenient or more stringent cut scores depending on demand for ITAs,
local resources, and other factors. Available research in addition to
the anecdotal evidence gathered from the survey indicates that cut
scores of 26 or higher seem to minimize the danger of false positive
classifications (candidates who pass but are not truly qualified). A cut
score of 28 would probably result in very few false positives (Xi,
2007), but relatively few candidates are likely to reach this high bar.
The limited available research and anecdotal evidence suggest that
scores in the range between 22 and 25 do not predict ITA success with
sufficient accuracy (Lim et al., 2012) and that local performance
measures should be used for candidates in this range.
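These tentative recommendations can be summarized as a simple screening rule. The sketch below is illustrative rather than a recommended policy: the default exemption cut of 26 comes from the discussion above, but the function name `ita_screening_decision`, the returned labels, and the optional `deny_below` threshold are assumptions, since no specific lower bound is established here and each institution would set its own values.

```python
from typing import Optional


def ita_screening_decision(
    speaking_score: int,
    exempt_cut: int = 26,
    deny_below: Optional[int] = None,
) -> str:
    """Map a TOEFL iBT Speaking score to a hypothetical screening outcome.

    exempt_cut: scores at or above this value exempt the candidate
        from the local performance test.
    deny_below: optional institution-chosen floor (an assumption);
        scores under it are treated as evidence against certification.
    """
    if not 0 <= speaking_score <= 30:
        raise ValueError("TOEFL iBT Speaking scores range from 0 to 30")
    if speaking_score >= exempt_cut:
        return "exempt from local test"
    if deny_below is not None and speaking_score < deny_below:
        return "not eligible to teach"
    # Mid-range scores are not treated as decisive either way.
    return "take local performance test"


print(ita_screening_decision(27))
print(ita_screening_decision(24))
print(ita_screening_decision(27, exempt_cut=28))
```

A more stringent program could pass `exempt_cut=28`, and a program facing high demand for ITAs could choose a lower value, at the cost of more false positives.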
References
American Psychological Association, American Educational
Research Association, & National Council on Measurement in
Education. (1999). Standards for educational and psychological
testing. Washington, DC: American Educational Research
Association.
Farnsworth, T. (2012, April). TOEFL iBT Speaking for
ITA certification: State of practice and outstanding validation
questions. Paper presented at the annual meeting of the
Language Testing Research Colloquium, Princeton, NJ.
Farnsworth, T. (2013). An investigation into the validity of
the TOEFL iBT Speaking test for international teaching assistant
certification. Language Assessment Quarterly, 10,
274–291.
Lim, H., Kim, H., Behney, J., Reed, D., Ohlrogge, A., &
Lee, J. E. (2012, March). Validating the use of iBT Speaking
scores for ITA screening. Paper presented at the TESOL Annual
Convention and Exhibit, Philadelphia, PA.
Wylie, E. C., & Tannenbaum, R. J. (2006). TOEFL academic
Speaking test: Setting a cut score for international teaching
assistants (ETS Research Memorandum RM-06-01). Princeton, NJ: ETS.
Xi, X. (2007). Validating TOEFL Speaking and setting score
requirements for ITA screening. Language Assessment
Quarterly, 4, 318–351.