March 2011

ARTICLES
THE MACHINE SCORING OF ESSAYS: REDEFINING WRITING PEDAGOGY?
Deborah Crusan, Wright State University, Dayton, OH, USA

"For there is nothing either good or bad, but thinking makes it so."
~ William Shakespeare (1564-1616), Hamlet, II.ii

It has been suggested that when writing teachers are asked about the most difficult of their duties, a common answer is “Grading!” Assessment, particularly writing assessment, is hard work, especially if it is done well. In recent years, new programs have promised writing teachers efficiency by purporting to assess writing for them. Machine scoring has several aliases: automated essay scoring, automated writing evaluation, automated essay evaluation (Burstein, Chodorow, & Leacock, 2004), and automated essay grading (Grimes, 2005). Researchers (Crusan, 2010; Haswell, 2006) report that arguments supporting the use of these machines to score writing usually mention the difficulty and burden of writing assessment. Some (e.g., Landauer, Laham, & Foltz, 1999) even claim that although writing is educationally crucial, many teachers hesitate to assign a great deal of it because grading large numbers of assignments is a struggle.

Crusan (2010), Ericsson and Haswell (2006), and Shermis and Burstein (2003) have offered more thorough treatments of machine scoring in general. In this article, I concentrate on one program, MY Access! (Vantage Learning, 2007): I briefly describe it and discuss a small study conducted in a graduate writing assessment seminar at a midsize Midwestern university, in which graduate students examined second language writers’ attitudes toward using the program as a feedback and assessment tool for their writing in a sheltered ESL writing class.

BACKGROUND

MY Access! is a “web-based instructional writing product that provides students enrolled in grade 4 through higher education with the opportunity to develop their writing skills within an electronic portfolio-based environment” (Vantage Learning, 2007, p. 1). MY Access! grades students’ responses to selected writing prompts and offers suggestions for improving the text in a matter of seconds. MY Access! (like other such programs) makes two interesting claims: (a) that the program decreases writing instructors’ grading burden and (b) that the almost instantaneous feedback it provides motivates students to revise more. These claims are enticing; composition teachers do carry a considerable grading burden and certainly look for ways to encourage students to rework their writing multiple times. However, several problems are inherent in machine scoring.

First, though Ferris (2003) claimed that students will improve over time if they are given appropriate error correction and that students use teacher-generated feedback to revise more than surface errors, students rarely use programs like MY Access! to revise anything other than surface errors (Warschauer & Grimes, 2008); paragraph elements, information structure, and register-specific stylistics are largely ignored. Second, although teachers can create their own prompts for use with the program, MY Access! will score only the prompts included in the program (more than 900 prompts are built in, to which students can write and receive instantaneous feedback). Third, in many cases MY Access! seems to reward longer essays with higher scores; in other words, the program appears to treat length as a proxy for fluency.

As scholars, we recognize the need to investigate new pedagogical tools, including automated scoring, so that teachers retain control over their use and continue to define what happens in the classroom. With these issues in mind, we decided to investigate how second language writing students used the feedback MY Access! gave them, how they responded to its instantaneous nature, and what their attitudes were toward the program in general. The students’ responses reinforced and expanded our understanding of the reported problems inherent in machine scoring.

During the study, graduate students worked with English language learners, asking them to respond three times to two different prompts. English language learners then responded to a survey about their experiences with the program. In addition, graduate students interviewed consenting English language learners regarding their attitudes toward MY Access!

RESULTS

What follows are excerpts from six student interviews. Overall, students’ opinions regarding MY Access! were mixed; students found useful aspects as well as aspects they termed less helpful.

Hannah (all student names are pseudonyms) reported that she was excited about the program, but because she had little computer experience, she had difficulty learning the program. She had problems remembering what to do each time she logged on because she did not use the program very often. She found the MY Access! prompts “not very interesting.”

Xiao confessed to finding it very easy to locate information from other sources and insert it into his writing, even when the ideas were not his own. He found MY Access! feedback useful for making surface corrections (the program pointed errors out so that he could correct them), but he was not motivated to revise his work beyond fixing those surface errors.

Heecheon believed that he could get the same kind of information from Microsoft Word. Initially, he liked the instant feedback MY Access! provided, but he began to feel overwhelmed as he read through the massive amount of information offered to him by the program every time he submitted an essay draft.

Mohamed believed that MY Access! would be beneficial in the classroom. He found it to be a great tool to use during the writing process because it gave him a variety of tools and activities to aid in the revision of his writing.

Farin reported that, after a first draft, which he believed was scored too low, he went on the Internet and found some material loosely related to his topic. When he inserted the material in his essay, his score went up even though the essay was not as coherent as it had been when he first wrote it.

Wafa felt that MY Access! helped her get into the habit of doing multiple drafts. She liked the editing tools and found some of the feedback very helpful in the revision of her essays.

Clearly, the results are mixed. Some students found that working with the program instilled discipline, encouraging multiple revisions. Others liked working with the many tools provided, finding them very helpful in the revision process.

On the other hand, some students, lacking basic computer skills, found the program stressful and unusable. Others were discouraged by the seeming overabundance of feedback; in some cases, writers found it so overwhelming that they tended to disregard it. Our most disheartening finding: when some of the students were unhappy with their scores, they found ways to raise them simply by inserting unrelated text into their essays. Though it may be a stretch to say that automated writing programs encourage plagiarism, the fact that several students in the study borrowed chunks of text from the Internet to boost their scores shows that it is possible; there are no safeguards against this practice, and the program did not detect the juxtaposition of two unrelated topics. (In this study, other students besides the few who were interviewed admitted to blatantly plagiarizing from the Internet to increase the length of their essays, which in almost every case increased their scores.)

In other findings, the six students who were interviewed (above), along with the 35 students who completed the Survey Monkey survey, reported that they generally enjoyed writing with a computer, as most were already accustomed to doing so in their composition classes. They appreciated the help MY Access! offered in finding grammar errors, but they were not always sure how to fix those errors. Further, the program offered no positive comments about what students were doing well, which could negatively affect student motivation. In addition, after working on a prompt once or twice, many became bored and wanted to switch to another prompt. Many of the student writers used MY Access! for surface editing only and rarely used it for revision. In general, students in this study did not use MY Access! features such as My Portfolio and My Editor, possibly because their teachers did not explicitly assign them.

DISCUSSION

This small study was undertaken to help me make an informed decision about the usefulness of machines that score writing, but the big question remains: should teachers use MY Access! or something like it in the writing classroom? Before addressing this question, I want to note one caution: MY Access! is only one program among many. Further, the study involved only a handful of students, and though it is dangerous to generalize from them to a larger population, I do believe that what they reported is somewhat representative. That said, from what I have found, I have to say that the answer is a resounding “It depends!” (Johnson, 1999). However, we need to think about what it depends on.

Because this technology is here to stay, teachers need to decide when it makes sense to use programs such as MY Access! and for what purposes. Teachers must consider the ways such programs might affect their pedagogy. Are there instances in which the autonomy these programs offer could benefit students? Will their use wrest control from the teacher and result in hegemonic outside influence? So much control over assessment already comes from outside the classroom, through No Child Left Behind and, more recently, through states’ suggestions regarding exit testing for university students, that administrative pushes for externally developed assessment tools should be viewed with skepticism. Locally controlled assessment is important; when assessments are created from within, they are specific to one context: they are developed with a very specific group of students in mind, considering what those students have learned in their classes and what they are expected to be able to do as a result. Standardized tools such as the many machine-grading programs available today cannot address this specificity. However, as this small study illustrates, there are students who can benefit from using automated writing programs and times in class when such programs can be useful.

It is up to teachers to decide. I hope it is always so.

REFERENCES

Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated essay evaluation: The Criterion online writing service. AI Magazine, 25(3), 27-36.

Crusan, D. (2010). Assessment in the second language writing classroom. Ann Arbor, MI: University of Michigan Press.

Ericsson, P. F., & Haswell, R. H. (2006). Introduction. In P. F. Ericsson & R. H. Haswell (Eds.), Machine scoring of essays: Truth and consequences (pp. 1-7). Logan: Utah State University Press.

Ferris, D. R. (2003). Response to student writing: Implications for second language students. Mahwah, NJ: Lawrence Erlbaum Associates.

Grimes, D. (2005). Assessing automated assessment: Essay evaluation software in the classroom [Electronic version]. Published with proceedings of Computers and Writing Conference, Stanford University, Palo Alto, CA. Retrieved July 16, 2007, from http://www.ics.uci.edu/~grimesd/

Haswell, R. H. (2006). Automatons and automated scoring: Drudges, black boxes, and dei ex machina. In P. F. Ericsson & R. H. Haswell (Eds.), Machine scoring of essays: Truth and consequences (pp. 57-78). Logan: Utah State University Press.

Johnson, K. E. (1999). Understanding language teaching: Reasoning in action. Boston: Heinle & Heinle.

Landauer, T. K., Laham, D., & Foltz, P. W. (1999). The intelligent essay assessor: Applications to educational technology. Interactive Multimedia Electronic Journal of Computer-Enhanced Learning. Wake Forest University. Retrieved October 12, 2007, from http://imej.wfu.edu/articles/1999/2/04/printver.asp

Shermis, M. D., & Burstein, J. C. (Eds.). (2003). Automated essay scoring: A cross-disciplinary perspective. Mahwah, NJ: Lawrence Erlbaum Associates.

Vantage Learning. (2007). MY Access!® school edition: Because writing matters efficacy report. Retrieved November 21, 2007, from http://www.vantagelearning.com/school/research/myaccess.html

Warschauer, M., & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies: An International Journal, 3, 22-36.


Deborah Crusan, deborah.crusan@wright.edu, is associate professor of TESOL/applied linguistics at Wright State University, where she works with students in TESOL and composition and rhetoric. She has served as chair of the Second Language Writing Interest Section at TESOL and as a member of the 2010-2011 TESOL Nominating Committee, and is a past member of the Conference on College Composition and Communication Committee on Second Language Writing. Her primary research interest is the politics of assessment, particularly writing assessment. She has published about writing assessment in recognized journals in the field such as Assessing Writing, English for Specific Purposes, and Language Testing and has published chapters in The Norton Field Guide and edited collections about second language writing. Her book, Assessment in the Second Language Writing Classroom, was recently published by the University of Michigan Press.
