More recently, the former MIT professor teamed up with some students to create BABEL, a computer program that can create gibberish essays that other computer programs score as outstanding pieces of writing.
Robo-scoring fans like to reference a 2012 study by Mark Shermis (University of Akron) and Ben Hamner, in which computers and human scorers produced near-identical scores for a batch of essays. The full dismantling is here, but the basic problem, beyond the methodology itself, was that the testing industry has its own definition of what the task of writing should be, one that is more about a performance task than an actual expression of thought and meaning.
It's cheap, it's quick, and it makes it easy to hoover up a ton of data about each student.
But that sort of automated scoring only works reliably for bubble tests, assessments based on superficial objective questions.
Automation has always broken down when it comes to machine-scored writing. Pearson released a white paper entitled "Pearson's Automated Scoring of Writing, Speaking, and Mathematics" way back in 2011, and hardly a year goes by that some media outlet doesn't publish an article along the lines of "Whizbang Corporation Announces Computers Can Grade Essays." It's never true, but the dream is so beautiful that test manufacturing companies can't stop trying.
It is better done than most, but it still highlights all the reasons computers are not ready to grade your student essays (even if you live in a state where it's already happening).

In other words, rather than trying to make software recognize good writing, we'll simply redefine good writing as what the software can recognize. In states like Utah and Ohio, where robo-grading is already in use, we can expect to see more bad writing and more time wasted teaching students how to satisfy a computer algorithm rather than how to develop their own writing skills and voice and become better communicators with other members of the human race.

Says the senior research scientist at ETS, "We'll continue to see year after year companies putting out PR to claim they've totally got this under control, but until they can put out a working product, it's all just a dream."

For some education reformers, the dream is to have the computer grade everything. Fans of robo-graders like the one in the NPR piece talk about how the AI can "learn" what a good essay looks like by being fed a hundred or so "good" essays.
There are at least two problems with that. The first is that somebody has to pick the 100 exemplars, so hello again, human bias. The second is that it narrows the AI's view by defining a good essay as one that looks a lot like these other essays.

The point is not that robo-graders can't recognize gibberish; the point is that their inability to distinguish between good writing and baloney makes them easy to game. And the people selling this baloney can't tell the difference themselves. That's underlined by a horrifying quote in the NPR piece.
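To make the gaming problem concrete, here is a toy sketch, purely hypothetical and not any vendor's actual model, of an exemplar-trained scorer that rewards surface features. The feature set, the 0-6 scale, and the sample texts are all invented for illustration. Feed it wordy gibberish with the right statistical profile and it outscores a short, perfectly clear passage.

```python
# Toy sketch of exemplar-based essay scoring (hypothetical; not any
# vendor's actual algorithm). It "learns" good writing as the average
# surface profile of a set of exemplar essays, then scores new text by
# how closely it matches that profile.

def surface_features(text):
    """Crude proxies: word count, average word length, big-word ratio."""
    words = text.split()
    return (
        len(words),
        sum(len(w) for w in words) / len(words),
        sum(len(w) >= 8 for w in words) / len(words),
    )

def train(exemplars):
    """'Learn' good writing as the mean feature vector of the exemplars."""
    feats = [surface_features(e) for e in exemplars]
    return tuple(sum(col) / len(col) for col in zip(*feats))

def score(text, profile):
    """Rate 0-6 by closeness to the learned profile (higher = closer)."""
    diff = sum(abs(a - b) / b for a, b in zip(surface_features(text), profile))
    return round(max(0.0, 6.0 - 2.0 * diff), 1)

# A stand-in "good essay" exemplar (repeated out to essay length).
good_essay = ("the considered argument demonstrates substantial "
              "understanding of the historical context ") * 10
profile = train([good_essay])

# BABEL-style gibberish with the same length and lots of big words...
gibberish = ("profundity orchestrates paradigmatic exegesis notwithstanding "
             "hermeneutic oscillation of the quixotic ") * 10
# ...versus a short, plainly written, perfectly clear sentence.
plain = "dogs are loyal pets and they help people feel happy every day"

print(score(gibberish, profile))  # the gibberish lands near the top of the scale
print(score(plain, profile))      # the clear sentence lands near the bottom
```

The scorer never reads meaning at all; anything that statistically resembles the exemplars is "good," which is exactly the opening a BABEL-style generator exploits.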