By Joshua Benton
McGill University professor David Harpp has come up with one of the more straightforward statistical methods for teasing out cheaters. Here’s how he might catch two students, Jack and Jill, who are copying answers off each other on a 100-question multiple-choice exam. Let’s say both got C’s on the test – Jack a 75, Jill a 72.
Step 1: Determine how many questions the students answered differently.
Jack and Jill missed roughly the same number of questions – but that doesn’t mean they missed the same questions. Let’s say that there were five questions Jack answered correctly that Jill missed; two questions that Jill got right that Jack missed; and three other questions that both missed but in different ways. That would equal 10 questions answered differently.
Step 2: Determine how many questions the students answered incorrectly and identically.
An examination of the answer sheets shows that Jack and Jill had 20 questions they both answered incorrectly – and in exactly the same way.
Step 3: Determine the ratio between the two numbers.
The magic formula is EEIC/D. That means “exact errors in common divided by differences.” In this case, that would be 20 exact errors in common divided by 10 differences – a ratio of 2.0.
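For readers who want to check the arithmetic, Steps 1 through 3 can be sketched in a few lines of Python. This is only an illustration, not Dr. Harpp's software: the function names, the made-up answer sheets (an answer key of all A's) and the handling of a pair with zero differences are assumptions chosen to reproduce the Jack and Jill numbers above.

```python
def compare_sheets(key, sheet_a, sheet_b):
    """Return (differences, exact_errors_in_common) for two answer sheets."""
    if not (len(key) == len(sheet_a) == len(sheet_b)):
        raise ValueError("all sheets must have the same number of answers")
    differences = 0
    exact_errors_in_common = 0
    for correct, a, b in zip(key, sheet_a, sheet_b):
        if a != b:
            # Step 1: any question the two students answered differently
            differences += 1
        elif a != correct:
            # Step 2: same answer from both students, and it is wrong
            exact_errors_in_common += 1
    return differences, exact_errors_in_common


def eeic_ratio(differences, exact_errors_in_common):
    """Step 3: exact errors in common divided by differences (EEIC/D)."""
    if differences == 0:
        # Assumption: the article does not say how to treat identical sheets
        return float("inf") if exact_errors_in_common else 0.0
    return exact_errors_in_common / differences


# Hypothetical 100-question exam whose correct answer is always "A"
key = ["A"] * 100
#       20 shared errors  3 both wrong   2 Jack only   5 Jill only   70 both right
jack = ["B"] * 20 + ["C"] * 3 + ["B"] * 2 + ["A"] * 5 + ["A"] * 70
jill = ["B"] * 20 + ["D"] * 3 + ["A"] * 2 + ["B"] * 5 + ["A"] * 70

d, eeic = compare_sheets(key, jack, jill)
print(d, eeic, eeic_ratio(d, eeic))
```

With these sheets Jack scores 75 and Jill 72, and the script reports 10 differences, 20 exact errors in common and a ratio of 2.0.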
In Dr. Harpp’s analysis, any ratio over 1.0 is considered highly suspicious. To decrease the chance of a false positive, a school could use a higher cut score, like 1.2 or 1.5. But at any of those settings, it looks like Jack and Jill were cheating.
Step 4: Determine the probability that students could produce such similar answers independently.
It’s possible that the professor who wrote the test simply did a bad job. If he wrote a few questions poorly, he might have unwittingly pushed many students to choose the same wrong answers – which could artificially inflate the ratio in Step 3.
So Dr. Harpp checks to make sure that the wrong answers selected by Jack and Jill were statistically unlikely – in other words, that most other students weren’t fooled into answering the same wrong way they did. That calculation (too complex to include here) produces a measure of how unlikely Jack and Jill’s answer patterns would be, based on how other students answered.
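The full probability calculation is not described here, but the basic sanity check behind it can be illustrated. The sketch below is not Dr. Harpp's method: it simply measures what share of the other students picked the same wrong answer that Jack and Jill shared, and flags questions where that share is large. The function names, the tiny three-question exam and the 50 percent threshold are all assumptions made for illustration.

```python
from collections import Counter


def share_of_class(class_sheets, question, option):
    """Fraction of the other students who chose `option` on `question`."""
    if not class_sheets:
        raise ValueError("need at least one other answer sheet")
    picks = Counter(sheet[question] for sheet in class_sheets)
    return picks[option] / len(class_sheets)


def popular_shared_errors(key, sheet_a, sheet_b, class_sheets, threshold=0.5):
    """List questions where the pair's shared wrong answer was also common.

    `threshold` is a hypothetical cutoff: a shared error chosen by at least
    this share of the class hints at a badly written question, not copying.
    """
    flagged = []
    for q, (correct, a, b) in enumerate(zip(key, sheet_a, sheet_b)):
        if a == b and a != correct:
            if share_of_class(class_sheets, q, a) >= threshold:
                flagged.append(q)
    return flagged


# Hypothetical three-question exam; the correct answer is always "A"
key = ["A", "A", "A"]
jack = ["B", "C", "A"]
jill = ["B", "C", "A"]
others = [
    ["B", "A", "A"],
    ["B", "A", "A"],
    ["B", "A", "A"],
    ["A", "A", "A"],
]

print(popular_shared_errors(key, jack, jill, others))
```

In this toy example, three of the four other students also chose "B" on the first question, so that shared error is flagged as a likely bad question. Nobody else chose "C" on the second question, so that shared error remains unexplained.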
If the calculation shows that the odds of the strange answer patterns occurring naturally are very small – about 1 in 30 million or longer, Dr. Harpp says – Jack and Jill will get called to the dean’s office.