tips for scantron tests at ucla

20 December 2012

Throughout I will be using "scantron" as if it were not a proper noun. My sincerest apologies to the Scantron Corporation.

Given class sizes at a large public institution like UCLA, multiple-choice exams are an inevitability. While they cannot test learning as thoroughly as pure free-response exams, the feedback they provide is arguably more valuable, since no sleep-deprived, indifferent graduate student is grading with an eye to minimizing effort. Regardless, an improperly administered scantron exam is just as terrible as any other: time is spent manually poring over responses, attempting to figure out which answer is circled. Since many are unaware that UCLA provides services to make this easier, I'll mention them here for posterity.


The world's finest abbreviation.

The UCLA Office of Instructional Development, in addition to hosting webpages about each and every classroom on campus, runs a service called the Evaluation of Instruction Program Test Scoring Service (EIP TSS). This provides two useful things:

Use is pretty simple: submit a request for a grading appointment; go to the office and ask for an appropriate number of scantron forms; return for your appointment. Life is good.


EIP TSS will email you a file (or multiple files) containing results for all of the graded exams. Usefully, the results will contain each student's raw score; more usefully, the results will contain each student's list of responses, as well as any other information entered on the form.

Prior to grading, EIP TSS will request that you separate exams by version and submit a unique key for each; so if there are four versions of a final exam with 30 questions apiece, you will need to fill out 120 tiny bubbles without making a mistake. This is not difficult, but I have discovered that it's really chump work: since EIP TSS will email you the raw responses made by each student, you can just as well grade after the fact. Beyond saving time filling bubbles, this method has another advantage: if there is an error in the answer key, it can be fixed and grades adjusted without having to schedule a new scoring appointment.

client-side grading

Step 1: assign each version of the exam a unique 6-digit code. Require that students enter this code in the "SPECIAL CODE" section of the scantron form.
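
With only a handful of versions, the codes can simply be picked by hand. For many versions, a short script can produce them. The following is a minimal Python sketch, not part of the Perl script described in this post; the function name make_version_codes and the seed parameter are made up for illustration.

```python
import random

def make_version_codes(n_versions, seed=None):
    """Return n_versions distinct 6-digit codes as zero-padded strings.

    Hypothetical helper: the post only requires that each exam version
    gets its own 6-digit code, not that codes be generated this way.
    """
    if not 0 <= n_versions <= 10**6:
        raise ValueError("there are only 1,000,000 distinct 6-digit codes")
    rng = random.Random(seed)
    # sample() draws without replacement, so no code is repeated
    numbers = rng.sample(range(10**6), n_versions)
    return [f"{number:06d}" for number in numbers]

codes = make_version_codes(4, seed=2012)
print(codes)
```

Passing a fixed seed makes the list reproducible, which is handy if the codes need to be regenerated when writing the key file.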

Step 2: bring the stack of exams to OID for your grading appointment; fill out a single dummy key where all answers are A (or B, or random; it doesn't matter). It is also possible that an all-blank key will work, which would save you a fair amount of time; I have not tested this. Request that a tab-delimited data file be emailed to you.

NOTE: the file that is emailed to you will contain scores for all students. Remember that these scores were generated using a fake answer key, and they should not under any circumstances be uploaded to Gradebook.

Step 3: download this Perl file:; be sure to remove the .txt suffix. If necessary, download a Perl interpreter (if you use a Mac you should be okay). I appreciate that Perl requires an interpreter and as such is kind of a pain; however, this is a much more portable system than compiling and distributing up-to-date binaries for the various systems graduate students run.

Step 4: create a key file. As an example, a recent exam we gave had the following entered into a file named key.txt:

/000000/ badec decad  eabba dcccd  bebab c[ABCD]CDA
/333333/ dcbbb deeda  ccabb ccded  aadbe d[ABCD]ACD
/666666/ deeaa abddb  ccebb aabcd  baccc e[ABCD]DCA
/999999/ bacdd eabde  bcadd eaaad  bbadd b[ABCD]CAB

The first entry /[digits]/ says that any exam with a special code which matches [digits] should be assessed according to the key which follows. If you like regular expressions, they will work here; otherwise, just enter the entire exam code.

The remaining entries — following the exam code and a space — specify the version key; whitespace here is ignored, so keep it in for legibility. In this particular key, the answer to question 1 of version 000000 was B, the answer to question 2 was A, etc. Capitalization is not important. Notice that things are funny in question 27: the bracket says that all contained answers are acceptable, hence answering A, B, C, or D to question 27 results in a point. Of course, this can be specified for any question.

If you'd like to skip a question — say, if the exam is discovered to have a critical fault — enter a - in place of an actual letter. The grading algorithm will then ignore all responses to this question.
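
To make the key format concrete, here is a minimal Python sketch of how such a line could be interpreted and used for scoring. It is an illustration under stated assumptions, not the Perl script itself: the names parse_key_line, score, and grade are hypothetical, the response string is assumed to hold one character per question, and the exam code is assumed to be matched with a regular-expression search.

```python
import re

def parse_key_line(line):
    """Split one key line into (compiled code pattern, per-question answers).

    Each answer is a set of accepted letters, or None for a skipped ("-")
    question.
    """
    match = re.match(r"\s*/(.*?)/\s+(.*)$", line)
    if match is None:
        raise ValueError(f"not a key line: {line!r}")
    code_pattern = re.compile(match.group(1))
    # Whitespace in the key is ignored and capitalization is unimportant
    body = re.sub(r"\s+", "", match.group(2)).upper()
    tokens = re.findall(r"\[[A-Z]+\]|[A-Z]|-", body)
    if "".join(tokens) != body:
        raise ValueError(f"unexpected characters in key: {line!r}")
    answers = [None if t == "-" else set(t.strip("[]")) for t in tokens]
    return code_pattern, answers

def score(responses, answers):
    """Count questions whose response is among the accepted answers."""
    total = 0
    for given, accepted in zip(responses.upper(), answers):
        if accepted is not None and given in accepted:
            total += 1
    return total

def grade(special_code, responses, keys):
    """Score against the first key whose pattern matches the special code.

    Returns None when no key matches (assumed behaviour for this sketch).
    """
    for code_pattern, answers in keys:
        if code_pattern.search(special_code):
            return score(responses, answers)
    return None

key_lines = [
    "/000000/ badec decad  eabba dcccd  bebab c[ABCD]CDA",
    "/333333/ dcbbb deeda  ccabb ccded  aadbe d[ABCD]ACD",
]
keys = [parse_key_line(line) for line in key_lines]

# A perfect paper for version 000000, answering D on question 27
print(grade("000000", "BADECDECADEABBADCCCDBEBABCDCDA", keys))
```

Representing every question as a set of accepted letters means the ordinary single-answer case and the bracketed case go through the same code path, and a skipped question is just a None that never earns or costs a point.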

Step 5: go to MyUCLA and download the full class roster as a text file (the "Download Roster" link on the class roster page).

Step 6: run the Perl file, using the key you generated and the results that OID emailed to you. An example run is:

>perl -f -r classroster.tsv
  -k key.txt -d "\t" 1>

To break it down:

For help, run it as perl -h.

Step 7: examine output for errors. If the script determines something amiss (a misspecified version number or a faulty student ID), it will warn you with "ALERT: [xxx]". To fix these, I recommend saving a copy of the OID file and modifying it; for example, if the script says that it cannot locate the student ID "111-111-112" and you know that a particular student has ID "111-111-111", you should change the OID results so that the student's ID matches the known ID. People mis-enter identifying information far more often than is appropriate.
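
Tracking down which roster entry a mis-bubbled ID belongs to can be tedious by eye. The following Python sketch shows one possible way to narrow the search: list roster IDs that differ from the unknown ID in at most one digit. It is a hypothetical helper, not a feature of the Perl script, and the names digits_only and near_matches are made up here.

```python
def digits_only(student_id):
    """Strip dashes and spaces so '111-111-112' compares as '111111112'."""
    return "".join(ch for ch in student_id if ch.isdigit())

def near_matches(unknown_id, roster_ids, max_diff=1):
    """Return roster IDs that differ from unknown_id in at most max_diff digits."""
    target = digits_only(unknown_id)
    hits = []
    for candidate in roster_ids:
        digits = digits_only(candidate)
        if len(digits) != len(target):
            continue  # a dropped or extra digit is not handled by this sketch
        differing = sum(a != b for a, b in zip(digits, target))
        if differing <= max_diff:
            hits.append(candidate)
    return hits

roster = ["111-111-111", "222-222-222", "111-111-122"]
print(near_matches("111-111-112", roster))
```

A single candidate is a strong hint, but several candidates (as in this roster) still need to be checked against the name on the form before the OID file is edited.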

Once the errors are fixed, return to Step 6 and repeat until no [fixable] errors arise; any students who did not take the exam will show up as errors, but this isn't a problem that you can really fix.

Step 8: once all errors are cleared, take the returned Gradebook dump (in this case, and upload it.

Job done! 8 steps, but they go quickly once you've learned them. Since I do this two or three times per quarter, I'm reaping pretty steady dividends.

...Step 9 (new): in the event that your exam has errors and you need to determine whose exams to regrade (for example, the proper answer to a question is , but both and “None of the above” are possible responses), use the help documentation to explore the -L parameter. In short, -L [version] [question] [answer] will dump a list of people with test version [version] who responded [answer] on question [question] (in addition to performing the regular grading duties); for example, if C is not a correct response to question 15 of version 000000 but could be construed as one, adding -L 000000 15 c will give you a list of students with exams you should re-check. You can then go through and read the exams, checking to see if their response of C is justified or just wrong. In the above case, this amounted to seeing if people wrote something like, “1/6=0.17, therefore none of the above.”
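
The idea behind that listing is just a filter over the response data. As a rough Python sketch of the same logic (an illustration, not the script's actual implementation), assume each student record is a dictionary with the hypothetical keys "id", "code", and "responses", where "responses" holds one character per question:

```python
def list_responders(records, version, question, answer):
    """Return IDs of students with the given version who gave `answer`
    on the given question (questions are numbered starting at 1)."""
    wanted = answer.upper()
    hits = []
    for record in records:
        responses = record["responses"].upper()
        if record["code"] != version:
            continue
        if len(responses) < question:
            continue  # form has no response recorded for this question
        if responses[question - 1] == wanted:
            hits.append(record["id"])
    return hits

# Made-up records for illustration only
records = [
    {"id": "111-111-111", "code": "000000", "responses": "BADECDECADEABBC"},
    {"id": "222-222-222", "code": "000000", "responses": "BADECDECADEABBA"},
    {"id": "333-333-333", "code": "333333", "responses": "DCBBBDEEDACCABC"},
]
print(list_responders(records, "000000", 15, "c"))
```

Only the first record is listed here: the second answered A on question 15, and the third answered C but sat a different version.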

If it turns out that a question is incredibly unclear, it is probably more useful to allow all reasonable answers instead of only the correct one — key [cd] instead of d — or to simply skip grading the question altogether — key - instead of d.


Since there is no manual labor involved in sorting the exams, and no manual labor involved in scribbling down a hard copy of the key, there is essentially no cap on the number of tests which can be submitted for grading. In particular, a course with 300 students could have 300 unique exams produced and automatically graded: a true deterrent against cheating.

As it stands now, creating these 300 exams is fairly time-consuming. In a later post, I'll script up that process as well; then the entire thing can be automated and we proctors can ease off the death stares a little bit.
