Disclaimer: Opinions expressed are solely my own and do not represent the views or opinions of my employer.
High-stakes tests have a reputation for making an interesting subject boring. Test items tend to contain the bare minimum of information needed, they lack interesting characters, and they are devoid of humor. Sometimes our item writers try to liven things up a little, but we just make the items boring again. Are test developers just dull and humorless people who don’t care about keeping students interested?
Well, possibly, but there are actually valid reasons for keeping test items “boring.” First, remember that test items are made to test particular knowledge, skills, or abilities. We don’t want anything to interfere with that measurement. Many of the things that might make an item interesting would also make it longer (more interesting details take more words), which might impose a heavier reading load. Unless the item is testing reading ability, we try to keep the reading as simple and short as possible.
“Humanizing” an item by giving people names or backgrounds introduces bias, which might also interfere with measurement. Consider the question: “Which action by an RN is an appropriate way to integrate spirituality into clinical practice?” Now suppose the RN had a name. Test-takers might approach the task with different assumptions depending on the perceived gender or ethnicity of the nurse’s name. Those assumptions might affect the answer test-takers choose, or at least the amount of time they spend thinking about it.
Another reason for keeping test items boring is test security. In high-stakes testing, most items are used for more than one administration. Therefore, it is crucial to keep those who take the test at one time from sharing information with others who might take it later. Of course, most test-takers are honest and would not deliberately disclose secure test items, but when an item is funny or particularly interesting, it is not only much easier to remember, but also harder to resist sharing. The result is a higher probability that people taking the test will have knowledge of the item and will be answering it based on memory rather than a full understanding of what is being tested.
Finally, people tend to think of high-stakes tests as serious undertakings. They can have a significant impact on test-takers’ lives, and the public does not take kindly to what it may see as trivializing or making light of the test. A well-known example is a reading passage from a 2012 New York State test for eighth-graders that generated so much outrage that it had to be pulled from the test. The passage was a humorous story about a race between a hare and a pineapple. The controversy began with people paraphrasing the passage from memory on anti-testing comment boards (an example of the problem of being memorable) and ballooned into outrage that something so silly would be used on an important test. (See http://ideas.time.com/2012/05/04/what-everyone-missed-on-the-pineapple-question/ for a summary of the controversy.)
So though they may seem “boring,” test items are written that way on purpose. Providing good measurement is more important than providing interesting reading.
Posted on December 13, 2017 in Assessment / Competency.
About Mika Hoffman, PhD.
Expertise: Assessment / Competency
Mika Hoffman is Executive Director for Test Development for the Center for Educational Measurement at Excelsior College. She came to Excelsior from the Department of Defense, where she served as the Dean of the Test Development Division at the Defense Language Institute Foreign Language Center in Monterey, California, managing the high-stakes Defense Language Proficiency Testing program. She earned a B.A. with High Honors from Swarthmore College, an M.A. with Distinction in the Teaching of Foreign Languages (French) from the Monterey Institute of International Studies, and a PhD in Linguistics from the Massachusetts Institute of Technology.