At some point in our lives, we’ve all been faced with tackling multiple-choice questions, whether in a high school math test, a survey following a big purchase, or an online quiz to determine which movie best suits our preferences. The applications for this style of assessment are wide-ranging; in fact, multiple-choice questions are used heavily in learning evaluation because the results are not subjective and scoring can be automated. At HC, we have written thousands of multiple-choice questions, or test items as we call them, for certification preparation over the years, and we know that the quality of testing is critical to learning retention.
At the core of it all, learners highly value practice exams and want plenty of practice prior to taking their official certification exam. Great questions help them feel prepared. Mediocre or confusing questions can create a false sense of security, or of hopelessness; they waste students’ time and leave an overall bad impression.
However, many people assume the item writing process is easier than it actually is. We’ve seen this firsthand when providing item writing training seminars for subject matter experts (SMEs): most leave with a much greater respect for, and understanding of, the difficulty of item writing than they had when they started.
Items can be written by staff writers, contractors, or SMEs. While staff writers can be expected to improve their item writing quality over time, contractors and volunteer SMEs often face a steep learning curve: they must learn item writing and produce items in a short period of time. Getting volunteers or temporary project team members up to speed quickly requires a clearly defined process and expert item writing training. Any of these parties could also use AI to get the process started, given good enough prompts and a willingness to share the source content with AI, but getting good end results still requires a human touch that can discern good questions from bad and edit toward that goal.
So how can we ensure we’re getting quality items? The most important thing to teach new item writers is that good items test key points in the content or learning objectives. Encourage writers to explicitly indicate which learning objective a new item tests and where in the source material it can be found. Item writers can be assigned a specific area to ensure coverage of the material is in line with the exam weightings, and a peer review of items with constructive criticism will shorten the group’s learning curve.
In our experience, a great foundation for streamlining this process is to organize items by Bloom’s taxonomy level. The six levels shown below may be more than someone new to item writing needs, so we’ve broken down each level along with the pros and cons of items written at it. From most basic to most abstract, Bloom’s taxonomy levels are:
- Knowledge (recall): Item writers often follow the path of least resistance. Without guidance, they will primarily write recall questions, many along the lines of “What is the definition of X?” This kind of item is obviously low quality, is unlikely to test a learning objective, and is better handled with flashcards. However, recall items can be a great starting point for building item writing confidence before moving to the desired higher question levels. Once that confidence is established, minimize the “only recall” tendency by instructing writers to focus on the learning objectives and to exhaust all options for higher question levels before writing recall items.
- Comprehension (explain): Comprehension questions show that the learner understands the material. Writing these questions is not much harder than writing recall questions, but the quality tends to be much higher. A question such as “What are the consequences of X?” demonstrates comprehension, though it might still be answerable from a single comment to that effect in the content.
- Application (solve): Application items require the learner to apply principles or decision rules. These questions are difficult to write because they typically require creating a scenario to which the principle applies. A good scenario question will be easier for an SME to write thanks to their on-the-job experience, but they may need to collaborate with an experienced item writer to ensure the incorrect answers are plausible yet clearly wrong. To get more value out of each scenario, consider writing one scenario to which several questions apply.
- Analysis (compare/contrast, calculate): This level of question is often easier to write than the prior level. Compare-and-contrast questions are straightforward because those discussions usually appear directly in the text. Calculation questions are also straightforward to write: you provide the inputs and ask for the result. However, check whether a student who rounds values at each step arrives at a different answer than one who rounds only at the end (see the quick illustration after this list).
- Synthesis (create, plan) and Evaluation (judge, predict): There is no need to instruct writers to target these two levels. A scenario-based item that asks the learner to select the best plan or apply situational judgement might rise to the synthesis or evaluation level, but drawing that distinction is unnecessary.
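To make the rounding pitfall concrete, here is a minimal sketch with hypothetical numbers (not taken from any real exam); it simply shows how rounding at each step can land a student on a different answer choice than rounding only at the end.

```python
# Hypothetical illustration: the same calculation can produce different
# answer choices depending on when the student rounds.
intermediate = 10 / 3                                    # 3.3333...

# Student A keeps full precision and rounds only at the end.
round_at_end = round(intermediate * 3, 2)                # 10.0

# Student B rounds the intermediate value to two decimals first.
round_each_step = round(round(intermediate, 2) * 3, 2)   # 3.33 * 3 = 9.99

print(round_at_end, round_each_step)                     # 10.0 9.99
```

If both 9.99 and 10.00 appear among the answer choices, a student who rounded early could be marked wrong despite sound reasoning, so either accept both values or state in the question when to round.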
Now that we’ve seen the types of items in action, another way to simplify getting item writers up to speed is to focus on what to avoid. We guide our item writers to steer clear of items that are:
- Too obvious. Ask yourself: if you hadn’t read the materials, would the correct answer still be apparent?
- Dissimilar in answer lengths or styles. Often the correct answer is longer than the incorrect answers because it needs to specify exactly when it is and is not true. Some of that qualification can be moved into the question stem, or the incorrect answers can be given similar-sounding qualifiers.
- Too finely detailed. Check difficult items against the learning objectives to confirm they are testing at that level rather than on a much more detailed point.
- Trying to do too much. Break longer questions down and test on fewer parts of a process.
- Testing on data that changes. The best items apply principles rather than testing on data that changes from year to year, which is merely recall and creates ongoing test maintenance.
- Using made-up terms that seem plausible. Incorrect answers that are synonyms of the correct answer amount to trick questions. Instead, use terms from elsewhere in the materials: while incorrect, they seem plausible because the learner encountered them while reading.
- Phrased as “all of the following are true, except…”. These items are essentially a double negative, which confuses learners. Item writers tend to reach for them when they cannot think of three plausible incorrect answers; AI can be a useful brainstorming aid here.
- Giving more than one correct answer. Writing plausible yet incorrect answers is the true skill of item writing, and multiple defensible answers are an especially common risk in scenario-based questions. If the strategy is to deliberately include answers that are technically true but not the “best” choice (i.e., to test at a higher level), know that this will be a source of contention. In addition to careful SME review, provide a rationale for why the correct answer is best.
While the item writing process can seem daunting, if you keep these basic principles in mind, you’ll likely find your team creates better items and, in turn, more effective learning and retention. And if you don’t have the resources to reach your certification preparation or professional development program goals, we’ve got solutions that can help. Check out what we can do at HolmesCorp.com.