Heuristic Evaluations vs. Usability Testing
Part II
by Dr. Bob Bailey
March, 2001
I received many emails and had many discussions after last month's article, in which I suggested that heuristic evaluations have limited usefulness in the design of systems.
Here is my main problem with most of the heuristic evaluations that I have seen over the past several years. There are some influential people in the user experience community who have convinced many others in the community, and most people outside the community (managers, system designers, programmers, etc.), that heuristic evaluations (and other low-level evaluation techniques) are good enough on their own for achieving acceptable levels of human performance. I do not believe that this is true!
Jakob Nielsen and (very few) others studied heuristic evaluations for a few years in the early 1990s. Since then, Nielsen has summarized many of his findings in his books. He always assumed that heuristic evaluations were a valid way to identify usability problems. He never (not once) tried to find out how good they really were when compared against actual usability problems. Nevertheless, through his large number of publications, he has convinced just about everybody that heuristic evaluations are a cheap way to build effective systems "at a discount"!
A good evaluation of human-computer interactions is very difficult to do and requires considerable expertise, and in many cases the final payoff can be very low. As I mentioned last month, heuristic evaluators can end up missing some serious problems, and can cause designers to go to the expense of making many changes that make no difference at all in performance or preference. In fact, some of the new design changes most likely will introduce new usability problems.
I believe that the research is clear on this, but unfortunately, it is not consistent with trying to design and develop websites quickly in a real-world environment. This poses a serious problem for usability specialists. Computer professionals believe that they are getting much more from the typical heuristic evaluation than they really are. This is important because it is only after they see the fallacy of relying so heavily on heuristic evaluations that they will change the way they approach usability in systems. We cannot move forward, we cannot create more usable systems, as long as most designers and managers believe that heuristic evaluations are providing more information than they really are.
I suspect that the limitations of heuristic evaluations are understood by few, and for this reason they will continue to be used almost exclusively as the main test for system usability. Unfortunately, it is like debugging a program by looking through the program code, rather than running the code on a computer. Reading through code for problems is fast, and works sometimes, but none of us would rely on it as the primary means of finding coding errors. But that is exactly what we do with heuristic evaluations.