Does Pairwise Testing Really Work? Evidence, Data, and Case Studies

This lesson presents empirical evidence from multiple real-world projects that compared the effectiveness of Hexawise-generated tests with that of manually selected tests. The data shows that, compared with manually selected test scripts, Hexawise-generated tests are faster to create, more thorough, and more efficient.

IEEE study of 10 real-world projects: 30-50% faster to create tests

Any tester familiar with Hexawise will confirm that generating tests with the push of a button is much faster than selecting and documenting tests by hand. Results from a four-month study at banks and insurance firms confirmed that the time saved in creating tests is typically 30-50%.

IEEE study of 10 real-world projects: More than twice as many defects detected per tester hour

A 10-project study published in IEEE Computer found that testers using Hexawise-generated tests found an average of 2.4 times as many defects per tester hour as they did using tests manually selected by experienced testers.

While testers are often surprised that the benefits from packing more coverage into fewer tests can be so large, hundreds of additional projects have shown similar results.

IEEE study of 10 real-world projects: Consistently more thorough testing coverage achieved

The same IEEE Computer study also found that using Hexawise-generated tests increased testing thoroughness every time. Testers using Hexawise consistently found more defects. On average, the much smaller quantity of Hexawise-generated tests found 13% more defects than the larger quantity of hand-selected tests. To get an "apples to apples" comparison, both sets of tests were designed to test (a) the same systems (b) at the same times.

The chart above shows the testing thoroughness of 21 Hexawise-generated tests (in green) compared to the thoroughness of 51 real tests used by a financial services firm (in blue).

8-project BCBS study: Testers using Hexawise created tests in much less time.

Testers at BCBSNC found that once they put their test inputs into Hexawise, creating tests was much faster than selecting and documenting tests by hand.

For example, one tester who had recently spent more than one full day putting test cases together by hand had attended her first Hexawise training session shortly afterward. During that training session, she generated a powerful set of tests with Hexawise in less than an hour. Her Hexawise-generated tests were also more thorough than the previous manual tests.

8-project BCBS study: Hexawise-generated tests found three times as many defects per tester hour.

In a project involving insurance claims testing, BCBSNC testers had already selected 48 test scripts to execute. Another tester used Hexawise to create a smaller set of 16 tests for the same system that packed as much coverage as possible into each optimized test. Both sets of tests were executed. They revealed the same two defects, but the Hexawise tests took only a third as long to execute, which works out to three times as many defects found per tester hour.

This result was repeated in 4 other projects: on average, testers required only one third as many Hexawise-generated tests as manually selected tests to achieve the same level of testing thoroughness.
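
The arithmetic behind the "three times" figure in the insurance claims example can be sketched in a few lines. This is only an illustration, not part of either study: it assumes every test takes roughly the same time to execute (consistent with the 16 Hexawise tests taking a third as long as the 48 manual tests), and the function name is hypothetical.

```python
from fractions import Fraction


def defects_per_test(defects_found: int, tests_executed: int) -> Fraction:
    """Defects found per executed test, kept as an exact fraction."""
    return Fraction(defects_found, tests_executed)


# Figures from the insurance claims project described above.
manual = defects_per_test(defects_found=2, tests_executed=48)
hexawise = defects_per_test(defects_found=2, tests_executed=16)

# Assumption: execution time is proportional to the number of tests,
# so this ratio also approximates the gain in defects per tester hour.
improvement = hexawise / manual

print(f"Manual:      {manual} defects per test")
print(f"Hexawise:    {hexawise} defects per test")
print(f"Improvement: {improvement}x")
```

Because both sets found the same two defects, the gain comes entirely from executing fewer tests.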

8-project BCBS study: 69% reduction in the number of required tests

Multiple projects confirmed that the hoped-for efficiency benefits from Hexawise were in fact consistently achieved on real-world projects. Testers using Hexawise were able to create small sets of unusually powerful tests. On average, testers needed only about one third as many tests (a 69% reduction) to achieve the same level of testing thoroughness.

8-project BCBS study: Hexawise-generated tests were consistently much more thorough than prior testing methods.

The charts above show the thoroughness of two different sets of tests designed to test the same system. The coverage of the 58 Hexawise tests (in orange) is far superior to the thoroughness of the 72 manual tests recently used by BCBSNC’s test automation team (in blue). The slope of the Hexawise coverage chart also shows how Hexawise front-loads coverage to find defects as early as possible.

There were 12,088 pairs of values within this system to be tested. The optimized Hexawise tests tested all of them. The original BCBSNC tests, however, had failed to test more than 5,000 of those pairs. In other words, the Hexawise tests closed more than 5,000 small gaps in coverage while using fewer tests.
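
To make the idea of "pairs of values" and coverage gaps concrete, here is a minimal sketch of how pairwise coverage of a test set could be measured. It is not Hexawise's implementation. The parameters, values, and function names are hypothetical, and the toy system has only 12 pairs rather than 12,088.

```python
from itertools import combinations, product


def all_pairs(parameters):
    """Every pair of values (from two different parameters) that could be tested."""
    pairs = set()
    for (name_a, values_a), (name_b, values_b) in combinations(sorted(parameters.items()), 2):
        for value_a, value_b in product(values_a, values_b):
            pairs.add(((name_a, value_a), (name_b, value_b)))
    return pairs


def coverage(parameters, tests):
    """Return the cumulative coverage after each test and the pairs never tested."""
    target = all_pairs(parameters)
    seen = set()
    curve = []
    for test in tests:
        # Each test (a dict of parameter -> value) exercises one pair per parameter pair.
        seen |= set(combinations(sorted(test.items()), 2)) & target
        curve.append(len(seen) / len(target))
    return curve, target - seen


# Hypothetical toy system: 3 parameters with 2 values each gives 12 pairs.
parameters = {
    "plan": ["HMO", "PPO"],
    "claim": ["dental", "vision"],
    "state": ["NC", "SC"],
}

# A pairwise-optimized set: 4 tests cover all 12 pairs.
optimized_tests = [
    {"plan": "HMO", "claim": "dental", "state": "NC"},
    {"plan": "HMO", "claim": "vision", "state": "SC"},
    {"plan": "PPO", "claim": "dental", "state": "SC"},
    {"plan": "PPO", "claim": "vision", "state": "NC"},
]

# A hand-picked set that never varies the state, leaving gaps.
manual_tests = [
    {"plan": "HMO", "claim": "dental", "state": "NC"},
    {"plan": "HMO", "claim": "vision", "state": "NC"},
    {"plan": "PPO", "claim": "dental", "state": "NC"},
    {"plan": "PPO", "claim": "vision", "state": "NC"},
]

optimized_curve, optimized_gaps = coverage(parameters, optimized_tests)
manual_curve, manual_gaps = coverage(parameters, manual_tests)

print("Optimized coverage after each test:", optimized_curve)
print("Manual coverage after each test:   ", manual_curve)
print("Pairs the manual tests never cover:", len(manual_gaps))
```

The coverage curve returned for each set is the same kind of data plotted in the coverage charts: a set that climbs quickly and reaches 100% front-loads its coverage, while a set that plateaus below 100% leaves untested pairs behind.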

Consistent findings from more than 3,500 testers using Hexawise at our largest client: "Hexawise just plain works."

Our largest client has more than 3,500 testers designing tests with Hexawise. The vast majority of those testers decided to sign up one-at-a-time to use their company's unlimited-use enterprise license of Hexawise. That says a lot. In other words, the number of Hexawise users did not grow from dozens to hundreds to thousands of users because of a top-down mandate that imposed a new tool on testers. The testers had to individually choose to sign up for their Hexawise licenses.

If you ask the hundreds of new testers who sign up each month why they decided to create their Hexawise accounts, do you know what they will tell you? Their most popular answer, by an overwhelming margin, is that they heard good things about Hexawise from other testers! Enthusiastic word of mouth recommendations for Hexawise drove global adoption throughout the firm for the obvious reasons that Hexawise works well and it is enjoyable to use. More specifically:

Software testers recommending Hexawise say that it really helps them design tests faster, execute fewer, more powerful tests, and eliminate many of the tedious, error-prone steps that manual test selection used to involve.
Managers recommend Hexawise to other managers for similar reasons: they see Hexawise helps get higher quality products to market in less time. They see fewer, more powerful tests created faster; they receive more objective and insightful reporting on testing coverage achieved. They are better able to assess "how much testing is enough?"
