Exploratory Testing vs. Traditional Testing: An Experiment for Measuring The Efficiency
Exploratory testing has been found to be cost-effective and to have good defect-detection efficiency, because it integrates the design, execution, and analysis of tests within a single testing session. At MeU Solutions, exploratory testing has been promoted to most of the current projects at the company level. This article reports an experiment measuring the efficiency of exploratory testing, its effectiveness compared to traditional testing, and the experiences of the testers who participated in the exploratory testing sessions at MeU.
The results show that exploratory testing has good effectiveness and higher efficiency than test case-based testing, measured both in the number of bugs found and in the variety of bug types. Additionally, the testers who joined the sessions saw benefits, especially in the debriefing session, where they could discuss, learn, and understand the application under test better.
Traditional Testing (Test Case-based Testing) is shaped by a simple model in which almost every test case is developed beforehand, covering all the possibilities identified from upfront requirements, specifications, and designs. After these test cases are baselined, they are executed many times later. Testers in this practice run their tests (execute test cases) and look for the same outcome each time. In contrast, Context-Driven Testing is a combination of skills, techniques, and documentation that depends on the specific situation. Testing is a solution to a given problem; it must be suitable to the context of the project, and therefore testing is a human activity that requires a great deal of skill to do well (from context-driven-testing.com).
Context-driven testing promotes a critical and creative thinking process about the application under test at a particular time in a specific context. By this nature, performing exploratory testing (an instance of context-driven testing) is somewhat different from the conventional approach. It requires exploratory testers to build test models and use them, with all their skills, heuristics, oracles, and practices, to explore, reason about, and evaluate what they test and to find the hidden things in it. With this approach, all tests are designed, executed, and reported in parallel, driven by rapid learning.
Considering the claims stated above about test effectiveness and efficiency, we decided to run an experiment to answer the following questions:
- Do testers performing manual functional testing with pre-designed test cases find more bugs than testers working with exploratory testing?
- How do the testers experience exploratory testing?
The rest of this article is structured as follows. In Part III, we introduce the team and project background. The results are described in Part IV. In the final Part V, we present the conclusions.
III. TEAM AND PROJECT BACKGROUND
The experiment was carried out at MeU, where six testers were divided into two teams. Each team performed manual functional testing on an HR system (built and customized from an open-source product).
The Red Team tested the Leave Management module (Module A) using Test Case-based Testing and then tested the Task Management module (Module B) using Exploratory Testing.
Conversely, the Blue Team first tested the Leave Management module (Module A) using Exploratory Testing and then tested the Task Management module (Module B) using Test Case-based Testing.
The software under test, used for the experiment, is open-source software that the testers did not have much prior knowledge of. Hence, Module A was selected so that the testers were familiar with the structure of the software from a technical perspective as well as with its specifications. Conversely, for Module B the testers were familiar with neither the technical structure of the software nor its specifications.
The testing is planned with 3 phases:
| Phase | Red Team | Blue Team |
| --- | --- | --- |
| Test Case Development | Develop test cases for Module A | Develop test cases for Module B |
| Testing Execution 1 | Test case-based testing for Module A | Session-based exploratory testing for Module A |
| Testing Execution 2 | Session-based exploratory testing for Module B | Test case-based testing for Module B |
In the Test Case Development phase, each tester designed and developed test cases in an Excel file and had to complete the test case development within the three days given. The source document for the test case design was the User's Guide, which was available online on the software's website.
All of these phases were executed in sequence, and each phase took three days. For test case-based testing, the goals and expected results are boiled down into a series of testing steps that the testers have to take. A tester reads the overall goal defined in the scenario, ensures that any qualifying conditions for the action are met, and then proceeds through the next steps. Exploratory testing in its session-based form, by contrast, allows testers to be method actors: they get into the head of a user for each session. A session is structured as follows:
| Session stage | Duration | Description |
| --- | --- | --- |
| Session setup | 5-15 min | Preparation before test execution |
| Test Execution & Product Learning | 60-90 min | Focused testing following the exploratory or test case-based approach; taking notes on all found defects |
| Bug Report | 10-20 min | Defect reports and test logs collected |
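To illustrate, a session in this structure can be modeled as a simple record that pairs a charter with timed stages. The sketch below is our own illustration in Python; the class and field names are hypothetical and not part of any tool mentioned in this article.

```python
from dataclasses import dataclass, field

# Recommended time boxes (in minutes) for each stage, per the table above.
STAGE_LIMITS = {
    "setup": (5, 15),
    "execution": (60, 90),
    "bug_report": (10, 20),
}

@dataclass
class Session:
    """One session-based testing session: a charter plus timed stages."""
    charter: str   # the mission guiding this session
    minutes: dict  # actual minutes spent per stage
    defects: list = field(default_factory=list)  # notes on found defects

    def within_limits(self) -> bool:
        # Check each recorded stage against its recommended time box.
        return all(
            STAGE_LIMITS[stage][0] <= spent <= STAGE_LIMITS[stage][1]
            for stage, spent in self.minutes.items()
        )

session = Session(
    charter="Explore leave-request approval flow for boundary dates",
    minutes={"setup": 10, "execution": 75, "bug_report": 15},
)
session.defects.append("Approval allowed for a leave request ending in the past")
print(session.within_limits())  # True: every stage fits its time box
```

A session log kept in this shape makes the debriefing concrete: the team can see at a glance which charters ran over their time box and which produced defects.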
At MeU Solutions, we were using One2Explore to perform exploratory testing. With One2Explore's capabilities, the exploratory testing was more effective: it kept track of all software behaviors together with their status and relevant data. It helped us structure, plan, track, report, and debrief the session in a single picture to ensure the goal was covered. One2Explore was integrated with One2Test (a testing dashboard) and JIRA to convey quality messages to key stakeholders: quality information such as bugs per feature, bugs per charter, numbers of charters (executed, planned, blocked), and the bug trend chart was visualized in a dashboard that helped us make decisions on product quality.
IV. RESULTS
In this section, we present the collected data and the results of the experiment based on our analysis of the data.
The overall number of defects found in the experiment is shown in the table below. In addition, two invalid and one duplicate defect were found by both teams; they were excluded to keep the data simple and easy to show on the charts.
| Red Team (TC→ET) | Blue Team (ET→TC) | Additional defects detected using ET |
| --- | --- | --- |
The defects were categorized by severity and labeled as follows:
- Trivial: no importance for the use of the software or potentially not a bug.
- Minor: little importance for the use of the software.
- Major: great importance for the use of the software.
- Critical: prevents the use of the software.
- Blocker: cannot move forward until this bug is fixed.
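Because these severity levels form an ordered scale, defect lists are easy to sort and filter by severity. A minimal sketch in Python (the enum and the sample defects are our own illustration, not data from the experiment):

```python
from enum import IntEnum

class Severity(IntEnum):
    """Defect severities, ordered from least to most important."""
    TRIVIAL = 1   # no importance for the use of the software
    MINOR = 2     # little importance for the use of the software
    MAJOR = 3     # great importance for the use of the software
    CRITICAL = 4  # prevents the use of the software
    BLOCKER = 5   # cannot move forward until fixed

# Hypothetical defects, as (summary, severity) pairs.
defects = [
    ("Typo on the leave balance page", Severity.TRIVIAL),
    ("Task list crashes on empty project", Severity.CRITICAL),
    ("Date picker ignores locale", Severity.MINOR),
]

# Triage: most severe first, since IntEnum members compare as integers.
triaged = sorted(defects, key=lambda d: d[1], reverse=True)
print(triaged[0][0])  # prints "Task list crashes on empty project"
```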
From the table below, we can see that ET found 80% more trivial defects, 50% more minor defects, 17% more major defects and 50% more critical defects.
Additionally, the defects were also categorized by type. From the table below, we can see that there are no radical differences in the number of Functionality defects. ET found 100% more Performance and Reliability defects, 100% more Security defects, and 100% more Enhancements than TC.
| Performance and reliability | 2 | 0 | 0% | 2 |
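A note on how we read these percentages: a figure like "100% more" appears to be the share of defects in a category that only ET found, i.e. (ET − TC) / ET. This interpretation is our inference from the Performance and Reliability row (ET found 2, TC found 0), not a formula stated in the source. A small sketch under that assumption:

```python
def additional_share(et_found: int, tc_found: int) -> float:
    """Percentage of a category's defects found only by ET.

    Assumes the article's "X% more" means (ET - TC) / ET * 100;
    this formula is our inference, not stated in the source.
    """
    if et_found == 0:
        return 0.0  # nothing found by ET, so no additional share
    return (et_found - tc_found) / et_found * 100

# Performance and reliability row above: ET found 2, TC found 0.
print(additional_share(2, 0))  # prints 100.0
```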
In summary, during the experiment, we observed that:
- The test cases showed roughly how much documentation had to be written to find 11 defects: 64 test cases were written, but 9 of them were not executed. Some test cases could easily have been combined into one but were kept separate to make a clearer distinction between the expected results. Besides, some test cases were so complex that large amounts of test data had to be prepared beforehand. This cost a lot of time that could otherwise have gone into developing new test cases, and in the execution phase the testers did not have enough time to run all the test cases, which may have led to missed defects in our experiment. For ET, 26 test charters were created to guide the testing, most of them high-level.
- When ET was applied to both Modules A and B, the lead in defects found by ET over TC was much greater for Module B than for Module A. As stated at the beginning, the testers were familiar with neither the technical structure of the software nor its specifications for Module B. Therefore, the Blue Team had to take more time to read the documentation, play with the software to understand it, and prepare test data, especially for complex requirements, so many cases were not covered by the test cases. In contrast, ET supported the testers in learning about the system while testing, and it enabled them to explore areas of the software that had been overlooked when the test charters were designed from the system requirements in the first round.
- Furthermore, in software testing, several other factors may have contributed to the outcome of the trial. These include, for example, the expertise of the testers, the support and guidelines provided for testing, the time reserved for a testing session, and the tools available for testing; moreover, this project was small, so it cannot tell the whole story about the testing world. For example, time pressure has been shown to increase efficiency in software engineering experiments, so the limited time and possible time pressure might have affected the effectiveness of testing.
- Too much exploratory testing may result in too much time spent discovering edge-case scenarios and time lost on regular testing.
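The documentation overhead described above can be quantified with a little arithmetic: 64 test cases written, 9 never executed, 11 defects found via test cases, versus 26 lightweight charters for ET. The figures below come from the experiment; the metric names are our own.

```python
# Test-case-based testing figures from the experiment.
written, not_executed, defects_tc = 64, 9, 11
executed = written - not_executed            # 55 test cases actually run

execution_coverage = executed / written      # share of written cases run
defects_per_case = defects_tc / executed     # defect yield per executed case

print(f"{execution_coverage:.0%} of written test cases were executed")
print(f"{defects_per_case:.2f} defects per executed test case")

# ET used 26 high-level charters instead of 64 detailed test cases.
charters = 26
print(f"TC needed {written / charters:.1f}x more written artifacts than ET")
```

Ratios like these make the cost comparison concrete: each unexecuted test case represents design effort that produced no defect information at all.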
V. CONCLUSIONS
Without a doubt, exploratory testing brings out the creative side of testers and is extremely useful in improving the quality of the product. As the goal of the article was to compare the cost efficiency of Traditional Testing (test case-based testing) with Exploratory Testing, the experiment attempted to point out the complexity of the test cases required to find relevant bugs and the numerous potential variables of the test cases to be taken into account. The time spent on testing and on writing the test cases, combined with test-case complexity, was used as a measurement to point out potential differences in effectiveness. Higher complexity and detail were expected to lead to increased time consumption, which would convert into higher software testing costs.