How Rainforest's algorithms evaluate test results

The Rainforest algorithm is constantly evaluating and learning from collected behavior and results. Rainforest recruits at least 3 testers for every test. Each tester’s actions—such as mouse activity, time spent on each step, or correct site navigation—are all monitored by our system and compared to one another to ensure consistency.

How are Acceptances and Rejections determined?

Based on our algorithm and overall evaluation of a result, we will either Accept or Reject the test results of each tester. Accepted and Rejected results are not the same as the "Pass" or "Fail" results for test steps. Instead, this is the means by which we determine our confidence in the accuracy of the submitted result.

  • Results determined to be correct are Accepted and are taken into account when reporting the final state of the test (Pass/Fail).
  • Results that cannot be accepted with confidence are Rejected. These results are not taken into account in the final (Pass/Fail) determination, but the result and the reason for rejection are still displayed for you to evaluate.
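As a minimal sketch of how this split works, assuming a simple dict shape for each tester result (the field names are illustrative, not Rainforest's actual API):

```python
# Hypothetical sketch: only Accepted results feed the final Pass/Fail state;
# Rejected results are set aside but still shown with their rejection reason.

def split_results(tester_results):
    counted = [r for r in tester_results if r['status'] == 'accepted']
    displayed_only = [r for r in tester_results if r['status'] == 'rejected']
    return counted, displayed_only
```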

Why do some tests have more than 3 assigned testers?

If a result is rejected, we may recruit additional testers until we have a consensus on whether the test has passed or failed. Although it is not typical, tests can sometimes recruit up to 12 testers.
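That recruiting loop could be sketched like this. The minimum of 3 and the cap of 12 come from the text above; the function names and the 'undecided' signal are assumptions for illustration:

```python
# Hypothetical recruiting loop: start with 3 testers, then add more one at
# a time until the evaluation reaches a verdict or the cap of 12 is hit.

MIN_TESTERS = 3
MAX_TESTERS = 12

def run_until_decided(evaluate, recruit):
    """evaluate(results) -> 'passed' | 'failed' | 'undecided';
    recruit() -> one new tester's result."""
    results = [recruit() for _ in range(MIN_TESTERS)]
    while evaluate(results) == 'undecided' and len(results) < MAX_TESTERS:
        results.append(recruit())
    return evaluate(results), len(results)
```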

Update to our Algorithm

Our current algorithm: Simple Vote Multi-Fail (SVMF)

  • The algorithm looks for either 2 matching results (a consensus) or 3 conflicting failures.
  • 2 matching results, either 2 passes or 2 failures, translates to a "consensus" result. When two testers successfully execute a step in a browser and answer 'yes', that step is deemed to have "passed". If 2 testers fail the same step by answering 'no', that step, and the test as a whole in that browser, is judged as having "not passed".
  • For 3 conflicting failures: if 3 testers fail a test at 3 different steps, the test is deemed not to have passed. Under Simple Vote, the system would instead have waited until 2 matching results came in, which typically translated into longer run times.
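The two rules above can be sketched roughly as follows. This is a hypothetical illustration, not Rainforest's actual implementation; the function name, the (outcome, failed_step) result shape, and the 'undecided' outcome (which would trigger recruiting more testers) are all assumptions.

```python
# Rough sketch of Simple Vote Multi-Fail (SVMF). Each tester result is
# assumed to be (outcome, failed_step), with failed_step = None on a pass.

def svmf(results):
    passes = [r for r in results if r[0] == 'pass']
    failed_steps = [step for outcome, step in results if outcome == 'fail']

    # Consensus rule: 2 matching passes decide the test...
    if len(passes) >= 2:
        return 'passed'
    # ...as do 2 testers answering 'no' at the same step.
    for step in set(failed_steps):
        if failed_steps.count(step) >= 2:
            return 'failed'
    # Multi-fail rule: 3 failures at 3 different steps also fail the test.
    if len(set(failed_steps)) >= 3:
        return 'failed'
    return 'undecided'  # no consensus yet; more testers would be recruited
```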

Our previous algorithm: Simple Vote

  • The algorithm looks for 2 matching results from testers, a consensus within a step, and declares the result of the test. For example, if two testers pass a step, then the result of the test is “pass”, provided all steps pass. If two testers fail a step, then that step is considered a “fail” and will lead to the test failing.
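For contrast, Simple Vote can be sketched under an assumed (outcome, failed_step) result shape per tester. It only ever looks for 2 matching results, so 3 failures at 3 different steps stay undecided and force more recruiting; the names here are illustrative assumptions, not Rainforest's code.

```python
# Rough sketch of the previous Simple Vote rule. Each tester result is
# assumed to be (outcome, failed_step), with failed_step = None on a pass.

def simple_vote(results):
    passes = sum(1 for outcome, _ in results if outcome == 'pass')
    if passes >= 2:
        return 'passed'
    failed_steps = [step for outcome, step in results if outcome == 'fail']
    for step in set(failed_steps):
        if failed_steps.count(step) >= 2:
            return 'failed'
    return 'undecided'  # keeps waiting for 2 matching results
```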

What changes can I expect?

  • There are new UI elements in the results section. Specifically, the test results page will show multiple "X"s on a test that failed at multiple steps, indicating where testers failed the test and what led to the overall failure.
  • Since there is less need to recruit extra testers for a given test, tabular variables are less likely to be exhausted during a run, preventing unnecessary failures.

Why are some results rejected?

This is a common question: you may notice that some testers who completed the test had their results rejected. Results may be rejected for the following reasons:

  • The tester’s results do not agree with those of the other testers (applies to both Simple Vote and SVMF).
  • (Only for accounts set to ‘Paranoid’ mode) If a single tester has reported a failure, all passing results will automatically be rejected.
  • The tester completed the test too quickly for us to be confident in the result.
  • The tester did not navigate to the URL defined in the test steps.
  • A Rainforest admin rejected the tester’s results after reviewing their work.
  • The tester failed to perform the expected actions (clicks, scrolls, mouse movements, etc.) over several steps.
  • The tester failed a built-in quality control test.

Why do I see different numbers of testers in my results?

Rainforest tests always pull in a minimum of 3 testers per platform the test runs against to produce a result. In some instances, more than 3 testers are required to qualify a result. Some reasons for pulling in extra testers include:

  • Rainforest detects ambiguity in test results
  • Rainforest detects disagreement between 3 original testers for a platform
  • Rainforest detects that a tester has exited their task prematurely
  • Rainforest detects that a tester has been idle for longer than expected
  • Rainforest detects that a run is taking longer than expected to complete
  • Rainforest detects a significant discrepancy in the metadata (time on step, button clicks relative to other testers, etc.) between testers sent to a run

If you have any questions regarding the results of a particular test, please reach out!
