Handling flaky GUI tests correctly in a Jenkins-JUnit-mvn-Selenium setup.
A wise man once said, “Tests should be trustworthy, quick, and should run at the right time. If not, there is little value in them.”
On the note of trustworthiness: GUI tests are flaky by nature. That’s a reality.
I have worked on backend projects and had zero tolerance for flaky tests. That was until I had to work on a frontend GUI project.
Now, there is flakiness that stems purely from bad test automation design (say, when we are not synchronizing our actions enough), which we must strive to get rid of. But there is another part that is simply out of our control (such as network delays, browser-related issues, etc.).
A full list of reasons why GUI tests are flaky can be found here in this great article.
The question arises: how can we make our tests more trustworthy, and why does it matter?
The best use of test automation is when the automated regression tests run in a CI pipeline. When a new PR (pull request) is created, the goal is to run the automated regression tests as part of the pipeline, so that if there are any breaking changes, we find them ‘before’ the code is merged into develop (the third point of the quote at the beginning of this article: “tests should run at the right time”).
This preventive approach adds a lot of value by failing fast and maintaining code quality. It also reduces time to market, since after the PR is merged only the ‘new changes’ need testing.
Now, if your tests are flaky and raise false alarms in the pipeline, these very same ‘useful tests’ lose all their value. If analysis shows the failures were not real failures, the tests lose their credibility.
If you cannot trust your tests, what good are they to your team and project?
Now the good news is that if you rerun failed flaky tests, they tend to pass within the next one or two cycles. And it is less costly (in terms of time) to rerun the flaky tests than to run the whole suite again in the pipeline.
For example, if you are using mvn, the Surefire plugin with “rerunFailingTestsCount” set to a value in the POM file can do this for you.
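A minimal sketch of that configuration might look like this (the plugin version and rerun count here are illustrative; use the values that fit your project):

```xml
<!-- Illustrative snippet: rerun each failing test up to 2 extra times.
     The version shown is an assumption; pin the one your project uses. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>3.2.5</version>
  <configuration>
    <rerunFailingTestsCount>2</rerunFailingTestsCount>
  </configuration>
</plugin>
```

Note that rerun support depends on your test provider (it was originally a JUnit 4 feature), so it is worth checking the Surefire documentation for your setup.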
In most cases this will get your flaky tests to pass, so that only the real errors (if any) are left at the end of the run. Sounds like good news!
This makes your tests more trustworthy!
The problem is that even though the tests are now more trustworthy, the JUnit plugin in the Jenkins pipeline will still show them as failed (Summary JUnit), defeating the whole purpose of rerunning flaky tests in the first place. Not only that, it will throw numbers at you that you cannot relate to your actual test counts.
A request to change this has been raised with Jenkins here.
For example, in the above pic there were 31 tests in total, each of them passing on rerun, giving the correct count in the mvn report. The JUnit report, however, tells you there were 34 tests, of which 3 failed. These numbers are technically true, since the flaky tests may have run a few times before passing, but they don’t serve the purpose we are trying to achieve with these tests in the first place.
A solution that I came up with for both these problems (until the JUnit plugin fixes it on their end) was to parse the mvn results in Jenkins rather than the JUnit report results. It can be achieved as follows.
First, use a shell script to parse the mvn results.
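A sketch of such a script is shown below, assuming the Maven build log was saved to a file (e.g. via `mvn test | tee mvn.log` in an earlier pipeline step); the log contents here are a stand-in for a real run:

```shell
# Hypothetical sample of a Maven build log, for illustration only.
cat > mvn.log <<'EOF'
[INFO] Tests run: 31, Failures: 0, Errors: 0, Skipped: 0
[INFO] BUILD SUCCESS
EOF

# Surefire prints one aggregated "Tests run:" line at the end of the run;
# take the last match so per-class lines earlier in the log are ignored.
summary=$(grep 'Tests run:' mvn.log | tail -n 1)

# Pull the individual counts out of the summary line.
total=$(echo "$summary" | sed -n 's/.*Tests run: \([0-9]*\).*/\1/p')
failures=$(echo "$summary" | sed -n 's/.*Failures: \([0-9]*\).*/\1/p')
errors=$(echo "$summary" | sed -n 's/.*Errors: \([0-9]*\).*/\1/p')

echo "total=$total failures=$failures errors=$errors"

# Exit non-zero only when Maven itself reports real failures or errors,
# so passed-on-rerun flaky tests no longer fail the stage.
[ "$failures" -eq 0 ] && [ "$errors" -eq 0 ]
```

The key idea is that Maven’s final summary already counts a flaky test that eventually passed as a pass, which is exactly the number we want the pipeline to act on.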
Then use this script in the Jenkins pipeline stage (‘Results: mvn results’).
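A hypothetical stage in a declarative Jenkinsfile could look like this; the script path, log file name, and Slack channel are all assumptions for illustration:

```groovy
// Illustrative fragment of a declarative pipeline stage.
stage('Results: mvn results') {
    steps {
        // parse_mvn_results.sh (hypothetical name for the script above) greps
        // the Maven log for the final "Tests run:" summary and exits non-zero
        // only when Maven reports real failures or errors.
        sh './parse_mvn_results.sh'
    }
    post {
        success {
            slackSend channel: '#ci', message: "Tests passed (mvn summary): ${env.BUILD_URL}"
        }
        failure {
            slackSend channel: '#ci', message: "Real test failures (mvn summary): ${env.BUILD_URL}"
        }
    }
}
```

Using the stage’s `post` conditions means the Slack notification reflects the exit code of the mvn parsing script, not the JUnit plugin’s inflated counts.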
The result is a pipeline that gives you correct notifications on Slack (Summary mvn).
Thus making your tests more valuable and trustworthy, saving you the time spent analyzing false positives, and keeping your focus on doing what matters! That is, learning new stuff and building cool things 😊!
If you enjoyed this article, then you can let me know by throwing some 💗on it 😊!