Why should Continuous Integration tests have randomness?
Randomness increases test coverage once a test has been executed more than once. Test coverage here means coverage of the input space (the allowed input values).
If the input data is randomized, tests get a chance to catch bugs that only show up for specific, awkward inputs and that are unlikely to be covered without randomized data.
There is no point in executing the same unit test with the same inputs against the same unchanged code, yet this is what usually happens when you run all your tests.
And then the math:
- let's assume that there is a bug that shows up in 10% of the input space (for example, when the tested function's integer argument is divisible by 10.. TODO: think of a better example)
- this means that the probability that a single test execution does not detect this bug is 90%
- if the test uses randomized inputs and can choose any input (uniformly, and independently for each execution), then the probability that the test misses the bug in N executions is 0.9^N (in 100 executions the test will find the bug with a probability greater than 99.99%)
- if the test does not have randomized input, it does not matter how many times it is executed: it will either find the bug the first time or not find it at all (in 100 or a million executions the test will find the bug with a 10% probability)
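The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names miss_probability and detection_probability are made up for illustration, and the formula assumes that every execution draws its input uniformly and independently of the earlier ones.

```python
def miss_probability(bug_fraction: float, executions: int) -> float:
    """Probability that every one of `executions` independent random inputs
    lands outside the buggy part of the input space."""
    return (1.0 - bug_fraction) ** executions


def detection_probability(bug_fraction: float, executions: int) -> float:
    """Probability that at least one of the executions hits the bug."""
    return 1.0 - miss_probability(bug_fraction, executions)


if __name__ == "__main__":
    # Bug present in 10% of the input space, as in the example above.
    for n in (1, 10, 50, 100):
        print(n, detection_probability(0.1, n))
```

For a test with fixed input the number of executions does not enter the formula at all: the chance of detection stays at whatever it was for the single input that was picked.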
Problems with randomness
- Repeatability
- Readability
- More work than with simple nonrandom data
How to handle these problems
- Repeatability - log the seed of the random number generator, so that a failing run can be repeated with the same seed
- Readability
- More work than with simple nonrandom data - over multiple executions one randomized test can cover many nonrandom test cases, and data generators can be reused in many tests if they are organized well
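One possible way to get repeatability is sketched below in Python. Everything named here is an assumption made for illustration: the TEST_SEED environment variable, the make_rng helper, and the deliberately buggy function buggy_halve (wrong whenever its argument is divisible by 10, matching the example in the math section). The idea is only that the seed is logged on every run and can be fed back in to replay a failure.

```python
import logging
import os
import random
import time

logger = logging.getLogger("randomized_tests")


def make_rng(seed=None):
    """Create a seeded generator and log the seed (hypothetical helper).

    Seed priority: explicit argument, then the TEST_SEED environment
    variable (an assumed convention), then a time-based value.
    """
    if seed is None:
        env_seed = os.environ.get("TEST_SEED")
        seed = int(env_seed) if env_seed is not None else time.time_ns() % (2 ** 32)
    # Logged at warning level so it shows up even without logging configuration.
    logger.warning("random test seed = %d", seed)
    return random.Random(seed), seed


def buggy_halve(n):
    """Hypothetical function under test: wrong when n is divisible by 10."""
    if n % 10 == 0:
        return n // 2 + 1  # the injected bug
    return n // 2


def failing_inputs(rng, draws=1):
    """Draw random inputs and return those for which buggy_halve is wrong."""
    failures = []
    for _ in range(draws):
        n = rng.randrange(0, 1000)
        if buggy_halve(n) != n // 2:
            failures.append(n)
    return failures


if __name__ == "__main__":
    rng, seed = make_rng()
    failures = failing_inputs(rng, draws=20)
    print("seed", seed, "failing inputs", failures)
```

A failure seen in CI could then be reproduced locally by running the same test with TEST_SEED set to the logged value. A helper like make_rng can be shared by many tests, which is one way of keeping data generators organized.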