The Importance of Real Test Data

Today I had to debug a pretty serious issue. The bug was likely being produced in one of two places, so I eyeballed the code in both. For the first piece of code, I added an explicit test case to check that it handles my scenario correctly and does not produce the bug, and the test passed. So I moved on to the second piece of code, which also looked fine. Eventually I added logging after both of these points in the code.

After running the code against real data from the environment, I saw that the first place in my code was actually causing the bug. This was a huge surprise, as I had explicitly tested for the bug and it was not meant to be happening :/. The problem, it seemed, was that the tests used dummy data, and the code did not treat that dummy data the way it handled real data. Luckily, as described in a previous post, I had written code that takes web service results and converts them into data that can be used in tests. I wired this real data into my tests and voila, my test broke. After tweaking the code the tests were finally green, and I knew that the bug was sorted.
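
To make the idea concrete, here is a minimal sketch in Python. It is not the code from the earlier post: the helper names (`save_fixture`, `load_fixture`, `total_price`), the price data, and the thousands-separator bug are all made up for illustration. It only shows the general shape: capture a real web service response once, store it as a fixture, and run the same check against both the dummy data and the captured data.

```python
import json
import tempfile
from pathlib import Path


def save_fixture(response_body: str, path: Path) -> None:
    """Store a raw web service response (a JSON string) as a test fixture."""
    payload = json.loads(response_body)  # fail early if the capture is not valid JSON
    path.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")


def load_fixture(path: Path):
    """Load a previously captured response for use in a test."""
    return json.loads(path.read_text(encoding="utf-8"))


def total_price(items) -> float:
    """Hypothetical code under test: sums the 'price' field of each item."""
    return sum(float(item["price"]) for item in items)


# Hand-written dummy data: tidy values that the code happens to handle.
DUMMY_ITEMS = [{"price": "10.50"}, {"price": "2.00"}]

# Stand-in for a response captured from the real service (an assumption for
# this sketch): the real service formats large numbers with a separator.
CAPTURED_RESPONSE = '[{"price": "1,234.50"}, {"price": "2.00"}]'


def check_passes(items) -> bool:
    """Return True if total_price handles the items without blowing up."""
    try:
        total_price(items)
        return True
    except ValueError:
        return False


def run_demo():
    """Run the same check against dummy data and against the captured fixture."""
    with tempfile.TemporaryDirectory() as tmp:
        fixture_path = Path(tmp) / "items.json"
        save_fixture(CAPTURED_RESPONSE, fixture_path)
        real_items = load_fixture(fixture_path)
    return check_passes(DUMMY_ITEMS), check_passes(real_items)


if __name__ == "__main__":
    dummy_ok, real_ok = run_demo()
    print("dummy data passes:", dummy_ok)
    print("real data passes:", real_ok)
```

In this sketch the check is green with the dummy data and fails with the captured data, which is the same kind of surprise I ran into: the test was only as good as the data I had invented for it.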