How exactly did you expect shifting testing left to benefit you without shifting the test data left, too?
Much has been written about the “shift left” principle, meaning that testing becomes a consideration much earlier in the software delivery process than it used to. In traditional approaches, testing, and even the planning for testing, begins much later in the development process (to the far right). That thinking has been replaced by the move to shift testing to the left. If you need some grounding on the shift left concept, check out some of these great DZone articles: What Is Shift-Left Testing? by Arthur Hicken and How Shift Left Testing Works by Urvashi Babaria.
The whole “shift-left” principle makes a lot of practical sense. When you are creating just about anything, it doesn’t make sense to wait until you’re done to figure out if there are any problems. For example, if you are planting a vegetable garden, it makes sense to test the soil first before tilling and planting the entire plot. When you are building an application, it makes sense to begin testing before all the code is written.
In the realm of software and applications, any testing (early or late in the process) needs test data. It goes without saying that in order to test earlier, you are going to need that test data earlier. Unfortunately, test data is often an afterthought, and it ends up causing delays that eat up the productivity gains made by agile and DevOps investments. Additionally, bad test data can produce misleading and unreliable test results that undermine your quality efforts, cause expensive rework, and ultimately weaken your delivery credibility. The solution is to shift test data management (TDM) to the left, too!
Making the Shift.
Let’s illustrate what I mean by using a realistic scenario. In the article How Testing Processes Should Change When You Shift Left, Lisa Crispin sets forth a typical shift left scenario:
“Your team has decided to build Feature ABC. The product owner has written an epic for it and the team has sliced that into small, testable stories. You’re having a specification workshop or a discovery session with three amigos (or more) to discuss the stories before the planning session with the whole delivery team. I’ve found that kicking this type of discussion off by asking “How will we test this?” leads to a productive discussion. We can talk about how to test a feature and its individual stories at all levels including unit, service level/API, UI, or other levels as appropriate.”
Shifting TDM to the left means specifying test data needs so that when you ask, “How will we test this?” it becomes, “How will we test this?” AND “What specific data do we need?” This is the perfect time to begin determining the specific test data needed to satisfy those “small, testable” stories. By creating test cases for those stories AND defining the test data requirements, you are already shifting everything to the left.
Having conversations about test data needs earlier in the process is great, but unless you are able to really specify your needs, you are not really moving the bar. It sounds easy, but in reality, as with most things, there can be challenges.
Defaulting to making an entire copy of a database available to the team at the start of development is not a great solution. The first issue is the lead time associated with making and loading the copy. This alone can take longer than expected and delay your team from the start (or, more likely, put the testing on hold and create a backlog). Secondly, this sets up a stale data scenario: data ages (quickly), and the customer record that satisfied your test case today might not satisfy it tomorrow. That leads to the need to continually refresh and reload, which takes you back to my first point.
Developers and testers usually have first-hand experience working with the data and database. However, specifying the test data to satisfy a specific test case requires more than an understanding. It requires a concise set of data, which may be hard to specify without being hands-on with the data or visualizing the data set in some manner.
Acquiring and preserving the actual data is a place where most organizations stumble, regardless of their approach. Once you have defined and acquired these data sets, what do you do with them? How do you accurately document what they are?
Loading and reloading the data into your test environments is the next challenge. Given that the whole reason behind shifting left is saving time, the developer or tester should ideally be able to load the data into the intended environment on demand, without assistance from anyone else. This needs to be accomplished in hours or days, not in weeks or months.
Further insights about test data challenges with DevOps can be found here: 5 Most Common Test Data Pitfalls for DevOps.
Overcoming the Challenges.
As with most things, having the proper tooling is essential to making the job easier and more successful. Testing software is no different, especially when you are looking to shift your test data to the left too. Test data management (TDM) tools are designed to help make the process of provisioning test data easier and safer.
To successfully overcome the challenges posed by shifting your testing and test data to the left, the following capabilities are essential.
First is the ability to use filtering to define the specific data set required to satisfy the story. For example, a new feature for claims processing may require a very specific data set such as: active female members, over 40 years old, from the East Region, with a diagnosis code of AA or AB, with a procedure of XYZ within the last 6 months. Your TDM solution should be capable of capturing that specification and of returning a set of data matching those criteria, which can then be reviewed.
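To make the idea concrete, here is a minimal sketch of how that claims-processing specification could be written down as an executable filter. It is not taken from any particular TDM product: the `ClaimRecord` fields, the "F" and "East" encodings, and the choice to treat "6 months" as 183 days are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ClaimRecord:
    """Hypothetical flattened member/claim row; field names are illustrative."""
    member_id: str
    active: bool
    gender: str
    birth_date: date
    region: str
    diagnosis_code: str
    procedure_code: str
    procedure_date: date


def age_on(birth_date: date, as_of: date) -> int:
    """Whole years of age on the as_of date."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


def matches_story(rec: ClaimRecord, as_of: date) -> bool:
    """Criteria from the claims-processing example in the text."""
    window_start = as_of - timedelta(days=183)  # assumption: "6 months" ~ 183 days
    return (
        rec.active
        and rec.gender == "F"
        and age_on(rec.birth_date, as_of) > 40
        and rec.region == "East"
        and rec.diagnosis_code in {"AA", "AB"}
        and rec.procedure_code == "XYZ"
        and window_start <= rec.procedure_date <= as_of
    )


def select_test_data(records, as_of):
    """Return only the records that satisfy the story's data specification."""
    return [r for r in records if matches_story(r, as_of)]


# Made-up sample rows, evaluated as of a fixed date so the result is repeatable.
AS_OF = date(2024, 6, 30)
SAMPLE = [
    ClaimRecord("M1", True, "F", date(1975, 3, 1), "East", "AA", "XYZ", date(2024, 5, 1)),
    ClaimRecord("M2", True, "F", date(1990, 3, 1), "East", "AA", "XYZ", date(2024, 5, 1)),   # too young
    ClaimRecord("M3", True, "F", date(1970, 3, 1), "East", "AB", "XYZ", date(2023, 1, 15)),  # procedure too old
]
selected_ids = [r.member_id for r in select_test_data(SAMPLE, AS_OF)]
```

Passing the evaluation date in explicitly, rather than reading the clock inside the filter, keeps the specification reviewable: the same criteria can be rerun later against a new date to see which records still qualify.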
Once the tester has confirmed that the data set is complete and meets the needs of the test, the data set and criteria must be saved until the data is needed for the actual execution of the test. The TDM tool should allow the data set to be tagged to the story, test case, or code branch so that when it comes time to run the test, it is clear what data needs to be used and it is already available. In our example above, some of the criteria were time-sensitive (the member's age and a procedure within the last 6 months), so the TDM tool would also need to periodically refresh the data set.
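As a rough illustration of tagging and refreshing, the sketch below keeps a small in-memory catalog of saved data sets. The `SavedDataSet` and `DataSetCatalog` names, the tag format, and the seven-day refresh interval are hypothetical; a real TDM tool would persist this information and define its own refresh policy.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class SavedDataSet:
    """A saved data set plus the criteria that produced it (illustrative only)."""
    name: str
    criteria: dict
    refreshed_on: date
    tags: set = field(default_factory=set)      # story, test case, or branch names
    max_age: timedelta = timedelta(days=7)      # assumption: weekly refresh

    def is_stale(self, today: date) -> bool:
        """True when the data set is older than its allowed age."""
        return today - self.refreshed_on > self.max_age


class DataSetCatalog:
    """In-memory stand-in for wherever a TDM tool stores saved data sets."""

    def __init__(self):
        self._sets = []

    def save(self, data_set: SavedDataSet) -> None:
        self._sets.append(data_set)

    def find_by_tag(self, tag: str) -> list:
        """Look up the data sets tied to a story, test case, or branch."""
        return [ds for ds in self._sets if tag in ds.tags]

    def due_for_refresh(self, today: date) -> list:
        """List the data sets whose time-sensitive criteria need re-evaluating."""
        return [ds for ds in self._sets if ds.is_stale(today)]
```

With something like this in place, the person running the test only needs the story or branch name to find the right data, and a scheduled job can call `due_for_refresh` to decide what to rebuild.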
The ultimate solution for on-demand, hands-free loading and reloading of test data is to automate the process. Find a tool that exposes this function via a REST API. That way, you can provision the test sets to the databases in your test environments in the same manner that you provision other components, in effect creating test-data-as-a-service.
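What such a call looks like depends entirely on the tool you choose, so treat the following as a sketch under assumptions rather than any vendor's API. The `/provision` path, the `dataSet` and `targetEnvironment` payload keys, and bearer-token authentication are all invented for illustration; only the Python standard library calls are real.

```python
import json
import urllib.request


def build_provision_request(base_url, data_set_name, environment, token):
    """Build (but do not send) a POST asking a TDM service to load a data set.

    The endpoint path and payload keys are hypothetical placeholders.
    """
    payload = {"dataSet": data_set_name, "targetEnvironment": environment}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/provision",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

In a pipeline step, the request would be sent with `urllib.request.urlopen(request)` right before the tests run, so the data is provisioned alongside the rest of the environment instead of through a ticket to another team.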
Shifting your testing to the left definitely makes sense, but unless you address the test data, you will not realize the full productivity gains.
About the author.
Carl’s insights are grounded in a 30-year career in the technology field. Starting as a tester for a large system integrator, he has written software, managed software development teams, and led large technology organizations. As a manager and leader, Carl embraces new technology, productivity tools, and processes to help teams exceed customer expectations. Carl leads the professional services organization at Semele Data, ensuring that our software is deployed to enable our clients to provision the best test data possible while eliminating test data bottlenecks.
You can follow Carl on LinkedIn. www.linkedin.com/in/camiller000