In my last column, we explored one of the potential starting points for an information governance journey: the creation of a glossary of terms that are critical to the business. This month, let’s consider another first step that some organizations choose: managing the data lifecycle.
Data lifecycle management is the process of managing business information from requirements through retirement. In this cycle, the data is used for application testing, moved to a production system, moved to an archive system, and ultimately removed entirely as it completes its evolution from a corporate asset to a corporate liability.
Why do some organizations make data lifecycle management their first priority when they launch an information governance initiative? Risk reduction can be one reason. Data lifecycle management can help lessen application downtime, improve adherence to data retention requirements, and reduce the risk of exposing confidential information. Another reason is the opportunity for increased agility in the form of improved application performance. Last, but certainly not least, is the opportunity to reduce costs associated with storage and with application defects.
Let’s explore data lifecycle management capabilities during the testing phase as well as during data growth and retirement and see why they may be worth considering as early steps in your information governance process.
An insurance company with 40 different back-end systems was implementing a new application. As is often the case, the project was behind schedule by the time the process had moved from requirements and scoping to development and testing—so the company cut a corner. The IT department did a thorough job of unit-testing the new application, but it didn’t include system tests because doing them might have meant missing the committed go-live date.
But why were those tests expected to take so long that they would have jeopardized the project? The problem was a lack of viable test data. Unfortunately, skipping the tests had dire consequences. The company found itself selling products in states where the sale wasn’t legal, and it had to take quick and costly action to rectify the situation.
In another scenario, a company did test its new application thoroughly, but it encountered a different issue: the unintentional disclosure of confidential data to a third-party QA team. In an effort to move the process along and provide realistic data for thorough testing, the DBA had provided real production data without masking names, account numbers, and other sensitive data.
With these stories in mind, it’s easy to see that managing the testing process and creating realistic, appropriate test data where any sensitive information is masked can reduce risk and accelerate project completion.
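To make the idea of masking concrete, here is a minimal Python sketch of how a production row might be disguised before it is handed to a test team. It is an illustration only, not a description of how any particular product works: the field names (`name`, `account_number`, `state`), the helper functions, and the salt value are all assumptions made up for this example.

```python
import hashlib


def mask_name(name: str, salt: str = "demo-salt") -> str:
    """Replace a real name with a stable pseudonym built from a salted hash."""
    digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
    return f"Customer-{digest[:8]}"


def mask_account_number(account: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of an account number."""
    if len(account) <= visible:
        return "*" * len(account)
    return "*" * (len(account) - visible) + account[-visible:]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with the sensitive fields masked."""
    masked = dict(record)  # copy so the production row is left untouched
    masked["name"] = mask_name(record["name"])
    masked["account_number"] = mask_account_number(record["account_number"])
    return masked


# Hypothetical production row; non-sensitive fields such as the state
# are kept so that business rules can still be tested realistically.
production_row = {"name": "Jane Example", "account_number": "1234567890", "state": "OH"}
test_row = mask_record(production_row)
print(test_row)
```

Because the same input always yields the same pseudonym, relationships between tables are likely to survive masking, which keeps the test data realistic. A salted hash on its own is not strong anonymization, so a real project would want to review the masking rules with its security and compliance teams.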
It wasn’t too long ago that more data meant more trust. Business people wanted more information about their customers, their products, and their markets to support sound decisions.
What we’re finding today, though, is that having more data is useful only up to a point. Instead of helping companies make informed decisions, too much data can mean more confusion, less trust, greater risk, and higher costs. All of this extra data has actually become a drawback.
According to Forrester, organizations spent 15 percent of their total IT budgets on storage in 2012.¹ However, this number doesn’t tell the whole story. The total costs of storage, including labor, space, power, and cooling, can be many times the cost of procuring it. In addition, DBAs often spend a significant portion of their time on hardware capacity–related performance issues, time that could be spent on revenue-generating projects instead.
What can be done with all of this data? Some of it—the data that has passed its retention date—should be deleted entirely. Much of the rest—whatever isn’t needed regularly for production operation—can be intelligently archived. I say “intelligently” because it is important that the data continue to be searchable and retrievable in context when it’s needed for e-discovery or other reasons.
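The triage described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the record fields (`retention_until`, `last_accessed`), the one-year "active" window, and the `classify` function are hypothetical, and real retention rules usually depend on record type, jurisdiction, and legal holds.

```python
from datetime import date


def classify(record: dict, today: date, active_window_days: int = 365) -> str:
    """Decide whether a record should be deleted, archived, or kept in production."""
    # Data that has passed its retention date should be removed entirely.
    if record["retention_until"] < today:
        return "delete"
    # Data still under retention but not used recently can be archived.
    if (today - record["last_accessed"]).days > active_window_days:
        return "archive"
    # Everything else is still needed for regular production operation.
    return "keep"


# Hypothetical records and a fixed "today" so the result is reproducible.
today = date(2013, 1, 1)
records = [
    {"id": 1, "retention_until": date(2012, 6, 30), "last_accessed": date(2010, 1, 1)},
    {"id": 2, "retention_until": date(2020, 12, 31), "last_accessed": date(2011, 3, 15)},
    {"id": 3, "retention_until": date(2020, 12, 31), "last_accessed": date(2012, 11, 20)},
]

decisions = {r["id"]: classify(r, today) for r in records}
print(decisions)
```

With these sample values, record 1 is past its retention date, record 2 has not been touched in well over a year, and record 3 is in active use. The retention check deliberately comes first, so expired data is never archived by mistake.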
To increase efficiency, many organizations are going beyond archiving older data and actually consolidating data from similar applications, retiring those that are redundant or out of date. The benefits? One is productivity, since DBAs can spend less time managing excessive amounts of data in old applications and more time implementing new applications. Another benefit is cost reduction, since much of the data can be moved to lower-cost platforms. A third is reduced business risk, as less data typically means fewer disruptions, fewer missed service-level agreements (SLAs), and better adherence to data retention requirements.
To address these information lifecycle challenges, organizations need a solution that helps them to:
The IBM® InfoSphere® Optim™ portfolio is designed to meet these requirements. InfoSphere Optim solutions help improve application performance, lower IT costs, and support data retention compliance programs. They enable organizations to archive historical transaction records from mission-critical database applications, relocate application data to a secure archive, and retain broad access to archived information.
Which approach is right for you? What is your information governance priority? If application efficiency, storage costs, and risk reduction are at the top of your priority list, you may want to consider data lifecycle management as your first step down the information governance path.
If you have chosen this approach, I would be very interested in hearing about your experience and your progress. Please share them in the comments!
1 Forrester Research, Inc., “Forrsights Hardware Survey, Q3 2012.”