
In Part 1 of this article, we explored two strategies for provisioning test data using the IBM® InfoSphere® Optim™ Test Data Management solution: cloning and subsetting production data. In Part 2, we’ll see how InfoSphere Optim can help privatize sensitive data so test groups can use production data without sacrificing compliance with multiple regulations.
Regardless of the test strategy chosen to create the gold master, it’s almost certain that at least part of the data will have to be privatized. Organizations must be sure that they remain in compliance with a growing number of regulations such as the Sarbanes-Oxley Act (SOX), the Health Insurance Portability and Accountability Act (HIPAA), Payment Card Industry (PCI) data security standards, and others.
InfoSphere Optim users can maintain that compliance in three ways. First, the Optim Data Privacy Providers (ODPP) package allows the privatization of the most sensitive types of data such as names, addresses, credit card numbers, national IDs, dates, and more. In addition, InfoSphere Optim enables administrators to create powerful yet easy-to-write scripts that perform specialized privatization. Finally, administrators can create low-level privacy functionality written in a compiled language such as C, COBOL, or Assembler.
InfoSphere Optim provides graphical user interface elements that allow administrators to easily associate a privacy function to a data element when they use the ODPP package or create scripts.
InfoSphere Optim offers a number of privacy capabilities for different types of information (such as name, date of birth, and national ID) and for different data types.
Names and addresses are not easily privatized using an algorithm, so InfoSphere Optim supports looking up a replacement name or address instead (see Figure 1).
Figure 1. The InfoSphere Optim interface enables administrators to randomly select names or addresses.
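To make the lookup idea concrete, here is a minimal Python sketch. It is not InfoSphere Optim's implementation or API; the `REPLACEMENT_NAMES` list and the `lookup_name` function are hypothetical names invented for illustration. The sketch also assumes that the same original value should always map to the same replacement, so it selects by hashing the original rather than purely at random.

```python
import hashlib

# Hypothetical replacement list; a real lookup table would be far larger.
REPLACEMENT_NAMES = [
    "Alex Morgan",
    "Jamie Lee",
    "Sam Carter",
    "Robin Diaz",
    "Casey Brooks",
]


def lookup_name(original: str, table=REPLACEMENT_NAMES) -> str:
    """Pick a replacement name from the table by hashing the original value.

    Hashing (an assumption of this sketch) keeps the mapping repeatable:
    the same input always yields the same replacement.
    """
    digest = hashlib.sha256(original.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(table)
    return table[index]


if __name__ == "__main__":
    print(lookup_name("John Smith"))
```

A repeatable mapping can help when the same person appears in several related tables. If repeatability is not needed, choosing an entry with `random.choice` would give a purely random selection.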
With InfoSphere Optim, dates can be aged—in other words, administrators can adjust dates forward or backward in time by a certain number of days, weeks, months, or years (see Figure 2).
Figure 2. Administrators can add time to or subtract time from dates.
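The date aging described above can be sketched in a few lines of Python. This is an illustration of the general technique, not how InfoSphere Optim performs it; the `age_date` function and its parameters are hypothetical. One assumption worth noting: when a month or year shift lands on a day that does not exist (for example, February 30), the sketch clamps to the last day of the target month.

```python
import calendar
from datetime import date, timedelta


def age_date(d: date, days: int = 0, weeks: int = 0,
             months: int = 0, years: int = 0) -> date:
    """Shift a date forward (positive values) or backward (negative values)."""
    # Apply month and year shifts first, counting months from year 0.
    total_months = d.year * 12 + (d.month - 1) + months + years * 12
    year, month_index = divmod(total_months, 12)
    month = month_index + 1
    # Clamp the day to the length of the target month (assumption of this sketch).
    day = min(d.day, calendar.monthrange(year, month)[1])
    # Day and week shifts are exact, so timedelta handles them directly.
    return date(year, month, day) + timedelta(days=days, weeks=weeks)


if __name__ == "__main__":
    print(age_date(date(2013, 5, 30), years=-2, days=10))
```

Applying the same offset to every date in a row preserves the intervals between them, which matters when an application checks that, say, a ship date follows an order date.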
Credit card numbers are particularly sensitive, but most applications expect credit card numbers to be valid; that is, they must have the correct starting digits as well as a valid Luhn checksum digit. InfoSphere Optim provides the ability to generate credit card numbers that are of a random type (issuer) or based on the original value (see Figure 3).
Figure 3. InfoSphere Optim enables administrators to generate randomized yet valid credit card numbers.
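For readers unfamiliar with the Luhn check, the following Python sketch shows how a random number with a valid check digit can be produced. It illustrates the standard Luhn algorithm only and makes no claim about InfoSphere Optim's internals; the function names, the default prefix of "4", and the 16-digit length are assumptions chosen for the example.

```python
import random


def luhn_check_digit(partial: str) -> int:
    """Return the digit that makes partial + digit pass the Luhn check."""
    total = 0
    # Walk right to left; starting with the rightmost digit of the partial
    # number, every other digit is doubled (and 9 is subtracted if above 9).
    for i, ch in enumerate(reversed(partial)):
        n = int(ch)
        if i % 2 == 0:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return (10 - total % 10) % 10


def luhn_is_valid(number: str) -> bool:
    """Check that the last digit matches the Luhn check digit of the rest."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


def random_card_number(prefix: str = "4", length: int = 16, rng=None) -> str:
    """Build a random number with the given leading digits and a valid check digit."""
    rng = rng or random.Random()
    filler = "".join(str(rng.randrange(10)) for _ in range(length - len(prefix) - 1))
    body = prefix + filler
    return body + str(luhn_check_digit(body))


if __name__ == "__main__":
    print(random_card_number())
```

To base the result on the original value, one option (again an assumption of this sketch) is to pass the leading digits of the original number as `prefix`, which keeps the issuer-identifying digits while randomizing the rest.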
National IDs, such as US Social Security numbers, are also very sensitive and must always be privatized when used in development and testing environments. InfoSphere Optim can generate random yet valid Social Security numbers that optionally preserve the area encoding (see Figure 4).
Figure 4. Administrators can generate randomized yet valid national ID numbers.
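As a rough illustration of generating a structurally valid Social Security number, consider the Python sketch below. It is not InfoSphere Optim's algorithm; the `random_ssn` function and its parameters are hypothetical. The sketch assumes the commonly documented structural rules: the area (first three digits) is never 000, 666, or 900 through 999, the group (middle two digits) is never 00, and the serial (last four digits) is never 0000.

```python
import random


def random_ssn(original=None, preserve_area: bool = False, rng=None) -> str:
    """Generate a random SSN in AAA-GG-SSSS form, optionally keeping the area."""
    rng = rng or random.Random()
    if preserve_area and original:
        # Keep the first three digits of the original value.
        area = original.replace("-", "")[:3]
    else:
        # Areas 000, 666, and 900-999 are excluded (assumption stated above).
        area_number = rng.choice([a for a in range(1, 900) if a != 666])
        area = f"{area_number:03d}"
    group = rng.randint(1, 99)      # never 00
    serial = rng.randint(1, 9999)   # never 0000
    return f"{area}-{group:02d}-{serial:04d}"


if __name__ == "__main__":
    print(random_ssn("123-45-6789", preserve_area=True))
```

Preserving the area keeps one part of the original value intact, so it is worth weighing whether the test scenario actually needs it.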
Using production data in test environments is vital for improving testing accuracy. Yet organizations must adhere to a large and growing array of regulations designed to protect sensitive information. By incorporating a range of privacy capabilities into its easy-to-use graphical interface, InfoSphere Optim helps simplify the process of privatizing data as administrators build test environments from production data.
What are some of the toughest data privatization challenges that you face? Let us know in the comments.