
There is a lot of buzz about big data—so much, in fact, that some would even call it hype. Many organizations have appointed individuals to lead their data governance programs. In addition, these companies have business data stewards with deep subject-matter expertise who are accountable for data quality and business definitions. Although it is easy for data governance leads and data stewards to dismiss big data as a fad, they do so at their own peril because they must govern all data, regardless of its size.
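These individuals must take three critical actions as their roles evolve over the next 24 months to accommodate big data governance: identify the big data their organizations already have, understand the regulations that apply to it, and extend the core data governance disciplines to it.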
Most organizations already have big data; they just don't know it. I spoke recently to an insurance carrier that was starting up its data governance program. Its team told me they wanted to focus on information with severe data quality issues, such as customer phone numbers and addresses. However, when I probed a bit further, I learned that they were considering a telematics pilot. As part of this pilot, the insurer was going to offer lower rates to policyholders in exchange for permission to install on-board sensors in their vehicles to monitor driving behavior on the road. For example, the insurer could offer a 20 percent discount on an auto insurance premium if the sensor showed that the policyholder drove no faster than 60 miles per hour.
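The discount rule in this example can be sketched in a few lines of Python. This is only an illustration: the function name, the list of speed readings, and the rounding are hypothetical, and the 20 percent and 60 mph figures come from the example above rather than from any real insurer's pricing model.

```python
from typing import Iterable

SPEED_LIMIT_MPH = 60.0  # threshold taken from the example in the text
DISCOUNT_RATE = 0.20    # 20 percent discount from the example in the text


def discounted_premium(base_premium: float, speeds_mph: Iterable[float]) -> float:
    """Return the premium after applying the telematics discount, if earned.

    Hypothetical rule: the discount applies only when every recorded
    speed reading is at or below SPEED_LIMIT_MPH.
    """
    readings = list(speeds_mph)
    # No sensor data means no evidence of driving behavior, so no discount.
    if not readings:
        return base_premium
    if max(readings) <= SPEED_LIMIT_MPH:
        return round(base_premium * (1 - DISCOUNT_RATE), 2)
    return base_premium
```

Even a rule this simple depends on governed data: the result is only as trustworthy as the sensor readings fed into it.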
The insurer anticipated that it would be overwhelmed by the volume of data from the vehicle sensors, so it had to establish a policy on the retention period for telematics data. Other industries deal with other types of big data, such as social media, clickstreams, and unstructured information. Data governance leads and data stewards need to get their arms around all of this data.
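A retention policy of the kind the insurer needed could be enforced along the following lines. This is a minimal sketch under stated assumptions: the `SensorReading` record, its fields, and the `apply_retention` function are hypothetical, and the retention period is left as a parameter because the article does not say what period the insurer chose.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class SensorReading:
    """Hypothetical telematics record; the field names are assumptions."""
    vehicle_id: str
    recorded_at: datetime
    speed_mph: float


def apply_retention(
    readings: List[SensorReading], now: datetime, retention_days: int
) -> List[SensorReading]:
    """Keep only the readings recorded within the retention period."""
    cutoff = now - timedelta(days=retention_days)
    # Readings older than the cutoff fall outside the policy and are dropped.
    return [r for r in readings if r.recorded_at >= cutoff]
```

In practice the policy would be set with the legal and privacy stakeholders and enforced in the storage platform itself, but the logic reduces to a cutoff date like this one.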
A number of regulations are starting to address privacy concerns about the use of big data. For example, utility smart meters collect information about electricity use at intervals of an hour or less, which can be compiled to create detailed profiles of households. As a result, American public utility commissions and the Article 29 Data Protection Working Party in the European Union have issued rules and guidance on the appropriate use of smart meter data by utilities. Data governance teams need to understand the impact of these regulations on a utility's data management environment.
In the life sciences industry, the United States Food and Drug Administration has also issued detailed guidelines for specific processes companies must adopt when responding to public, unsolicited requests on social media for off-label information.
Banks that want to use social media to make credit decisions on individuals also need to consider the implications of regulations such as the United States Fair Credit Reporting Act, which governs the type of information that can be used in those decisions. If banks do draw on social media, it can be hard for them to prove later that they did not use prohibited information.
The list goes on and on. In many cases, the impact of these regulations is not fully understood. Data governance leads and data stewards need to work with key stakeholders from the business, legal, and privacy areas to establish policies regarding the acceptable use of big data.
The core data governance disciplines of data quality, metadata, privacy, and information lifecycle management also apply to big data. However, application of these principles works differently than it does with small data. For example, organizations that use tweets to conduct reputation analysis also need to consider whether the data set is truly representative of their customers:
The social listening department at one high-end retailer had to address senior management's concerns about whether Twitter users belonged to a different demographic from the retailer's traditional customers, who were primarily women over 30 with household incomes above $100,000. The department conducted marketing surveys and found, to its surprise, that the demographics of its Twitter users were actually very similar to those of its traditional customers. Armed with this survey data, the social listening department attracted more attention and a bigger budget from senior management.
Big data is here to stay, and its use is only going to become more pervasive within organizations. At the same time, more and more enterprises are appointing full-time data governance leads and data stewards. These companies need to embrace big data and extend their governance programs to it in order to derive maximum value and avoid being left behind.
Do you agree? Which issues do you see as most critical for governance leads and data stewards as they begin working with big data?