By James Kobielus

Graph analysis is developing rapidly into one of the most promising new segments in the big data market. The vogue for graph analysis was boosted by Facebook’s recent beta of the graph search feature for its online community. Graph search builds on the social graph Facebook announced three years ago, which maps the explicit and implicit relationships among members based on their profiles, timelines, and behaviors within that community.
Graph analysis is, at heart, a mathematical approach for mapping complex relationships among networks of nodes. In the business world, graph analysis has various applications, the most noteworthy being mapping social relationships (as exemplified by Facebook’s offering) and mapping semantic relationships (which are at the heart of what the World Wide Web Consortium’s long-running Semantic Web initiative is all about). A social graph maps relationships that are partly or entirely behavioral in nature (e.g., among individuals within social groups), while a semantic graph maps relationships among words, concepts, and other linguistic constructs within human languages.
Graph analysis is hot these days in the big data arena, but it is not a new technology within the disciplines of data science and advanced analytics. Graph modeling is an established branch of statistical modeling that focuses on mining, mapping, visualizing, and exploring connections, interactions, and affinities. What distinguishes graph analysis is a focus on “graphs,” which are abstract networks of relationships (known as links) among nodes (which may be individuals, groups, companies, products, systems, objects, concepts, words, and other entities). Beyond its social and semantic applications, graph analysis has well-established uses in scientific, engineering, and other domains.
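To make the node-and-link vocabulary concrete, here is a minimal sketch of a graph held as an adjacency map in plain Python. The names and links are hypothetical, and the undirected treatment of each link is an assumption made for simplicity; real social graphs are often directed and weighted.

```python
from collections import defaultdict

# Hypothetical links for illustration: each pair is one undirected relationship.
links = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("carol", "dave"),
]


def build_graph(pairs):
    """Build an adjacency map: node -> set of directly linked nodes."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)  # undirected: record the link in both directions
    return dict(graph)


def degree(graph, node):
    """Number of links touching a node (0 if the node is unknown)."""
    return len(graph.get(node, ()))


graph = build_graph(links)
```

With this structure, `graph["carol"]` lists everyone directly linked to that node, and `degree(graph, "carol")` counts those links.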
Of the technology’s many uses, social graph analysis is the most popular, thriving on the gusher of customer intelligence flowing from online communities of all shapes and sizes. In addition to customer profiles and other contextual data, modelers may incorporate a huge range of behavioral information into social graph models. The behavioral data sources might include Facebook status updates, tweets, portal clickstreams, geospatial coordinates, transaction records, interest profiles, call detail records, and usage logs. Social graphs may also incorporate diverse streams of big data (structured and unstructured, user- and machine-generated, and so on) that issue from social media as well as from B2C communities, B2B supply chains, and enterprise applications.
In the enterprise, social graph analysis powers anti-fraud, influence analysis, sentiment monitoring, market segmentation, engagement optimization, experience optimization, and other applications where complex behavioral patterns must be rapidly identified. Graph models are powerful enablers for fine-grained predictive modeling of human behaviors because they help identify the likely behaviors of individuals in their fuller context of groups, relationships, and influence. These models offer microscopically detailed views of the customer experience by focusing on human actions and interactions.
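As one illustration of what influence analysis can look like in practice, the sketch below ranks members of a small, hypothetical social graph by degree centrality: the fraction of other members each one is directly linked to. Degree centrality is just one of the simplest influence measures, chosen here for brevity; it is not a method this column prescribes, and production models typically weigh far richer behavioral signals.

```python
def degree_centrality(graph):
    """Return node -> fraction of the other nodes it is directly linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


# Hypothetical undirected social graph: member -> set of direct connections.
connections = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"carol"},
}

scores = degree_centrality(connections)

# Members ordered from most to least connected.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, the member linked to everyone else lands at the top of `ranked`, which is the kind of signal an engagement or anti-fraud model might treat as one input among many.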
Semantic graph analysis is also a well-established discipline and a substantial focus of many big data initiatives. It is fundamental to search optimization, content analytics, and other cutting-edge applications of advanced analytics against unstructured data. Data scientists explicitly build semantic graph models as ontologies, taxonomies, thesauri, and topic maps using tools that implement standards such as the W3C-developed Resource Description Framework (RDF).
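RDF expresses a semantic graph as a set of subject-predicate-object statements, commonly called triples. The sketch below imitates that idea with plain Python tuples and a wildcard lookup. It is an assumption-laden toy, not a real triple store or RDF library: the `ex:` identifiers are hypothetical, and `match` is an illustrative helper rather than a standard query API.

```python
# Hypothetical triples: (subject, predicate, object).
# "rdfs:subClassOf" is a real RDF Schema property; the "ex:" names are made up.
triples = {
    ("ex:Social_graph", "rdfs:subClassOf", "ex:Graph"),
    ("ex:Semantic_graph", "rdfs:subClassOf", "ex:Graph"),
    ("ex:Semantic_graph", "ex:relates", "ex:Concept"),
    ("ex:Ontology", "ex:describes", "ex:Concept"),
}


def match(store, s=None, p=None, o=None):
    """Return triples matching the given parts; None acts as a wildcard."""
    return sorted(
        t
        for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which things are declared to be kinds of graph?
kinds_of_graph = [s for s, _, _ in match(triples, p="rdfs:subClassOf", o="ex:Graph")]
```

A dedicated triple store answers the same kind of pattern query, typically through SPARQL, with indexing and scale that this in-memory set does not attempt.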
Whether you’re doing social, semantic, or some other form of graph analysis, this approach is outside the core scope of traditional analytic databases and even beyond the ability of many Hadoop and NoSQL databases. Graph databases are an embryonic (but potentially huge) segment of the big data arena. However, that doesn’t mean you have to acquire a new database in order to do graph analysis. You can, to varying degrees, execute graph models on a wide range of existing enterprise databases. Nevertheless, where social graph analysis is concerned, there is a growing market for graph databases, or graph stores, which are specifically optimized for it. And where semantic graph analysis is concerned, you can do it on specialized RDF triple-store databases or on enterprise databases, such as DB2 v10, that provide triple-store extensions.
But if you’re serious about graph analysis, you’re going to need to ramp up all three big data Vs—volume, variety, and velocity—to do it effectively. Depending on the amount of data, the complexity of models, and the range of applications, graph analysis can be a huge consumer of processing, storage, I/O bandwidth, and other big data platform resources. And if you’re driving the results of graph processing into real-time applications, such as anti-fraud, you need an end-to-end, low-latency database architecture.
What do you think? Let me know in the comments.