By Leon Katsnelson
Big data pushes the scalability frontier. But where does the frontier begin and end?
When people think about big data, they generally envision macroscopic scaling: in other words, bigness that spans the globe or, at the very least, some large data center. People generally associate big data platforms with server farms that sprawl across huge tracts of real estate, and which are populated by ever-larger racks of processing, storage, memory, and interconnect nodes arranged in endless rows.
However, big data platforms must also push the nanoscopic frontier of scalability. It’s useful to think of a consolidated big data platform, architecturally, as a fractal structure. What that means is that big data must be self-similar on all platform scales—from macro to nano—and leverage the full parallel processing resources that are available at each level.
Big data’s self-similarity comes from one of its central architectural features: pervasive parallelization. This architectural principle should operate concurrently on three levels of scalability in the architecture of your big data platform:
Scale-in will steadily grow in importance in your scalability strategy. Miniaturization remains the juggernaut uber-trend, and subatomic density is its frontier. We all know that today’s handheld consumer gadgets have far more computing, memory, storage, and networking capacity than the state-of-the-art mainframes that IBM and others were selling back in the Beatles era. And we’re all starting to get our heads around quantum computing, atomic storage, synaptic computing, and other “scale-in” approaches that will keep pushing Moore’s Law forward for the foreseeable future.
For enterprises that are serious about pervasively scaling their consolidated big data architectures, a balanced strategy should incorporate all three approaches—scale-up, scale-out, and scale-in—depending on the workloads you’re trying to optimize.
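To make the "self-similar parallelization" idea concrete, here is a minimal toy sketch (my own illustration; the post itself names no code or APIs). It shows the same divide-and-conquer shape operating at two scales: work is partitioned and fanned out across worker processes (standing in for scale-out across cluster nodes), while each worker crunches its whole local chunk (standing in for scale-up within a node).

```python
# Illustrative sketch only -- not from the post. Worker processes stand in
# for cluster nodes; the partition/fan-out/combine shape is the same at
# every scale, which is the "fractal" self-similarity described above.
from concurrent.futures import ProcessPoolExecutor

def local_sum(chunk):
    # "Scale-up" stand-in: each worker processes its entire local chunk.
    return sum(chunk)

def parallel_sum(data, workers=4):
    # "Scale-out" stand-in: partition the data and fan it out across
    # workers, then combine the partial results.
    step = max(1, len(data) // workers)
    chunks = [data[i:i + step] for i in range(0, len(data), step)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(local_sum, chunks))

if __name__ == "__main__":
    total = parallel_sum(list(range(1000)))
```

The point of the sketch is the shape, not the arithmetic: whether the units of parallelism are racks, nodes, cores, or (eventually) nanoscale devices, the partition-compute-combine pattern recurs at each level.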
You will need to scale big data elastically and concurrently toward both the infinite and the infinitesimal.
What do you think? Let me know in the comments.
DB2 TechTalk: Deep Dive on BLU Acceleration in DB2 10.5, Super Analytics Super Easy
Thursday, May 30: 12:30 – 2:00 PM ET
Big Data Seminar 2013, Featuring Krish Krishnan
June 14 in New York City
marcus evans Pharma Data Analytics Conference
July 10-11 in Philadelphia
IBM Smarter Content Summit 2013
Big Data at the Speed of Business
Broadcast event replay now available
Information on Demand 2013: Early Bird Registration Now Open
November 3-7 in Las Vegas