Introducing Fit-for-Purpose Architectures
The idea behind fit-for-purpose architectures is deceptively simple yet extremely powerful, and it is likely to reshape IT infrastructures over the next decade or more. Going forward, enterprise IT will have many more options than have historically been available to tightly pair a given compute request with an execution capability optimized for that task. The advantages of this tighter pairing have become too compelling to ignore, and the firms that have mastered it are already seeing enhanced performance.
Many of these gains have been driven by big data use cases that spurred development of new technologies to solve unique problems. In some cases, there is an emphasis on sheer performance. In others, the technology is focused on data type manipulation. In every case, however, users need to be able to take the highly specific and optimized execution environment for a given compute task and relate it back to the rest of the existing enterprise architecture.
Fit for Purpose Architectures do not replace today’s largely relational-oriented systems but instead augment them, interoperate with them, and allow them to do what they do best. This pairing of traditional and new approaches has enabled novel solutions, such as real-time ad placement in the digital media space. We are now seeing never-before-possible solutions move into the traditional enterprise.
So why now? Over the past twenty years, enterprise architects have seen significant changes in how applications are delivered, integrated, and scaled. They have not, however, seen major shifts in where most enterprise data is stored, or in the types of analytics performed on it.
The fields of structured data management and unstructured information management have matured, but they have matured separately. Persistence has largely meant relational data stores or a separate Enterprise Content Management (ECM) repository. The current model requires separating structured and unstructured data and treating them as separate endeavors: unstructured data goes into systems designed largely to handle documents, and structured data goes into databases.
Most of the innovations of the past ten years have focused on ubiquity of access and on web-based architectures, rather than on the underlying storage and compute paradigms. Analytics happens only after the data is stored, not while the data is being generated, and typically only after a time-intensive ETL process. Our core paradigms of where information gets stored, how it is queried, and how it is managed throughout its lifecycle have remained relatively unchanged.
What we have seen in the last four years, however, is the emergence of technologies that are revolutionary rather than evolutionary in their approach to solving problems. The new components in a Fit for Purpose Architecture reject some of our long-held notions about atomic handling of transactions, immediate consistency, and the need to strictly structure data, in exchange for excellence at a specific compute problem.
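The immediate-consistency trade-off mentioned above can be made concrete with a toy sketch. The class below is a deliberately simplified, hypothetical model (not the design of any real NoSQL product): writes are acknowledged after updating a single replica, and other replicas catch up later, so a read can observe stale data until the system converges.

```python
# Toy model of eventual consistency: replication is an explicit backlog
# that is drained later, so reads from a lagging replica can be stale.
class EventuallyConsistentStore:
    def __init__(self, replicas=2):
        self.replicas = [{} for _ in range(replicas)]
        self.pending = []  # replication backlog: (replica_index, key, value)

    def write(self, key, value):
        # Acknowledge after writing replica 0 only; others catch up later.
        # This is what buys write throughput at the cost of immediate consistency.
        self.replicas[0][key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def read(self, key, replica=0):
        return self.replicas[replica].get(key)

    def replicate(self):
        # Drain the backlog; after this, all replicas agree.
        for i, key, value in self.pending:
            self.replicas[i][key] = value
        self.pending.clear()

store = EventuallyConsistentStore()
store.write("user:42", "active")
print(store.read("user:42", replica=1))  # None: replica 1 is still stale
store.replicate()
print(store.read("user:42", replica=1))  # 'active': replicas have converged
```

A relational database makes the opposite choice: it does not acknowledge the write until every reader is guaranteed to see it, which is exactly the guarantee these new systems relax.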
By now, the stories of companies like Google, Facebook, LinkedIn, eBay, and Yahoo are well known. These firms dealt with data challenges whose scale and scope were previously unheard of, so they had no choice but to create and adopt revolutionary technologies.
Some of these revolutionary technologies include Hadoop (for ultra-large-scale, data-agnostic storage and jobs), Cassandra (for extreme write performance), Neo4J (for graph analytics), and InfoSphere Streams (for ultra-high performance and efficiency on in-motion data). All of these technologies provide new capabilities by making trade-offs where conventional database or ECM systems do not. In doing so, they provide levels of flexibility, scalability, and sheer performance that have not been available previously.
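The workload-to-technology matching at the heart of this list can be sketched in a few lines. The trait names and store descriptions below are illustrative caricatures of the products named above, invented for this example rather than taken from any real API:

```python
# Toy illustration of fit-for-purpose selection: each store advertises the
# workload traits it excels at, and a workload is routed to whichever store
# covers every trait it requires. Traits and mappings are simplified assumptions.
STORES = {
    "relational_db": {"complex_queries", "strong_consistency", "structured"},
    "hadoop": {"massive_scale", "data_agnostic", "batch_jobs"},
    "cassandra": {"extreme_write_throughput", "massive_scale"},
    "graph_db": {"relationship_traversal"},
    "stream_engine": {"in_motion_data", "low_latency"},
}

def pick_store(required_traits):
    """Return the store(s) whose advertised traits cover every requirement."""
    needed = set(required_traits)
    return sorted(name for name, traits in STORES.items() if needed <= traits)

# A transactional order system still lands on the relational database...
print(pick_store({"structured", "strong_consistency"}))  # ['relational_db']
# ...while a click-stream ingest workload is routed elsewhere.
print(pick_store({"extreme_write_throughput"}))  # ['cassandra']
```

The point of the sketch is the shape of the decision, not the specific mappings: the platform is chosen by the demands of the compute problem rather than by default.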
It is important to note, of course, that these revolutionary technologies do not render existing solutions any less critical. Relational databases will be the most broadly deployed technology because the two use cases where existing databases excel—high-performance complex queries on structured data and high transaction rates with strong transaction consistency guarantees on structured data—are extremely important for a broad set of application and data handling needs.
The long-term ramifications of the NoSQL and NewSQL movements that these organizations initiated and incubated are just now being felt in the enterprise. In the future, conventional enterprises will follow suit by tightly matching the underlying compute problem to the best platform for handling it. Relational databases will continue to be extremely important and will likely remain the most common choice, but they will no longer be presumed as the default. The best way to solve a particular problem, including considerations such as future uncertainty over information sources and scale requirements, will drive selection of the technology. We are already seeing some of these technologies (mainly Apache Hadoop) broadly deployed for sandbox or experimentation purposes. The next shift will see Hadoop and the other technologies move into production-oriented use cases.
We’ll cover this shifting paradigm in much more detail in future articles, with a specific eye on how big data technologies can be deployed in a Fit for Purpose Architecture model in conventional enterprises.
In the meantime, let me know what you think in the comments. Do you see this trend gaining traction in your organization? How is it affecting your technology selection processes?