Sitting at the desk, staring at the blinking PuTTY cursor, you feel the weight of expectations. Users have seen the Netezza appliance at work, and they are salivating for its power. The CIO bought the technology and now wants ROI. All these desires are embodied in the form of the Dark Lord of Expectations1 in long, flowing robes, eyes glowing red, probing… the force is strong in this appliance, but you are not a Jedi yet.
Take heart. All you really need is Structured Query Language and your data. Armed with those two things, everything else you’re about to do will seem like magic, because Netezza is already prepared to deliver the vast majority of what you would otherwise have to build out and painstakingly tune.
The best part: we can make mistakes. Significant mistakes. Netezza allows us to build out whole data models, fill them with terabytes of data, test them and toss them out—and in only a day or two. With any other technology, by the time the data model is in place and filled with data, weeks or months will have passed, and there’s no time to start over.2 But with the power of Netezza, we can really analyze the data to find out what we want to know.
There may already have been one false start, however. During the proof-of-concept stage, data probably came over “as is.” Now take a step back and accept some realities: “as is” can be good enough for testing, but it is a lousy way to jump-start a system. Here’s why:
Take the time to build out the target data models you want to keep, and align them with the physics built into Netezza. After all, why settle for a 10x improvement when you could have 100x? Focus on de-engineering throughout the migration process, and learn which Netezza features will help you achieve your goals.
I repeat: do not migrate data “as is” from the old machine to the new, no matter what you may have been told. Frankly, most of the functionality in the old system is not as valuable as you might think. Take the data and leave the rest. De-engineer.
1 Compleat Netezza, Copyright © 2012 VMII. ISBN: 978-1-4610-9574-3. Excerpted by permission.
2 Netezza Data Integration Framework, Copyright © 2007-2012 Brightlight Consulting, All rights reserved.