Netezza Migration Kung-Fu: Part 1

Why de-engineering should be your guiding principle

Sitting at the desk, staring at the blinking PuTTY cursor, you feel the weight of expectations. Users have seen the Netezza appliance at work, and they are salivating for its power. The CIO bought the technology and now wants ROI. All these desires are embodied in the form of the Dark Lord of Expectations [1] in long, flowing robes, eyes glowing red, probing… the force is strong in this appliance, but you are not a Jedi yet.

Gulp.

Take heart. All you really need is Structured Query Language (SQL) and your data. Armed with knowledge of both, everything else you’re about to do will seem like magic, because Netezza already delivers the vast majority of what you would otherwise have to build out and painstakingly tune.

The best part: we can make mistakes. Significant mistakes. Netezza allows us to build out whole data models, fill them with terabytes of data, test them and toss them out, all in only a day or two. With any other technology, by the time the data model is in place and filled with data, weeks or months will have passed, and there’s no time to start over. [2] But with the power of Netezza, we can really analyze the data to find out what we want to know.

There may already have been one false start, however. During the proof-of-concept stage, data probably came over “as is.” Now, take a step back and accept some realities, because while “as is” can be good enough for testing, it’s also a lousy way to jump-start a system. Here’s why:

  • The data in the old system will need to move to the new system before the migration can go live, so plan for this and start it now. Do not underestimate the amount of time this process will take.
  • The functionality in the old system is probably buried under a mountain of performance props that engineering has dutifully applied ever since the system started showing signs of stress.
  • The complexity in the old system is artificial, so question everything first. Just because complexity exists doesn’t mean it’s necessary.
  • The former data model is suspect because it carries the constraints and weaknesses of the old system.

Take the time to build out the target data models you want to keep, and align them with the physics built into Netezza. Why settle for a 10x improvement when you could have 100x? Focus on de-engineering throughout the migration process, and learn which Netezza features will help you achieve your goals.

I repeat: do not migrate data “as is” from the old machine to the new, no matter what you may have been told. Frankly, most of the functionality in the old system is not as valuable as you might think. Take the data and leave the rest. De-engineer.

References

[1] Compleat Netezza, Copyright © 2012 VMII, ISBN 978-1-4610-9574-3. Excerpted by permission.

[2] Netezza Data Integration Framework, Copyright © 2007-2012 Brightlight Consulting. All rights reserved.

David Birmingham

David Birmingham is a senior solutions architect with Brightlight Analytics, a division of Sirius Computer Solutions, and an IBM Champion. David focuses on solutions using the IBM Netezza® appliance. He has two books on the subject, Netezza Underground and Netezza Transformation, both available on Amazon.com, and he drives the best practices sessions at the Enzee Universe. David has more than 25 years of experience in very-large-scale solution deployment. Connect with David through his IBM developerWorks profile or the Netezza Underground blog, or meet him in person at the Enzee Best Practices sessions at IBM Insight conferences in Las Vegas.