Didn’t someone say that the road to migration would be a fairly easy one? But did they mean easy to implement, or easy to test? Actually, we could square away our implementation cycle very quickly, and accelerators can make this process even faster.1 However, you may remember from Part 1 of this article that we are de-engineering the prior implementation into a new one—which does not detract from the simple fact that the majority of the work will involve testing, not development.
This may shock the project manager who, with a carefully laid-out implementation plan, casually adds a day or two of testing just to be safe. But beware the 80/20 rule here: you will spend 20 percent of your time in development and 80 percent in testing. And if you give testing short shrift, quality will suffer.
So let’s say you have two brand-new Netezza appliances—one for production and one for development. It’s important to put these high-powered machines to work as quickly as possible to start getting a return on the investment. But before installing them, consider the task ahead: you’re facing an enormous amount of one-time migration work and testing, and it will create the foundational data for all operational data to follow.
Imagine this scenario: at one site, you were promised a high-powered machine that could perform a migration in two months. Instead, you got a machine with only a quarter of the power because company policy required you to put that hardware inside the production enclosure, where it became unavailable to you.
Since Netezza scales linearly, a quarter of the power means four times the runtime, so your original two-month timeframe became eight months. Even if you scrambled to do eight months of work in five months, this wouldn’t impress clients who were promised a two-month project timeline. The client will realize later what you already knew: there’s absolutely no benefit in attempting to build, test, QA, and performance-tune the entire solution on a lower-powered machine while the higher-powered one sits behind smoked glass doing nothing.
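The linear-scaling arithmetic can be captured in a one-line helper. This is a hypothetical illustration, not part of any Netezza tooling:

```python
# Back-of-envelope check: with linear scaling, runtime grows by the
# ratio of rated power to delivered power.
def projected_duration(baseline_months: float,
                       rated_power: float,
                       delivered_power: float) -> float:
    """Scale a baseline estimate by the power shortfall."""
    return baseline_months * (rated_power / delivered_power)

# A quarter of the promised power turns a 2-month plan into 8 months.
print(projected_duration(2, 1.0, 0.25))  # → 8.0
```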
But wait! The functional port is only one slice. What about the legacy data? You’ll need to backfill it to the new machine, too, so it’s important to get an early handle on this process. That data sits behind a physical and functional fortress, and nobody ever intended for it to be completely migrated to another system, so you can be assured that the surrounding infrastructure (network, disks, and so on) isn’t ready for its mass egress. It’s fine to extract and load small tables whole, but for larger tables, plan to chop them into multiple files that each contain some portion of the table. Do not attempt to extract the data with “join” logic; the former system’s tables usually are not indexed to support it. Just dump the tables and get them to the Netezza appliance. Likewise, don’t offload the data as a single file or “pipe” it from the old system to the new one as a monolithic extract. Always break apart time-protracted processes, because it’s too expensive to restart them from scratch upon failure.
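One way to make a protracted extract restartable is to write each chunk to its own file and skip chunks that already exist, so a failure resumes at the first incomplete chunk instead of starting over. This is a minimal Python sketch under stated assumptions: `fetch_rows` is a hypothetical stand-in for whatever unload utility the legacy system actually provides, and rows are assumed to arrive as plain strings.

```python
# Sketch of a restartable, chunked table extract.
# fetch_rows(table, offset, limit) is a hypothetical caller-supplied
# helper that returns that slice of the table as a list of strings.
import os

def extract_in_chunks(table, total_rows, chunk_rows, out_dir, fetch_rows):
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for chunk_no, offset in enumerate(range(0, total_rows, chunk_rows)):
        path = os.path.join(out_dir, f"{table}.part{chunk_no:04d}.dat")
        if os.path.exists(path):      # already extracted: skip on restart
            written.append(path)
            continue
        rows = fetch_rows(table, offset, chunk_rows)
        tmp = path + ".tmp"           # write-then-rename so a partial file
        with open(tmp, "w") as f:     # never masquerades as a finished chunk
            f.writelines(r + "\n" for r in rows)
        os.replace(tmp, path)
        written.append(path)
    return written
```

The write-then-rename step is the design choice that matters: a crash mid-chunk leaves only a `.tmp` file behind, so rerunning the job redoes just that chunk rather than the whole extract.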
Migration projects are, by nature, a continuum of discovery. When an organization’s migration team looks into the existing system, even the subject-matter experts will almost always find things they never knew about. This process comes with three high-level knowledge risks:
You can mitigate the first two risks with research. The last one, however, might trap you. How much did you think you knew, but forgot? Those little time bombs of compromise that your IT staff created long ago with good intentions—well, they’re back.
Here are a couple of things that probably caught you by surprise:
Address these problems with interviews. Recognize them formally in a specification document. Deliver what you know users expect. Remember that specifications adapt under control, but not under a cloud of the unspoken.
You want the users to experience the appliance’s mind-blowing speed. But now you can also stun them with rapid delivery, which buys you breathing room to deliver even more. How’s that for a twist in delivery protocol?
1 Netezza Data Integration Framework, Copyright © 2007-2012 Brightlight Consulting, All rights reserved.