In the dynamics of today’s business world the success of any organization depends largely on organizational decisions made at the appropriate times. And a reliable data warehouse environment plays a pivotal role in helping drive the business decisions that can move an organization toward successful outcomes.
IBM® DB2® 10.5 with BLU Acceleration offers an innovative database and highly advanced technology for business intelligence (BI) applications that is designed to provide cost-effective, optimized performance and ease of use in the following ways:
Migrating data from an existing warehouse to DB2 10.5 with BLU Acceleration can be handled with ease and provide several benefits at the same time.
The data migration task is easy to implement in DB2 10.5 with BLU Acceleration. The only requirement is to have an understanding of the application to be able to decide whether to convert the tables from row organize to column organize (see Figure 1).
|Schema||Purpose and Activity||Table Organization Decision|
|STAGING||Store data extracts as is from an enterprise resource planning (ERP) application and other source systems. The month-end activity extensively uses the Range Partitioning DETACH, LOAD, and ATTACHES processes.||No conversion of the table from row organize to column organize to avoid making changes to the month-end processes.|
|DATASTORE||Intermediate transformed data and mostly tables with no Multi Dimensional Clustering (MDC) and Range Partitioning features.||Convert tables from row organize to column organize.|
|DATAMARTS||Presentation layer tables with no additional table features, except materialized query tables (MQTs), facilitate colocation joins.||Drop MQTs and convert base tables from row organize to column organize.|
Figure 1. Decision making schema for table conversion based on the intended application
For this performance case study between DB2 10.5 with BLU Acceleration and the existing software stack, the source system has a processor that is 33.6 times faster than the target DB2 10.5 with BLU Acceleration test server. The source system also has 16 times more memory than the target system.1
The following five steps can be used to perform the data migration task:
1. Set the workload and create a 32 KB page size database.
Note: The ANALYTICS workload setting will automatically set INTRA_PARALLEL (DBM) to YES, DFT_DEGREE (DB) to ANY, and DFT_TABLE_ORG (DB) to COLUMN.
2. Extract the table definition from the source system, and create the similar table in the target with an additional ORGANIZE BY ROW clause in the create table statement—without any constraints and secondary indexes.
3. Load data into the DB2 10.5 with BLU Acceleration database tables using the OS pipe, or using flat files.
4. Convert the data and index table spaces from DMS to AUTOMATIC STORAGE, if any.
5. Convert the tables from row organization to column organization using the db2convert utility:
The IBM DB2 10.5 with BLU Acceleration database tables were compressed by 83 percent of the base uncompressed data—no index—on earlier versions of DB2.2 The storage-savings graph shown in Figure 2 provides an eye-opening comparison—but the best part is yet to come.
Figure 2. Substantially increased compression for DB2 10.5 tables compared with earlier DB2 versions
Overall storage savings, including the indexes on the source system, resulted in about 93 percent, which represents a highly significant 14 times reduction.3 What do these results mean? Primarily, they demonstrate efficient use of storage on the server, significantly smaller data volume and hence performance benefits (see Figure 3) and considerable cost savings for backup media, and an enhanced end-user experience.
Figure 3. Dramatically enhanced performance for integrated IBM Cognos BI reporting
The entire system was successfully migrated, integrated, and tested with the application in less than two days. The process was particularly seamless and required significantly reduced effort. The performance of the new system demonstrates that DB2 10.5 with BLU Acceleration is designed to provide optimal performance that helps deliver big data analytics solutions at the speed of thought.
What do you think about these results? Share any thoughts or questions in the comments area.
1,2,3 Results based on testing performed May 2013 under the guidance of IBM Hursley Lab — and Simon Woodcock — in the UK by Mohankumar Saraswatipura on a Reckitt Benckiser D5100 Data stage server.