
Introducing iKnow Accelerator for IBM InfoSphere MDM Reference Data Management

Providing data integration process automation and delivering trusted information in complex environments

With the IBM® InfoSphere® MDM Reference Data Management (RDM) Hub, customers can apply a master data management (MDM) approach to managing their reference data. The RDM hub replaces silos of code tables spread across multiple applications with an efficient and centralized point for authoring and approval. It also provides MDM functionality such as governance, process, security, and audit control.

As in any initial MDM implementation, a large part of the effort goes into integrating the RDM hub with existing applications and processes. Code tables are referenced by applications across the enterprise IT infrastructure. To reap the benefits of mastered reference data, an RDM project ultimately has to provide an integration solution for each of these applications. Depending on implementation style and method of use, this can involve anything from feeding an existing database table with codes from the new authoritative source (analytical method of use) to rewriting the application (transaction style, operational use).

With reference data, the number of managed entities (code tables) easily reaches three digits or more, while the format (code table metadata) varies only slightly. This combination creates opportunities to automate reference data publishing. In addition, many analytical, in-line analytical, and operational reference data consumers, such as data warehouses, decision support applications, and operational data stores, are already serviced regularly by standard ETL tools such as IBM InfoSphere DataStage®, which further increases the opportunities to standardize reference data distribution.

Two data integration use cases for reference data are particularly prevalent:

  • Distributing code tables and code table updates to target systems
  • Using RDM code tables for direct lookups in ETL processes

In both scenarios, there is a compelling case for making consumption of RDM content as easy and as automatic as possible.

 

iKnow

The iKnow application, which is offered by Norwegian IBM Premier Business Partner Intelligent Communication AS, provides functionality for operating the IBM InfoSphere platform. iKnow modules address different aspects of data integration process automation and work together with the InfoSphere tools to enable trusted information in complex data integration environments. Features include process automation (scheduling), monitoring, analysis, DataStage usage accounting, and automatic metadata-driven DataStage job generation, which is designed specifically to work with RDM.

 

iKnow Accelerator for RDM

To speed up RDM uptake across the enterprise, an easy path to code table distribution is needed. iKnow Accelerator for RDM takes advantage of the metadata and data publishing features of RDM to feed target systems with mastered reference codes. Metadata-driven DataStage job generation allows customers to automate the process of pushing RDM content, including full extracts and code table deltas, to any target reachable by DataStage.

 

Reference data publishing

iKnow Accelerator for RDM can be set up to listen to metadata (schema) and data change notifications issued by RDM and feed these to subscribing systems. The schema change process makes new code tables available for manual inspection, mapping adjustment, and automatic DataStage job generation. DataStage jobs generated for known code tables are automatically updated when new metadata versions are detected on RDM. The system can handle multiple concurrent versions for each code table. The only additional required inputs are the DataStage table definitions for targets and a manual mapping operation. Targets that match the RDM code table format are automatically mapped.
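
The automatic mapping of matching targets can be illustrated with a minimal sketch. The article does not describe how iKnow decides that a target "matches" the RDM code table format, so the rule used here (identical column names, compared case-insensitively) is an assumption, and the function name `auto_map` is hypothetical. It is not part of iKnow, RDM, or DataStage.

```python
from typing import Dict, List, Optional


def auto_map(source_columns: List[str],
             target_columns: List[str]) -> Optional[Dict[str, str]]:
    """Return a source-to-target column mapping when the formats match.

    Assumed rule (not taken from the product): source and target match
    when they have the same column names, ignoring case. Returns None
    when they differ, meaning a manual mapping step would be needed.
    """
    sources_by_key = {name.lower(): name for name in source_columns}
    targets_by_key = {name.lower(): name for name in target_columns}

    # Any missing or extra column means the target cannot be mapped automatically.
    if sources_by_key.keys() != targets_by_key.keys():
        return None

    return {sources_by_key[key]: targets_by_key[key] for key in sources_by_key}
```

In this sketch, a `None` result stands for the case that the article says requires a manual mapping operation.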

The data change process sends data change messages (full extract, insert, update, delete) received from RDM to the correct DataStage process, which feeds downstream targets.
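
As a rough illustration of what a downstream target has to do with the four message types named above, the following sketch applies change messages to an in-memory copy of a code table. The `ChangeMessage` structure, its field names, and the `apply_change` function are hypothetical; the actual RDM message format and the generated DataStage jobs are not described in this article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChangeMessage:
    """Hypothetical shape of a data change message (an assumption)."""
    code_table: str
    operation: str              # "full_extract", "insert", "update", or "delete"
    rows: List[Dict[str, str]]  # each row carries "code" and "description"


def apply_change(target: Dict[str, Dict[str, str]], msg: ChangeMessage) -> None:
    """Apply one change message to the target's copy of the code table."""
    table = target.setdefault(msg.code_table, {})

    if msg.operation == "full_extract":
        # A full extract replaces whatever the target held before.
        table.clear()
        for row in msg.rows:
            table[row["code"]] = row["description"]
    elif msg.operation in ("insert", "update"):
        # Treated as an upsert here, which keeps replays harmless.
        for row in msg.rows:
            table[row["code"]] = row["description"]
    elif msg.operation == "delete":
        for row in msg.rows:
            table.pop(row["code"], None)
    else:
        # Unknown operations should surface for operator attention.
        raise ValueError(f"unsupported operation: {msg.operation}")
```

Raising an error for an unknown operation mirrors the idea, described in the next paragraph, that data which cannot be processed should be flagged for intervention rather than silently dropped.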

iKnow Accelerator for RDM uses standard iKnow alerts to signal a need for operator or data steward intervention, such as schema changes that could not be processed automatically or data that did not process correctly.

Metadata-driven reference lookups

To support direct RDM code table lookups in ETL processes, iKnow Accelerator for RDM offers a simplified view into the RDM repository. Users can browse the code table repository, pick the right code table and version, and copy the SQL code needed for direct database access into DataStage or any other ETL tool.
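
A direct lookup of this kind might look like the sketch below. The table `code_value` and its columns are invented for illustration and do not reflect the real RDM repository schema; an in-memory SQLite database stands in for the repository so that the example is self-contained.

```python
import sqlite3
from typing import Optional

# Hypothetical lookup statement; the real SQL would be copied from the
# iKnow view for the chosen code table and version.
LOOKUP_SQL = (
    "SELECT description FROM code_value "
    "WHERE code_table = ? AND version = ? AND code = ?"
)


def build_demo_repository() -> sqlite3.Connection:
    """Create an in-memory stand-in for a code table repository."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE code_value ("
        "code_table TEXT, version INTEGER, code TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO code_value VALUES (?, ?, ?, ?)",
        [
            ("COUNTRY", 1, "NO", "Norway"),
            ("COUNTRY", 1, "SE", "Sweden"),
            ("COUNTRY", 2, "NO", "Kingdom of Norway"),
        ],
    )
    return conn


def lookup(conn: sqlite3.Connection, code_table: str,
           version: int, code: str) -> Optional[str]:
    """Return the description for a code in a given table version, or None."""
    row = conn.execute(LOOKUP_SQL, (code_table, version, code)).fetchone()
    return row[0] if row else None
```

Pinning the version in the query reflects the point that users pick both the code table and its version before copying the SQL.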

Figure 1. iKnow Accelerator for RDM process overview

 

Delivering trusted information cost-effectively

Intelligent Communication AS is one of the IBM Business Partners that have grasped the importance of data governance and data quality initiatives. By introducing standardized, quality-assured reference data into data integration processes, iKnow Accelerator for RDM helps IT deliver trusted, business-managed information to end users across the organization.


Petter Enqvist

Petter is an information management professional with broad experience in data integration projects, mainly in the financial services industry. Petter holds a master's degree in business strategy and international business, and he is currently responsible for master data integration and metadata-driven product features for the iKnow product development team.