With the IBM® InfoSphere® MDM Reference Data Management (RDM) Hub, customers can apply a master data management (MDM) approach to managing their reference data. The RDM hub replaces silos of code tables spread across multiple applications with an efficient and centralized point for authoring and approval. It also provides MDM functionality such as governance, process, security, and audit control.
As in any initial MDM implementation, a large part of the effort goes into integrating the RDM hub with existing applications and processes. Code tables are referenced by applications across the enterprise IT infrastructure. To reap the benefits of mastered reference data, an RDM project ultimately has to provide an integration solution for each of these applications. Depending on implementation style and method of use, this can involve anything from feeding an existing database table with codes from the new authoritative source (analytical method of use) to rewriting the application (transaction style, operational use).
Reference data offers good opportunities for automated publishing, because the number of managed entities (code tables) easily reaches three digits or more while the format (code table metadata) varies only slightly. In addition, many analytical, in-line analytical, and operational reference data consumers, such as data warehouses, decision support applications, and operational data stores, are regularly serviced by standard ETL tools such as IBM InfoSphere DataStage®, which further increases the opportunities to standardize reference data distribution.
Two data integration use cases for reference data are particularly prevalent:
There is a compelling case to be made for making consumption of RDM content as easy and as automatic as possible in these scenarios.
The iKnow application, which is offered by Norwegian IBM Premier Business Partner Intelligent Communication AS, provides functionality for operating the IBM InfoSphere platform. iKnow modules address different aspects of data integration process automation and work together with the InfoSphere tools to enable trusted information in complex data integration environments. Features include process automation (scheduling), monitoring, analysis, DataStage usage accounting, and automatic metadata-driven DataStage job generation, which is designed specifically to work with RDM.
To speed up RDM uptake across the enterprise, an easy path to code table distribution is needed. iKnow Accelerator for RDM takes advantage of the metadata and data publishing features of RDM to feed target systems with mastered reference codes. Metadata-driven DataStage job generation allows customers to automate the process of pushing RDM content, including full extracts and code table deltas, to any target reachable by DataStage.
iKnow Accelerator for RDM can be set up to listen to metadata (schema) and data change notifications issued by RDM and feed these to subscribing systems. The schema change process makes new code tables available for manual inspection, mapping adjustment, and automatic DataStage job generation. DataStage jobs generated for known code tables are automatically updated when new metadata versions are detected on RDM. The system can handle multiple concurrent versions for each code table. The only additional required inputs are the DataStage table definitions for targets and a manual mapping operation. Targets that match the RDM code table format are automatically mapped.
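The automatic mapping rule described above (targets that match the code table format need no manual mapping) can be illustrated with a minimal sketch. It is not the iKnow implementation: the `Column` type, the `auto_map` function, and the matching rule (same column names and data types, compared case-insensitively) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column descriptor; not an RDM or DataStage type."""
    name: str
    data_type: str


def auto_map(source_columns, target_columns):
    """Return a source-to-target column mapping when the formats match.

    Returns None when any column is missing or has a different type,
    which in this sketch stands for "manual mapping required".
    """
    targets = {c.name.lower(): c for c in target_columns}
    if len(targets) != len(source_columns):
        return None
    mapping = {}
    for col in source_columns:
        match = targets.get(col.name.lower())
        if match is None or match.data_type.lower() != col.data_type.lower():
            return None
        mapping[col.name] = match.name
    return mapping


# Assumed example metadata: one code table and two candidate targets.
rdm_table = [Column("Code", "VARCHAR"), Column("Name", "VARCHAR")]
matching_target = [Column("CODE", "varchar"), Column("NAME", "varchar")]
other_target = [Column("CTRY_CD", "CHAR"), Column("CTRY_NM", "VARCHAR")]

print(auto_map(rdm_table, matching_target))  # mapped automatically
print(auto_map(rdm_table, other_target))     # None: needs manual mapping
```

In this sketch a target that fails the match is simply reported as unmapped; in the product, such cases are the ones left for manual inspection and mapping adjustment.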
The data change process sends data change messages (full extract, insert, update, delete) received from RDM to the correct DataStage process, which feeds downstream targets.
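The routing step can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the product's code: the message layout (`code_table`, `version`, `change_type`, `rows`), the `ChangeRouter` class, and the `make_job` helper are hypothetical, and a plain Python callable that writes to a dictionary stands in for a generated DataStage job and its target.

```python
CHANGE_TYPES = {"full_extract", "insert", "update", "delete"}


class ChangeRouter:
    """Routes change messages to the job registered for a code table version."""

    def __init__(self):
        self._jobs = {}  # (code_table, version) -> callable

    def register(self, code_table, version, job):
        self._jobs[(code_table, version)] = job

    def route(self, message):
        if message["change_type"] not in CHANGE_TYPES:
            raise ValueError(f"unknown change type: {message['change_type']}")
        key = (message["code_table"], message["version"])
        job = self._jobs.get(key)
        if job is None:
            # In the product, a case like this would call for an operator alert.
            raise LookupError(f"no job registered for {key}")
        return job(message)


def make_job(target):
    """Build a stand-in job that applies changes to an in-memory target."""
    def job(message):
        kind = message["change_type"]
        rows = message["rows"]
        if kind == "full_extract":
            target.clear()  # a full extract replaces the target content
        if kind == "delete":
            for row in rows:
                target.pop(row["code"], None)
        else:
            for row in rows:
                target[row["code"]] = row["name"]
        return len(rows)
    return job


country_target = {}
router = ChangeRouter()
router.register("COUNTRY", 1, make_job(country_target))

base = {"code_table": "COUNTRY", "version": 1}
router.route({**base, "change_type": "full_extract",
              "rows": [{"code": "NO", "name": "Norway"},
                       {"code": "SE", "name": "Sweden"}]})
router.route({**base, "change_type": "update",
              "rows": [{"code": "NO", "name": "Kingdom of Norway"}]})
router.route({**base, "change_type": "delete", "rows": [{"code": "SE"}]})
print(country_target)
```

Keying the registry on both code table and version mirrors the statement that multiple concurrent versions of a code table can be handled; each version gets its own job.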
iKnow Accelerator for RDM uses standard iKnow alerts to signal a need for operator or data steward intervention, such as schema changes that could not be processed automatically or data that did not process correctly.
To support direct RDM code table lookups in ETL processes, iKnow Accelerator for RDM offers a simplified view into the RDM repository. Users can browse the code table repository, pick the right code table and version, and copy the SQL code needed for direct database access into DataStage or any other ETL tool.
Figure 1. iKnow Accelerator for RDM process overview
Intelligent Communication AS is one of the IBM Business Partners that have grasped the importance of data governance and data quality initiatives. By introducing standardized, quality-assured reference data into data integration processes, iKnow Accelerator for RDM helps IT deliver trusted, business-managed information to end users across the organization.