Friday, January 4, 2008

On customer data integration (4)

This is post 4 in a 4-part series on the concept and application of Customer Data Integration (hereafter referred to as CDI). The first post dealt with the definitions of a number of concepts that make up the field of CDI. The second post dealt with applying these concepts and defining an overall CDI approach. The third post dealt with key success factors in implementing CDI. This, the fourth post, highlights some of the application solutions that provide CDI-specific functionality.

Types of CDI applications

Two distinct types of CDI applications exist:

1. Data Quality Tools, aimed at improving data quality by providing cleansing and deduplication functionality

2. Master Data Management Tools, aimed at providing a single repository of customer data, made available to other applications through SOA functionality

This post is primarily aimed at the data quality tools (see table below). I will post on Siebel UCM and other MDM tools next week, outside of this series.

Table 1. DQ / Customer MDM vendors

Vendor            | Solution                                                    | Type
Informatica       | Informatica Data Quality                                    | DQ
Oracle            | Siebel UCM                                                  | MDM
IBM               | Customer Information File                                   | MDM
SAS / Dataflux    | Data Quality Integration Solution                           | DQ
IBM / Websphere   | Websphere Quality Stage                                     | DQ
Trillium Software | TS Quality Series 7, TS Discovery 5, TS Enrichment Series 7 | DQ
Human Inference   | Human Inference DQ Suite                                    | DQ

Informatica

A comprehensive suite of Data Quality solutions. IDQ (based on functionality acquired from Similarity Systems) can be used for both online and offline cleansing and deduplication, and provides profiling and migration tools through PowerCenter functionality.

Key characteristics

  • Flexible, allows for creation and maintenance of specific DQ rules
  • Single repository, easily distributed, simplifies maintenance
  • Ease of integration with both Oracle and SAP products, due to open architecture / adherence to SOA standards

Drawbacks

  • Only a small subset of rules is provided as standard; one must build the DQ rules, leveraging functionality provided by the tool
  • Does not provide standard cleansing functionality (address / zipcode checks, naming conventions, etc.)
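To make the last two drawbacks concrete: building your own DQ rules essentially means codifying standardisation and matching logic yourself. The sketch below is a deliberately simplified illustration in Python (not IDQ's own rule language; the `standardise` rule and record layout are my own invention) of what a custom cleansing-plus-deduplication rule boils down to:

```python
import re

def standardise(name: str) -> str:
    """A toy cleansing rule: lowercase, strip punctuation, collapse whitespace."""
    name = re.sub(r"[^\w\s]", "", name.lower())
    return re.sub(r"\s+", " ", name).strip()

def deduplicate(records):
    """Keep the first record seen for each standardised name key."""
    seen = {}
    for rec in records:
        key = standardise(rec["name"])
        if key not in seen:
            seen[key] = rec
    return list(seen.values())

customers = [
    {"name": "Acme, Inc."},
    {"name": "ACME Inc"},
    {"name": "Globex Corp."},
]
print(deduplicate(customers))  # the two Acme variants collapse into one record
```

In a real IDQ implementation the equivalent logic lives in configurable rules and reference tables rather than code, but the design decisions are the same: which characters to strip, which tokens to normalise, and what counts as a duplicate key.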

IBM / Websphere

IBM's Websphere suite provides standardised data quality solutions, aimed both at packaged applications and at use within custom application development.

Key characteristics

  • Supports multi-language data
  • Easy import and export of metadata
  • Pre-built objects and tables to define and customize data quality processes
  • Easy integration within J2EE custom built applications

Drawbacks

  • Requires Websphere background and programming experience
  • Perhaps a less obvious choice when the MDM solution is an SAP- or Oracle-based packaged solution

SAS / Dataflux

Dataflux Data Quality provides a single repository with which one can improve data quality, profile data to identify areas for improvement, and deduplicate existing data in customer data systems. Dataflux is a wholly owned subsidiary of SAS.

Key characteristics

  • A single repository, with the flexibility to customise data quality rules
  • Provides international support
  • Seamless integration with SAP

Drawbacks

  • Although internationally oriented, limited presence and relevance outside of the US
  • Unclear what integration is provided with Oracle based products

Trillium

Provides applications that are used both to improve data quality and to ensure integration and migration of customer data across the enterprise.

Key Characteristics

  • Best-of-class status for global name and address cleansing
  • Extensive automation of data profiling
  • SAP partner, easy integration

Drawbacks

  • Limited use for non-customer data

Human Inference

Human Inference provides a comprehensive suite of DQ tools that focus on compliance (SOX, Basel II, anti-terrorism) and on the deduplication and standardisation of customer data. The products HI delivers provide a rich set of out-of-the-box functionality that can easily be leveraged.

Key Characteristics

  • Best-of-class status for global name and address cleansing
  • Anti-terrorism-specific functionality for the financial services industry
  • Comprehensive algorithm for semantic comparison of name and address data
  • Provides out-of-the-box functionality, which lowers the time to implement the solution
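Human Inference's actual matching engine is proprietary and linguistics-based, so I can't show it here; but the general idea behind semantic comparison of names can be illustrated with a much simpler, generic technique: normalised edit distance. The Python sketch below is my own illustration of that idea, not HI's algorithm:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def name_similarity(a: str, b: str) -> float:
    """Normalised similarity in [0, 1]; 1.0 means the names are identical."""
    a, b = a.lower(), b.lower()
    if not a and not b:
        return 1.0
    return 1 - levenshtein(a, b) / max(len(a), len(b))

# Common Dutch surname spelling variants score as near-matches:
print(name_similarity("Jansen", "Janssen"))  # ≈ 0.857
```

A real product layers phonetics, cultural name knowledge and address reference data on top of this kind of string comparison, which is precisely the value these specialised vendors add.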

Drawbacks

  • Limited flexibility

Vendor conclusion

Over the years that I've been active in implementing CRM applications, I've been involved in two CDI implementations: one based on Informatica, the other using Human Inference. Whilst Human Inference provided a comprehensive and easy-to-use solution, particularly for the financial services industry, I've found that IDQ is the best solution for companies looking for a flexible tool in which they can implement their own standards for matching, cleansing and deduplication.

2 comments:

Arun Chearie said...

Hi Wouter,

It's a pretty neat post on Data Management, an area that needs to be given its deserved importance in Enterprise Marketing Management/Automation endeavours.

The objective behind this comment is to provide a small correction under Drawback for Dataflux solution.

Contrary to what is mentioned in your post, the relevance of the Dataflux solution for Data Quality and Management is not limited to the US geography alone. To cite an example, for the Indian geography Dataflux is the only solution that has adequate richness and capability to cleanse the local data.

Indian data structures are a bit difficult; unlike in the United States and other developed nations, names/addresses and other key data elements do not follow any fixed standard in India. Yet they are nicely cleansed/managed by the Dataflux solution.

I would think the richness of the "Indian Locale" is one of the key reasons for the solution to enjoy its leadership position and its utilisation by an array of large enterprises in India.

Regards,
Arun Chearie

Wouter Trumpie said...

@ Arun, thanks for your comment, interesting to learn that Dataflux has expanded its offering outside of the US geography. I will certainly follow up and update my post.