Deduplication

Posidex offers a unified, powerful and scalable platform for all de-duplication needs. It provides a 360-degree view of customers across products, consolidated across system silos and all lines of business, thereby significantly enhancing the customer experience. The process involves identifying and resolving duplicate records. Entity resolution based on a combination of attributes such as Name, Father's Name, Date of Birth, Country of Birth and Address, performed with high precision, recall and speed, is crucial for an effective solution.

One-time bulk deduplication provides a duplicate-free, single and best view of your customers. This in turn helps you improve your engagement with them.

Challenges

Some of the challenges that are encountered in a typical deduplication process are:

  • It is a gigantic task that can involve trillions of comparisons.
  • Matching is complicated for names and multiple addresses, while it remains simple for parameters such as Date of Birth, Mobile Number and Phone Number.
  • It is highly resource intensive.
  • It leads to network clogging.
  • The existing de-dupe in the core application capturing system is unable to perform deduplication effectively, throwing a lot of false positives with slow response times.
  • A predictable SLA for the Single View of the Customer is difficult to achieve.
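To give a sense of the scale behind the first point: a naive all-pairs comparison of n records needs n(n-1)/2 comparisons. The short sketch below works through that arithmetic. The record counts are illustrative assumptions, not figures from any Posidex deployment.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of distinct record pairs in a naive all-pairs comparison."""
    return n * (n - 1) // 2


# Illustrative record counts (assumed for this example only).
for n in (10_000, 1_000_000, 10_000_000):
    print(f"{n:>12,} records -> {pairwise_comparisons(n):,} comparisons")
```

At ten million records the naive approach already needs roughly 50 trillion comparisons, which is why record-by-record matching does not scale to bulk deduplication.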

One-time deduplication of large data cannot be accomplished through the conventional process of matching records one by one. The Clip engine from Posidex facilitates bulk deduplication/matching. Clusters are then formed based on the deduplication results, and the Entity Master table (referred to as the Golden record) is generated.

Clip takes an innovative approach to this problem. Its salient features are:

  • Based on set theory.
  • Caches the essential inputs of matching by means of persistent Java objects.
  • Clusters records with identical feature measures and builds nested sets.
  • Much faster than conventional record-by-record matching.
  • All the demographic attributes available for the Entity across all the systems are considered to determine the extent of match.
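The idea of nested sets can be illustrated with a small sketch. How Clip actually builds its sets is not described here, so the code below uses simple exact-value partitioning as a stand-in: records are split on one attribute, each resulting set is split again on the next attribute, and only records that land in the same innermost set are compared. The field names (`dob`, `pin`) and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def nested_sets(records, keys):
    """Recursively partition records on each key in turn, so that only
    records sharing every key value end up in the same innermost set."""
    if not keys:
        return [records]
    groups = defaultdict(list)
    for rec in records:
        groups[rec[keys[0]]].append(rec)
    result = []
    for group in groups.values():
        result.extend(nested_sets(group, keys[1:]))
    return result


def candidate_pairs(records, keys):
    """Yield only the record pairs that fall inside the same innermost set."""
    for group in nested_sets(records, keys):
        for a, b in combinations(group, 2):
            yield a["id"], b["id"]


# Hypothetical sample data.
records = [
    {"id": 1, "dob": "1980-01-01", "pin": "500001"},
    {"id": 2, "dob": "1980-01-01", "pin": "500001"},
    {"id": 3, "dob": "1980-01-01", "pin": "500032"},
    {"id": 4, "dob": "1975-06-15", "pin": "500001"},
]
print(list(candidate_pairs(records, ["dob", "pin"])))
```

With four records, an all-pairs comparison would examine six pairs; here only the one pair that shares both attribute values is compared in detail.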

One Time (Day Zero) Data Processing

This process involves the following:

  • Retrieve the data (as on a cutoff date) from the staging area, where the organisation has made the data available in the agreed format.
  • Profile the data to identify suspicious or junk values in different attributes and clean them as directed by the organisation.
  • Clean, standardize, extract and enrich the data wherever possible, with inputs from the organisation at various stages.
  • Run the one-time (Day Zero) deduplication.
  • Group/cluster the matches.
  • Have the organisation manually decide on the less probable matches (LPC).
  • Merge the multiple records of an Entity into a golden record of that Entity, based on the merging rules decided by the organisation.
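The last steps, clustering the matches and merging each cluster into a golden record, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Posidex implementation: clusters are formed transitively with a union-find structure, and the merging rule ("keep the non-empty value from the most recently updated record") is a hypothetical example of a rule an organisation might choose. All field names and sample records are invented for the example.

```python
def cluster_matches(record_ids, matched_pairs):
    """Group records so that records linked by a match, directly or through
    other records, end up in the same cluster (union-find)."""
    parent = {rid: rid for rid in record_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for rid in record_ids:
        clusters.setdefault(find(rid), []).append(rid)
    return sorted(clusters.values())


def build_golden_record(cluster, fields):
    """Merge one cluster into a golden record. Assumed rule (hypothetical):
    per field, keep the non-empty value from the most recently updated record."""
    newest_first = sorted(cluster, key=lambda r: r["updated"], reverse=True)
    golden = {
        field: next((r[field] for r in newest_first if r.get(field)), None)
        for field in fields
    }
    golden["source_ids"] = sorted(r["id"] for r in cluster)
    return golden


# Hypothetical records and match results.
records = {
    "A-17": {"id": "A-17", "updated": "2019-03-02", "name": "R. Kumar", "mobile": ""},
    "B-04": {"id": "B-04", "updated": "2021-11-20", "name": "Ravi Kumar", "mobile": ""},
    "C-88": {"id": "C-88", "updated": "2020-07-09", "name": "Ravi K", "mobile": "9000000000"},
    "D-31": {"id": "D-31", "updated": "2018-01-15", "name": "S. Rao", "mobile": "9111111111"},
}
pairs = [("A-17", "B-04"), ("B-04", "C-88")]

clusters = cluster_matches(list(records), pairs)
golden_records = [
    build_golden_record([records[rid] for rid in ids], ["name", "mobile"])
    for ids in clusters
]
for g in golden_records:
    print(g)
```

Records A-17 and C-88 are never matched directly, yet they share a cluster because both match B-04. Whether such transitive links should be accepted automatically is itself a rule for the organisation to decide.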

Clip is applied based on the incremental data volume and the base data, with SetMatch for large volumes and PrimeMatch for lower volumes.

In contrast to the conventional approach of matching record by record, bulk processing organises the data into nested sets on different data elements, based on the proximity between the records, and processes them in bulk. As a result, the processing speed for bulk deduplication is very high.

  • High precision and recall.
  • I/O operations with the database, the major bottleneck in the process, are almost completely avoided.
  • Uses Posidex's proven and validated proprietary name and address matching logic, which is very efficient.
  • Matching rules and weightages can be configured dynamically to suit the underlying data quality.
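Configurable rules and weightages can be pictured as a weighted sum of per-attribute similarities. The sketch below is illustrative only: the attribute names and weights are hypothetical, and Python's generic `difflib.SequenceMatcher` stands in for the proprietary Posidex name and address matching logic, which is not described here.

```python
from difflib import SequenceMatcher

# Hypothetical rule configuration: attribute -> weight (weights sum to 1.0).
WEIGHTS = {"name": 0.5, "dob": 0.3, "mobile": 0.2}


def similarity(a: str, b: str) -> float:
    """Generic string similarity in [0, 1]; a stand-in, not Posidex logic."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Weighted sum of per-attribute similarities."""
    return sum(
        weight * similarity(rec_a.get(attr, ""), rec_b.get(attr, ""))
        for attr, weight in weights.items()
    )


a = {"name": "Ravi Kumar", "dob": "1980-01-01", "mobile": "9000000000"}
b = {"name": "Ravi Kumar", "dob": "1980-01-01", "mobile": ""}

print(round(match_score(a, b), 3))
# Where mobile numbers are known to be poorly captured, their weight can be
# lowered without touching the matching code.
print(round(match_score(a, b, {"name": 0.6, "dob": 0.4, "mobile": 0.0}), 3))
```

Keeping the weights in configuration rather than in code is what allows the same engine to be tuned to data sets of differing quality.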

A major general insurance company needed help in building its Master Customer Data.

Problem

  • The existing system had scalability issues.
  • It produced a lot of false positives.
  • A lot of time was lost on manual verification.

Solution

  • The solution was developed on PrimeMatch, covering everything from uploading the EOD feed to clustering and building the master customer data, with both auto verification and manual verification of matches.
  • It uses the concept of Least Probable Clusters and Most Probable Clusters.
  • No change in the UI.
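One way to picture the split between auto verification and manual verification is a score-based triage of clusters. The sketch below is an assumption made for illustration: the thresholds are hypothetical, and routing Most Probable Clusters to auto verification and Least Probable Clusters to manual verification is an inferred reading, not a documented PrimeMatch rule.

```python
def triage(cluster_score, auto_threshold=0.90, review_threshold=0.70):
    """Route a cluster by its match score (thresholds are hypothetical)."""
    if cluster_score >= auto_threshold:
        return "MPC: auto verification"
    if cluster_score >= review_threshold:
        return "LPC: manual verification"
    return "no match"


for score in (0.97, 0.78, 0.40):
    print(score, "->", triage(score))
```

Raising `auto_threshold` sends more clusters to manual verification and lowers the risk of false positives; lowering it reduces the manual workload.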

Benefits

  • Very fast implementation of master customer data (months of activity reduced to a few days).
  • Achieved the Single View of the Customer by End-Of-Day (EOD) with efficient cross-matching, where previously the inventory of batches for matching was piling up and a predictable SLA for the Single View of the Customer was a challenge.
  • Now moving towards real-time checks on the customer master through web services etc.
