Bulk Matching / Deduplication (SetMatch)
SetMatch is a search and matching engine that aggregates large volumes of data into multiple sets of clusters for efficient, fast matching. It is built on PrimeMatch® (URL here) and retains all the benefits of alternative identity searching and matching.

How does SetMatch differ from traditional data matching?
Traditional data matching compares the records in a database sequentially. The first record is searched against all other records in the database, the second record is searched against all other records except the first, and so on. This sequence of linear searches is very expensive: responses slow down, and the approach becomes nearly impossible once the data grows beyond a certain volume. For instance, even when a single query is very fast and returns results in 1 second against 10 million records, deduplicating those 10 million records takes an estimated 4 months (10 million queries at 1 second each is roughly 116 days). As the number of matching rules increases, deduplication becomes almost impossible. Indexing helps searching to a certain extent, but matching partial identities across heterogeneous databases to find duplicate records is practical only with specialized software such as SetMatch.
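The four-month estimate above follows from simple arithmetic, sketched below. The function names are illustrative and not part of SetMatch, and the 30-day month is an assumption made for rounding.

```python
SECONDS_PER_DAY = 86_400


def sequential_dedup_days(num_records: int, seconds_per_query: float) -> float:
    """Days needed if every record triggers one full search of the database."""
    return num_records * seconds_per_query / SECONDS_PER_DAY


def pairwise_comparisons(num_records: int) -> int:
    """Record-to-record comparisons when each record is checked against all later ones."""
    return num_records * (num_records - 1) // 2


if __name__ == "__main__":
    days = sequential_dedup_days(10_000_000, 1.0)
    # Assumes a 30-day month purely for a rough conversion.
    print(f"{days:.1f} days (about {days / 30:.1f} months)")
    print(f"{pairwise_comparisons(10_000_000):,} pairwise comparisons")
```

With 10 million records at 1 second per query this comes to about 115.7 days, which matches the estimate of roughly 4 months.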

First, SetMatch aggregates the data and classifies it into clusters of sets based on identical features of measure, building nested sets that are closely related in matching. Next, SetMatch performs the match among the sets within each cluster. Lastly, the sets of similar matches are compared against similar sets across clusters. By highly parallelizing the matching of sets within and across clusters, SetMatch scales to millions of records and responds with high-accuracy matches in minutes or even seconds, as compared to several hours for competitors' solutions. Here is the concept visualization flash (URL here) that aids in understanding SetMatch. Within the sets, the exhaustive record-to-record comparison for identity matching is still performed by the PrimeMatch® technology.
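The general idea of grouping records first and comparing only within groups can be illustrated with a minimal sketch. This is a generic "blocking" approach and not SetMatch's or PrimeMatch's actual algorithm: the grouping key, the record fields, the similarity measure (Python's difflib) and the 0.85 threshold are all assumptions chosen for illustration, and the cross-cluster comparison step is omitted.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def blocking_key(record: dict) -> str:
    # Hypothetical grouping key: surname initial plus birth year.
    return record["surname"][:1].lower() + str(record["birth_year"])


def similar(a: dict, b: dict, threshold: float = 0.85) -> bool:
    # Stand-in similarity test; the threshold is an arbitrary assumption.
    name_a = f'{a["given"]} {a["surname"]}'.lower()
    name_b = f'{b["given"]} {b["surname"]}'.lower()
    return SequenceMatcher(None, name_a, name_b).ratio() >= threshold


def find_duplicates(records: list[dict]) -> list[tuple[int, int]]:
    # Step 1: group records into clusters that share a key.
    clusters = defaultdict(list)
    for rec in records:
        clusters[blocking_key(rec)].append(rec)

    # Step 2: exhaustive pairwise comparison, but only inside each cluster.
    pairs = []
    for members in clusters.values():
        for a, b in combinations(members, 2):
            if similar(a, b):
                pairs.append((a["id"], b["id"]))
    return pairs


records = [
    {"id": 1, "given": "Jon", "surname": "Smith", "birth_year": 1980},
    {"id": 2, "given": "John", "surname": "Smith", "birth_year": 1980},
    {"id": 3, "given": "Jane", "surname": "Doe", "birth_year": 1975},
]

if __name__ == "__main__":
    print(find_duplicates(records))
```

Because each cluster is processed independently, the per-cluster loops could in principle run in parallel. The trade-off of any such grouping is that records assigned to different clusters are never compared in this sketch, which is why a cross-cluster step like the one described above is needed in practice.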

Benefits of SetMatch
SetMatch searches and matches across very large databases in real time without requiring any additional hardware infrastructure. It is used for bulk deduplication, searching and matching within a large database or across different large databases, and significantly improves processing performance. The major bottleneck in the process, I/O operations against the database, is almost completely avoided. SetMatch retains all the benefits of PrimeMatch® technology.

For a demo, product and solution evaluations, or pricing details, please contact us.