Symon.AI help center

Smart Matcher

The Smart Matcher tool matches two sets of data that do not share a common ID. Inconsistent data can cause problems when you analyze it or move it between systems. The inconsistencies can have different causes: the data might come from systems that use different naming conventions, or it might contain human error.

By using the Smart Matcher, we can train a model to match an inconsistent data set with a "cleaned-up" data set. The tool takes two inputs. You can think of the top input as the transaction data and the bottom input as contact or ID data.

In this example, we have data about hospitals. If you look at the top input, Smart Matcher Transactions (Data), you can see duplicate entries: the same hospital appears under more than one name. For example, one row lists "MAYO CLINIC HOSPITAL" while another lists "MAYO CLINIC". If we brought this data into Symon.AI as-is, it would treat these as two different contacts.

The bottom input, Smart Matcher Lookup (Data), is our "answer key" data. It contains the unique IDs that rows in the top input are matched to. The Smart Matcher uses a many-to-one method: any number of rows in the top data set can match one unique ID in the bottom data set.
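
The many-to-one idea can be sketched outside of Symon.AI with Python's standard library. This is only an illustration, not how the Smart Matcher works internally: the Smart Matcher trains a model, whereas this sketch uses plain string similarity from `difflib`. The lookup IDs, the second hospital, the function name `match_to_lookup`, and the 0.6 cutoff are all made up for the example.

```python
from difflib import SequenceMatcher

# Hypothetical "answer key": one unique ID per hospital (IDs are invented).
LOOKUP = {
    "H001": "MAYO CLINIC HOSPITAL",
    "H002": "CLEVELAND CLINIC",
}


def match_to_lookup(name, lookup=LOOKUP, cutoff=0.6):
    """Return (lookup_id, score) for the most similar lookup name.

    The ID is None when the best score falls below the cutoff.
    """
    best_id, best_score = None, 0.0
    for lookup_id, lookup_name in lookup.items():
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, name.upper(), lookup_name.upper()).ratio()
        if score > best_score:
            best_id, best_score = lookup_id, score
    if best_score < cutoff:
        return None, best_score
    return best_id, best_score


# Several spellings in the top input resolve to one ID in the bottom input.
transactions = ["MAYO CLINIC HOSPITAL", "MAYO CLINIC", "Mayo Clinic Hosp."]
matched_ids = [match_to_lookup(name)[0] for name in transactions]
```

In this sketch, every spelling in `transactions` resolves to the same lookup ID, while a name that resembles nothing in the lookup comes back with `None` instead of an ID.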

Parameters for Smart Matcher
  1. Target columns: These columns define how the Smart Matcher maps the two sets of data together when you train the tool by using the Update function.

  2. Matching columns: These columns determine whether and how rows are matched between the two sets of data.

Output

The Smart Matcher maps each row from the top data set to a row in the bottom data set. When looking at the results, you can think of them as two halves: the first half is from the top data set, and the second half is from the bottom data set (the answer key). If the tool couldn't make a match for a row, the columns from the bottom data set are empty for that row. As you add more data to train the model, you may see fewer missing values.

The Smart Matcher also adds a probability column. A higher probability indicates a better chance that the rows match correctly.
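
One way to use the probability column downstream is to separate confident matches from rows that need a manual check. The sketch below assumes the output has been exported as rows. The column names (`name`, `lookup_id`, `probability`), the sample values, and the 0.8 threshold are all hypothetical and do not come from Symon.AI. It also assumes that unmatched rows carry no probability, which the tool may handle differently.

```python
# Hypothetical exported rows: a top-input column, a bottom-input column,
# and the probability column. All values are invented for illustration.
rows = [
    {"name": "MAYO CLINIC HOSPITAL", "lookup_id": "H001", "probability": 0.98},
    {"name": "MAYO CLINIC", "lookup_id": "H001", "probability": 0.83},
    {"name": "MAYO CLNC", "lookup_id": "H001", "probability": 0.55},
    {"name": "UNKNOWN FACILITY", "lookup_id": None, "probability": None},
]


def triage(rows, threshold=0.8):
    """Split rows into accepted matches, matches to review, and unmatched rows."""
    accepted, review, unmatched = [], [], []
    for row in rows:
        if row["lookup_id"] is None:
            # No match: the bottom-input columns are empty.
            unmatched.append(row)
        elif row["probability"] >= threshold:
            accepted.append(row)
        else:
            review.append(row)
    return accepted, review, unmatched


accepted, review, unmatched = triage(rows)
```

A suitable threshold depends on your data. Checking a sample of rows near the cutoff is a reasonable way to choose one.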

Running

When you created this Blueprint, you trained the model using the default data already in the pipe. If you want to use your own data, go to the Run tab and then select new sources for the pipe.

You can also use the Smart Matcher TEST sample data to practice.