Symon.AI help center

Transaction Outliers

Abstract

Identify transactions where one attribute of that transaction is inconsistent with other attributes.

Finding outliers in your sales transactions can help your organization spot potential problems, ranging from data entry errors to suspicious deals that need review. This blueprint flags a transaction when one of its attributes is inconsistent with the others.

Legend

The tools in this blueprint are color-coded based on their function:

  • Yellow: These tools contain the input data or results that are ready for export. These are what you’ll be mapping to in ICM for the Symon.AI Calc object. The Transaction Data (Data) tool contains the transaction data you want to find outliers in.

  • Pink: The core of the outlier analysis that this blueprint provides.

  • Red: The Outlier Type (Case) tool contains the business logic to combine the results of the outlier analysis.

  • Blue: These tools manipulate and transform data for visualizations.

  • Turquoise: These tools are used to manipulate data for imports and exports.

  • Lime Green: These are the output tools which contain results that are ready for export and used in visualizations.

Pipe inputs

This blueprint uses the Adapt (Adapt) tool to map the columns in your data. The required columns are:

  • QUANTITY: The number of units sold in the transaction.

  • TOTAL_SALES: The total transaction amount.

  • PRICE_EACH: Unit price for the items sold.

You can look at the sample data to see how the tool renames columns. Whatever your columns are called, the Adapt tool can rename them to names the pipe recognizes. In the sample, the remapping looks like this:

  • QUANTITYORDERED to QUANTITY

  • SALES to TOTAL_SALES

  • PRICEEACH to PRICE_EACH
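The renaming step can be pictured with a short Python sketch. This is not how the Adapt tool is implemented; the function name `adapt_columns`, the sample row values, and the required-column check are illustrative assumptions, and the price column is assumed to already carry the required name.

```python
# Hypothetical sketch of a column-renaming step; not the Adapt tool's code.
REQUIRED_COLUMNS = {"QUANTITY", "TOTAL_SALES", "PRICE_EACH"}


def adapt_columns(rows, mapping):
    """Return copies of the rows with columns renamed according to mapping.

    Columns that are not in the mapping keep their original names.
    """
    adapted = [
        {mapping.get(name, name): value for name, value in row.items()}
        for row in rows
    ]
    # Fail early if a required column is still missing after renaming.
    for row in adapted:
        missing = REQUIRED_COLUMNS - set(row)
        if missing:
            raise ValueError(f"missing required columns: {sorted(missing)}")
    return adapted


# Made-up row for illustration only.
sample = [{"QUANTITYORDERED": 30, "SALES": 2871.0, "PRICE_EACH": 95.7}]
mapping = {"QUANTITYORDERED": "QUANTITY", "SALES": "TOTAL_SALES"}
adapted = adapt_columns(sample, mapping)
print(adapted[0])
```

Any other column that needs a new name can be added to the `mapping` dictionary in the same way.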

Pipe outputs

This blueprint generates three outputs:

  • Export Outliers (Export): This output contains only the transactions identified as outliers. The Outlier_Type column is appended at the end to identify which attribute caused the transaction to be flagged.

  • Export for Outlier Bar Graph (Export): This output contains the count totals for the different types of outliers identified. This is primarily used for visualizations.

  • Export for Over Time Visuals (Export): This output contains the total amount of outlier transactions over time.
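To make the shape of these three outputs concrete, here is a rough Python sketch that derives similar tables from a list of scored transactions. The function name `build_outputs`, the `ORDER_DATE` column, the monthly grouping, and the choice to sum TOTAL_SALES for the over-time table are all assumptions made for illustration; the blueprint's own tools may aggregate differently.

```python
from collections import Counter, defaultdict


def build_outputs(transactions):
    """Split scored transactions into three export-style tables.

    Each transaction is a dict with TOTAL_SALES, Outlier_Type (None when the
    transaction is not an outlier), and ORDER_DATE as an ISO "YYYY-MM-DD"
    string. ORDER_DATE is a hypothetical column used only in this sketch.
    """
    # 1. Only the transactions identified as outliers.
    outliers = [t for t in transactions if t["Outlier_Type"] is not None]

    # 2. Count of outliers per outlier type (for a bar graph).
    counts_by_type = Counter(t["Outlier_Type"] for t in outliers)

    # 3. Outlier sales summed per month (assumed meaning of "over time").
    sales_by_month = defaultdict(float)
    for t in outliers:
        month = t["ORDER_DATE"][:7]  # "YYYY-MM"
        sales_by_month[month] += t["TOTAL_SALES"]

    return outliers, dict(counts_by_type), dict(sorted(sales_by_month.items()))


# Made-up data for illustration only.
transactions = [
    {"ORDER_DATE": "2024-01-05", "TOTAL_SALES": 100.0, "Outlier_Type": None},
    {"ORDER_DATE": "2024-01-09", "TOTAL_SALES": 9000.0, "Outlier_Type": "Sales Outlier"},
    {"ORDER_DATE": "2024-02-11", "TOTAL_SALES": 4000.0, "Outlier_Type": "Quantity Outlier"},
    {"ORDER_DATE": "2024-02-20", "TOTAL_SALES": 5000.0, "Outlier_Type": "Sales Outlier"},
]
outliers, counts_by_type, sales_by_month = build_outputs(transactions)
```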

How to interpret the results

The Export Outliers (Export) tool contains the set of transactions that Symon.AI has identified as outliers. You can identify the reason why a transaction is an outlier in the Outlier_Type column.

When deciding whether a flagged transaction is a genuine outlier, start with the reason in the Outlier_Type column. For example, a transaction is identified as a Sales Outlier when the amount in the SALES column (mapped to TOTAL_SALES) is large compared to the other columns. The other columns that might affect final deal size, such as quantity and unit price, were not large enough to be consistent with it.

Visualizations

This blueprint comes with visualizations and dashboards that require no configuration. Here are two of the visualizations that provide a useful overview of transaction outliers:

  • Transactions Flagged as Outliers: This visualization shows that 1% of all transactions are potential outliers.

  • Outliers as a Percentage of Revenue: This visualization shows that the small group of outliers makes up approximately 2.9% of total revenue.
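The two percentages above are simple ratios. The sketch below shows one way to compute them in Python; the function name `outlier_shares` and the data are made up for illustration and are not taken from the blueprint.

```python
def outlier_shares(transactions):
    """Return (percent of transactions flagged, percent of revenue flagged)."""
    total_count = len(transactions)
    total_revenue = sum(t["TOTAL_SALES"] for t in transactions)
    flagged = [t for t in transactions if t["Outlier_Type"] is not None]
    flagged_revenue = sum(t["TOTAL_SALES"] for t in flagged)

    # Guard against empty input or zero revenue to avoid dividing by zero.
    pct_count = 100.0 * len(flagged) / total_count if total_count else 0.0
    pct_revenue = 100.0 * flagged_revenue / total_revenue if total_revenue else 0.0
    return pct_count, pct_revenue


# Made-up data: 99 ordinary transactions and one large flagged transaction.
transactions = [{"TOTAL_SALES": 100.0, "Outlier_Type": None} for _ in range(99)]
transactions.append({"TOTAL_SALES": 300.0, "Outlier_Type": "Sales Outlier"})
pct_count, pct_revenue = outlier_shares(transactions)
```

A small share of transactions can account for a noticeably larger share of revenue when the flagged deals are bigger than average, as in this made-up data.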

How it works

This blueprint looks at three attributes of a sales transaction to decide whether it should be flagged as an outlier. The blueprint analyzes each of these attributes using three different methods. Finally, the results are tied back together in the Outlier Type (Case) tool.

Let's look at how the blueprint decides whether the values in the QUANTITY column are outliers. Three tools examine the QUANTITY column: Quantity Outlier - Z Score (Outlier), Quantity Outlier - PCA (Outlier), and Quantity Outlier - kNN (Outlier). Each of these tools is good at finding outliers in data sets that have a particular kind of distribution. For example, Quantity Outlier - Z Score (Outlier) is great at finding outliers in normally distributed data, but it doesn't perform as well when the data doesn't follow a normal distribution. That's where the other tools come in: each tool can find outliers in different scenarios.
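To see why more than one method is useful, here is a minimal Python sketch of two generic techniques: a z-score test and a k-nearest-neighbor distance test. It is not the implementation of the Symon.AI Outlier tools. The function names, the thresholds (`threshold=3.0`, `k=3`, `max_distance=5.0`), and the data are assumptions chosen for illustration, and PCA is left out for brevity.

```python
import statistics


def z_score_flags(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return [False] * len(values)  # all values identical: nothing stands out
    return [abs(v - mean) / spread > threshold for v in values]


def knn_flags(values, k=3, max_distance=5.0):
    """Flag values whose k-th nearest neighbor is farther than max_distance."""
    if len(values) <= k:
        raise ValueError("need more than k values")
    flags = []
    for i, v in enumerate(values):
        # Distances from this value to every other value, nearest first.
        distances = sorted(abs(v - other) for j, other in enumerate(values) if j != i)
        flags.append(distances[k - 1] > max_distance)
    return flags


# Made-up quantities: one cluster with a single extreme value at the end.
single_cluster = [10, 11, 9, 10, 12, 11, 10, 9, 10, 11, 9, 10, 100]
# Made-up quantities: two clusters with an isolated value (30) in between.
two_clusters = [10, 11, 9, 10, 12, 11, 10, 9, 50, 51, 49, 50, 52, 51, 50, 49, 30]

z_single = z_score_flags(single_cluster)
z_two = z_score_flags(two_clusters)
knn_two = knn_flags(two_clusters)
```

In this made-up data, the z-score test catches the extreme value in the single cluster but misses the isolated 30 in the two-cluster data, because 30 sits close to the overall mean. The neighbor-distance test flags it because no other value is nearby.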

Just because Symon.AI flags a value as an outlier doesn't mean that the transaction is an outlier in the context of its other attributes. If a salesperson closes a large deal, looking at the amount alone would flag it as an outlier. But if the deal was so large because they sold more units than normal, then it's safe to say that the transaction is legitimate and not an outlier. This is what the Outlier Type (Case) tool analyzes: it takes in the raw outlier analysis results and applies business rules to eliminate false positives.
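A simplified version of this kind of business rule can be sketched in Python. The actual rules in the Outlier Type (Case) tool are not shown here, so everything in this sketch is an assumption: the function name `outlier_type`, the single flag per attribute, the labels other than Sales Outlier, and the rule that a transaction is reported only when exactly one attribute is flagged.

```python
def outlier_type(quantity_flag, price_flag, sales_flag):
    """Return an outlier label, or None when the transaction looks consistent.

    Assumed rule: report a transaction only when exactly one attribute is
    flagged, because a lone flag is not explained by the other attributes.
    """
    flagged = [
        label
        for label, hit in (
            ("Quantity Outlier", quantity_flag),  # hypothetical label
            ("Price Outlier", price_flag),        # hypothetical label
            ("Sales Outlier", sales_flag),
        )
        if hit
    ]
    return flagged[0] if len(flagged) == 1 else None


# Large amount with normal quantity and price: reported as a Sales Outlier.
unexplained = outlier_type(quantity_flag=False, price_flag=False, sales_flag=True)
# Large amount explained by an unusually large quantity: not reported.
explained = outlier_type(quantity_flag=True, price_flag=False, sales_flag=True)
```

Under this assumed rule, the large deal from the example above is dropped as a false positive because the unusually high quantity accounts for the unusually high amount.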