Classifier

Varicent ELT Help Center

Abstract

Train the model to predict a column with a fixed set of values.

Important

The Classifier tool is only available to users on Varicent Advanced Algorithm Library plans. If you are interested in using this tool, please contact your Varicent Customer Success Manager.

When model_type is set to automatic, this tool evaluates multiple classification algorithms (models) and selects the one that performs best.

Note

The Speed versus Quality slider is a spectrum rather than a choice between two fixed settings. Its position determines the number of machine learning models considered. Models that do not support the training data are automatically excluded.

This tool adds two columns to your data: a prediction and a probability. The probability is the model's certainty about the prediction.
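As an illustration of the two added columns, here is a minimal Python sketch. It assumes that the model produces one probability per possible value for each row, that the prediction is the value with the highest probability, and that the probability column holds that highest probability. The helper `add_prediction_columns`, the column names `prediction` and `probability`, and the sample data are hypothetical and are not taken from the product.

```python
def add_prediction_columns(rows, class_probabilities):
    """Return copies of rows with hypothetical 'prediction' and 'probability' columns.

    class_probabilities[i] maps each possible target value to the model's
    probability for row i (an assumed input shape, not the product's format).
    """
    scored = []
    for row, probs in zip(rows, class_probabilities):
        best_value = max(probs, key=probs.get)  # value with the highest probability
        new_row = dict(row)                     # leave the original row untouched
        new_row["prediction"] = best_value
        new_row["probability"] = probs[best_value]
        scored.append(new_row)
    return scored


# Made-up example data: two rows and the model's probabilities for each row.
rows = [{"region": "East"}, {"region": "West"}]
class_probabilities = [
    {"won": 0.9, "lost": 0.1},
    {"won": 0.3, "lost": 0.7},
]

scored_rows = add_prediction_columns(rows, class_probabilities)
for scored_row in scored_rows:
    print(scored_row)
```

Under these assumptions, a probability close to 1 means the model is very certain about the predicted value, and a probability only slightly above the other values means the prediction is much less certain.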

When you run the tool, the data is automatically split: 80% of the data is used for training, and the remaining 20% is used for testing. Each model being considered is trained on the training data and scored on how well it predicts the test values (the 20% of your data). This is done 5 times, and the final score for a model is the average of all 5 scores. The model with the best score is selected.
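The split-and-score procedure described above can be sketched in Python. This is an illustration only, not the tool's implementation: it assumes the 5 rounds each hold out a different 20% of the rows, it uses accuracy as the score, and the two stand-in "models" (`majority_class_model`, `minority_class_model`) and the helper `five_round_score` are hypothetical names invented for this example.

```python
from collections import Counter


def majority_class_model(train_labels):
    """Stand-in model: always predicts the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common


def minority_class_model(train_labels):
    """Deliberately weak stand-in model: always predicts the least common label."""
    least_common = Counter(train_labels).most_common()[-1][0]
    return lambda _features: least_common


def five_round_score(trainer, features, labels, rounds=5):
    """Average accuracy over `rounds` splits, each testing on about 20% of the rows."""
    n = len(labels)
    scores = []
    for r in range(rounds):
        test_idx = set(range(r, n, rounds))  # every 5th row is held out for testing
        train_labels = [labels[i] for i in range(n) if i not in test_idx]
        model = trainer(train_labels)        # train on the remaining ~80%
        correct = sum(model(features[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Made-up data: 20 rows, 15 labelled "yes" and 5 labelled "no".
features = [[i] for i in range(20)]
labels = ["yes"] * 15 + ["no"] * 5

candidates = {"majority": majority_class_model, "minority": minority_class_model}
results = {name: five_round_score(trainer, features, labels)
           for name, trainer in candidates.items()}
best_name = max(results, key=results.get)  # candidate with the best average score
print(results, best_name)
```

Averaging over several rounds makes the comparison between candidate models less sensitive to which rows happen to land in a single test split.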

Tip

You can configure this tool without using the configuration menu.

In the Add tool menu, start typing the first few letters of the tool name and press Tab to auto-complete. Then start typing the name of the column you want to use and press Tab to auto-complete.

When to use this tool

Use this tool when you want to predict the values of a column that has a fixed set of possible values.

How to read the data in this tool

In the row viewer, there are three tabs: Data, Stats and Tool.

The Data tab consists of your imported data. View all of the imported data in one spot.

The Stats tab consists of the statistics for your data. View all of the top values for each column.

The Tool tab provides additional insight into the tool and the data. The following columns are available:

Accuracy score: Displays the performance measure based on your data set.
Column importance: Displays the columns in order of importance.
Smart excluded: Displays the columns that don't predict the target column.

Underlying data

In the row viewer, on the Tool tab, there is a feature called Underlying data. Click it to expand and explore information about your model. The data answers the following questions for your model:

How good is your model? This matrix helps explain the strengths and weaknesses of your model.
How balanced is your model? This section explains why a well-balanced model is more robust: its predictions do not depend too heavily on any single column. The visual shows how balanced the top 10 columns are.
What is my model using to make predictions? This section shows which columns are used to make predictions and the reasons why.

Configuration

Use the following configuration options to configure the Classifier tool.

Configuring the Classifier tool

Go to the Pipes module from the side navigation bar.
From the Pipes tab, click an existing pipe to open, or create a new pipe. To create a new pipe, read the Creating a pipe documentation.
In your pipe, add your data sources.
Click + Tool.
In the Tools modal, search for Classifier in the search bar.
Tip
You can also find the Classifier tool in the Learn section.
Click + Add Tool.
Connect the tool to your data set.

In the configuration pane, enter the following information:

Table 82. Classifier tool configuration

Field	Description
Train type	Select the train type that you want to use to train your data.
Target column	Select the target column to train your data against.
Advanced
Performance Measure	Select the type of performance measure to use: Automatic: Let Varicent AI make the decision for you. Accuracy: How often the model is correct about which class it predicts. Area under the curve: The area under the curve (AUC) measures the model's ability to separate two values. The higher the value, the greater the concentration of positive cases among the top scores that the model produces. Precision: How often an observation that is predicted to have a given value actually has that value. Recall: How often the model correctly predicts observations that have a given value.
Speed versus Quality slider	Use the slider to set the balance between speed and quality when the Classifier runs.
Exclude columns	Select the column(s) that you want to exclude from the Classifier.
Smart exclude	Select this option if you want Smart Exclude to identify and automatically exclude columns that don’t help predict the target column after you build.
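
To make the performance measures in the table concrete, the following Python sketch computes accuracy, precision, recall, and AUC using their standard textbook definitions on a small made-up data set. It is not the product's code: the function names, the 0.5 cut-off used to turn scores into predictions, and the sample values are all assumptions chosen for illustration.

```python
def accuracy(actual, predicted):
    """Share of rows where the predicted value equals the actual value."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def precision(actual, predicted, value):
    """Of the rows predicted as `value`, the share that really are `value`."""
    hits = [a == value for a, p in zip(actual, predicted) if p == value]
    return sum(hits) / len(hits) if hits else 0.0


def recall(actual, predicted, value):
    """Of the rows that really are `value`, the share predicted as `value`."""
    hits = [p == value for a, p in zip(actual, predicted) if a == value]
    return sum(hits) / len(hits) if hits else 0.0


def auc(actual, scores, positive):
    """Chance that a random positive row scores higher than a random negative row.

    Ties count as half. Assumes both positive and negative rows are present.
    """
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up data: actual values and the model's score for the value "yes".
actual = ["yes", "yes", "yes", "no", "no", "no"]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
predicted = ["yes" if s >= 0.5 else "no" for s in scores]  # assumed 0.5 cut-off

acc = accuracy(actual, predicted)
prec = precision(actual, predicted, "yes")
rec = recall(actual, predicted, "yes")
area = auc(actual, scores, "yes")
print(acc, prec, rec, area)
```

Accuracy, precision, and recall depend on the predicted values, while AUC depends only on how the scores rank the rows, which is why it reflects how well positive cases are concentrated among the top scores.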
