Symon.AI help center

Regressor

Train the model to predict a column with many possible values.

This tool looks at all the available regressor tools and picks the best one.

Note

The speed and quality slider is a spectrum, not an either-or choice. Its setting determines the number of machine learning models considered. Models that do not support the training data are automatically excluded.

This tool adds two columns to your data: a prediction and a probability. The probability is the likelihood that the prediction is accurate.

When you run the tool, the data is automatically split: 80% is used for training and the remaining 20% is used for testing. Each model being considered is trained on the training data and evaluated by predicting the test values (the 20% of your data). This is done 5 times, and a model's final score is the average of all 5 scores. The model with the best final score is selected.
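The split-and-average procedure described above resembles 5-fold cross-validation. The following Python sketch illustrates that idea only. It is not Symon.AI's implementation: the way rows are assigned to each 20% test set, the mean-only stand-in model, and the names `five_fold_scores` and `targets` are all assumptions made for illustration.

```python
from statistics import mean

def five_fold_scores(targets, n_folds=5):
    """Score a mean-only baseline once per fold (mean absolute error)."""
    scores = []
    for fold in range(n_folds):
        # Assumption: every n_folds-th row is held out, so 20% is tested when n_folds is 5.
        test = [y for i, y in enumerate(targets) if i % n_folds == fold]
        train = [y for i, y in enumerate(targets) if i % n_folds != fold]
        prediction = mean(train)  # stand-in "model": always predict the training mean
        scores.append(mean(abs(y - prediction) for y in test))
    return scores

targets = [float(v) for v in range(1, 11)]  # made-up target column: 1.0 to 10.0
scores = five_fold_scores(targets)          # one score per 80/20 split
final_score = mean(scores)                  # the average of all 5 scores
```

In a real run, each candidate model would be scored this way and the one with the best final score would win.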

Tip

You can configure this tool without using the configuration menu.

In the Add tool menu, start typing the first few letters of the tool name and press tab to auto-complete. Then start typing the name of the column you want to use and press tab to auto-complete.

Data profile chart visuals

From the Row viewer, you can access the Data profile link to open the column details and compare column visuals. The results are available when you select one of the Classifier, Predictor or Regressor tools.

Accessing Data profile from the Build tab
  1. In your selected pipe, go to the Build tab and select the Classifier, Predictor or Regressor node in a built pipe.

  2. Click on the Data profile link in the row viewer.

    The Data profile page opens with column details and compare column visuals.

Note

Even if you have an explainable tool upstream, you can still get an error message for one of the following reasons:

  • The schema has changed in the export. For example, a missing column or an extra column is present.

  • There are multiple explainable tools in the pipe upstream.

  • The pipe changed and the calculation is now invalid.

If there is no Data profile link, no explainable tool is selected in the pipe upstream.

When to use this tool

Use when you want to predict values.

Regressors solve continuous value problems. For example, if the values in the target column are 1 and 6, the predicted answer could be 5, a value that does not appear in the target column.
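To illustrate what a continuous prediction means, here is a minimal Python sketch of a one-column least-squares fit. It is not one of the models Symon.AI uses; the function `fit_line` and the sample numbers are hypothetical and chosen only to show a predicted value that does not appear in the target column.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input column: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

feature = [1.0, 2.0, 3.0, 4.0]  # made-up input column
target = [1.0, 3.0, 4.0, 6.0]   # made-up target column
slope, intercept = fit_line(feature, target)
prediction = slope * 3.4 + intercept  # about 4.94, a value not present in target
```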

What is Smart exclude?

Following a successful build using the Regressor tool, Smart exclude identifies and automatically excludes columns that don’t help predict the target column. Smart exclude will only consider columns not already manually excluded. If you want to disable this setting to troubleshoot, test, or run a calculation that is taking too long, go to the Advanced settings under the Configure tab.

Performance measures

You can select a model performance measure that dictates how Symon.AI selects a winning model. The performance measures that you can choose from are:

  • Automatic: Select this measure to let Symon.AI choose for you.

  • Pearson Correlation: How correlated the predictions are to the actual labels being predicted. The higher the better, with a maximum of 1.

  • Mean Absolute Error: How close each guess is to the actual value on average.

  • Root Mean Squared Error: How close each guess is to the actual value on average. A square operation is used to punish larger differences more.

  • Symmetric Mean Absolute Percentage Error: An accuracy measure based on percentage (or relative) errors. Relative error is the absolute error divided by the magnitude of the exact value, scaled to a range of 0 to 2.
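The measures above can be sketched in Python using their common textbook definitions. These are assumptions, not Symon.AI's exact formulas. In particular, the `smape` function uses the widely used symmetric form, which divides by the average magnitude of the actual and predicted values and so stays between 0 and 2. The function names and sample values are illustrative.

```python
from math import sqrt

def pearson(actual, predicted):
    """Pearson correlation between actual and predicted values (at most 1)."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)

def mae(actual, predicted):
    """Mean absolute error: average size of the misses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: squaring punishes larger misses more."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def smape(actual, predicted):
    """Symmetric MAPE (assumed form), bounded between 0 and 2."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        # Treat a 0/0 term as no error to avoid dividing by zero.
        total += abs(a - p) / denom if denom else 0.0
    return total / len(actual)

actual = [1.0, 2.0, 3.0, 4.0]      # made-up actual values
predicted = [1.0, 2.0, 3.0, 8.0]   # three exact guesses and one large miss
```

With these sample values, `mae` returns 1.0 and `rmse` returns 2.0, which shows how a single large miss weighs more heavily on Root Mean Squared Error than on Mean Absolute Error.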

How to read the data in this tool

In the row viewer, there are three tabs: Data, Stats and Tool.

The Data tab consists of your imported data. View all of the imported data in one spot.

The Stats tab consists of the statistics for your data. View all of the top values for each column.

The Tool tab is a visualization of additional insight into the tool and the data. The following columns are available:

  • Column importance: Displays the columns in order of importance.

  • Smart excluded: Displays the columns that don't predict the target column.

Full details

In the row viewer, on the Tool tab, there is a feature called Full details. Click to expand and explore more information about your model.