Text classifier
Train the model to label text columns from examples.
Use the Text classifier tool to train a model on labeled examples and then predict labels for a text column.
This tool evaluates the available text classifier models and selects the most suitable one.
Note
The Speed versus Quality slider is a spectrum, not a choice between two fixed settings. Its position determines the number of machine learning models considered. Models that do not support the training data are automatically excluded.
This tool adds two columns to your data: a prediction and a probability. The probability is the model's certainty about the prediction.
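As an illustration of the two added columns, the following minimal Python sketch picks the most likely label for one row and stores its probability next to it. The function name add_prediction_columns, the column names prediction and probability, and the example scores are hypothetical; the tool's actual column names and internals may differ.

```python
def add_prediction_columns(row, class_probabilities):
    """Return a copy of row with 'prediction' and 'probability' added.

    class_probabilities maps each candidate label to the model's
    probability for that label (hypothetical model output).
    """
    # The prediction is the label with the highest probability;
    # the probability column stores that highest value.
    label, probability = max(class_probabilities.items(), key=lambda item: item[1])
    return {**row, "prediction": label, "probability": probability}


row = {"text": "The delivery arrived two days late"}
# Assumed example scores, one per candidate label.
scores = {"complaint": 0.85, "praise": 0.10, "question": 0.05}
result = add_prediction_columns(row, scores)
print(result)
```

A probability close to 1 means the model is confident in the label, while a value close to the other labels' scores suggests the row may deserve a manual check.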
When you run the tool, the data is automatically split: 80% is used for training and the remaining 20% is used for testing. Each model being considered is trained and evaluated, and the one with the best score is selected. This process is repeated 5 times, each time predicting the test values (the 20% of your data). The final score is the average of all 5 scores.
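The evaluation procedure described here can be sketched in plain Python. This is a simplified illustration, not the tool's implementation: MajorityClassifier is a hypothetical stand-in model, accuracy is an assumed scoring metric, and the sketch assumes a fresh random 80/20 split in each of the 5 rounds (the tool may instead use fixed folds).

```python
import random
from collections import Counter
from statistics import mean


def split_80_20(rows, rng):
    """Shuffle a copy of rows and return (train, test) in an 80/20 split."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


class MajorityClassifier:
    """Hypothetical stand-in model: predicts the most common training label."""

    def fit(self, train):
        self.label = Counter(label for _, label in train).most_common(1)[0][0]
        return self

    def predict(self, text):
        return self.label


def accuracy(model, test):
    """Fraction of test rows whose label the model predicts correctly."""
    return sum(model.predict(text) == label for text, label in test) / len(test)


def repeated_score(rows, model_factory, rounds=5, seed=0):
    """Train and test on `rounds` separate 80/20 splits and average the scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        train, test = split_80_20(rows, rng)
        scores.append(accuracy(model_factory().fit(train), test))
    return mean(scores), scores


# Assumed toy data: (text, label) pairs.
rows = [("great product", "positive")] * 6 + [("poor service", "negative")] * 4
final_score, round_scores = repeated_score(rows, MajorityClassifier)
print(len(round_scores), round(final_score, 2))
```

To compare several candidate models, call repeated_score once per model and keep the one with the highest average. Averaging over 5 rounds makes the final score less dependent on which rows happen to land in a single test set.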
Configuration
Use the following steps to configure the Text classifier tool.
Go to the Pipes module from the side navigation bar.
From the Pipes tab, click an existing pipe to open it, or create a new pipe. To create a new pipe, read the Creating a pipe documentation.
In the Pipe builder, add a data source to your pipe. For more information on adding a data source, see the Data Input tool.
Click + Tool. The Tools modal opens, where you can add tools, such as the Aggregate tool, to your pipe.
In the Tools modal search bar, type Text classifier, and then click + Add tool.
Tip
You can also find the Text classifier tool in the Learn section.
Click the tool node and drag the line to the next tool to connect the tools. If you need to undo the action, click the line and then click Unlink.
In the configuration pane, under Text classifier type, choose which text classifier type to use:
Automatic
CNN Text Classifier
BOW Text Classifier
Ensemble Text Classifier
Under Text column, select the column that contains the text the Text classifier uses to make the prediction and calculate the probability.
Under Target column, select the column that contains the labels the Text classifier learns from and predicts.
Use the Speed versus Quality slider to indicate whether the classifier should prioritize speed or quality when it runs.