Varicent ELT Assistant

Word Frequency

Use the Word Frequency tool to perform a word frequency count in a particular column across your data set.

The Word Frequency tool supports the following options when searching for text or phrases:

  • Exact text match, such as red.

  • Double count, such as red or red pepper.

  • Case-insensitive, such as Red or red.
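
The following Python sketch illustrates how these three options could interact. It is not the tool's implementation. The function `count_matches` and its parameters are hypothetical, and treating "exact match" as whole-word matching is an assumption made for illustration; the product may define it differently.

```python
import re

def count_matches(text, phrase, exact=False, case_insensitive=False):
    """Count occurrences of phrase in text (illustrative only)."""
    flags = re.IGNORECASE if case_insensitive else 0
    pattern = re.escape(phrase)
    if exact:
        # Assumption: "exact" means the phrase is not part of a longer word.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    return len(re.findall(pattern, text, flags))

text = "Red pepper, red onion, shredded red pepper"

# Default: case-sensitive, also matches "red" inside "shredded".
print(count_matches(text, "red"))                                    # 3
# Whole-word only: "shredded" no longer counts.
print(count_matches(text, "red", exact=True))                        # 2
# Whole-word and case-insensitive: "Red" now counts too.
print(count_matches(text, "red", exact=True, case_insensitive=True)) # 3
# The final "red pepper" is counted for both "red" and "red pepper".
print(count_matches(text, "red pepper"))                             # 1
```

In this sketch, the last occurrence of red pepper contributes to the count for red and to the count for red pepper, which is one way to read the double-count behavior described above.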

When to use this tool

Use the Word Frequency tool to perform a word or phrase frequency count in a particular column across your data set.

Input

The Word Frequency tool requires two data inputs. The first one is the data source where the tool will look for specific words or phrases. The second is a data source containing the words or phrases.

Note

The Word Frequency tool accepts text-format data only.

Configuration options

Use the following configuration options to help create your rules.

Configuring Word Frequency
  1. Go to the Pipes module from the side navigation bar.

  2. From the Pipes tab, click an existing pipe to open it, or create a new pipe. To create a new pipe, see the Creating a pipe documentation.

  3. In the Pipe builder, add a data source to your pipe. For more information on adding a data source, see the Data Input tool.

    Note

    The Word Frequency tool requires two data sources.

  4. Click + Tool.

    The Tools modal opens, where you can add tools, such as the Aggregate tool, to your pipe.

  5. In the search bar, search for Word Frequency. Click + Add tool.

    Tip

    You can also find the Word Frequency tool in the Calculate section.

  6. Click the tool node and drag the line to the next tool to connect the tools. If you need to undo the action, click the line and then click Unlink.

  7. In the configuration pane, enter the name of the new column for the frequency count.

  8. Under Column to extract from, select the column in the data set that you want to search. This is the column whose text the tool scans.

  9. Under Target text column, select the column in the data set that contains the words or phrases that you want to look for.

  10. Expand the Advanced section, and under Match type optionally select the Exact match option for exact matches only.

  11. Under Case insensitive, select the case-insensitive option to match text regardless of letter case.

    Important

    By default, the search is case sensitive.

  12. Click on the tool name to rename your tool node to a meaningful name. Name your tools in a way that describes the function, not the object or the data action. For example, use “Look up rate” instead of “Join to rate table”.

Usage example

Use this tool to find how many times each word appears in a specified column. The tool then totals the instances of each word.

For example, to find out how many times the words Apple, Peach, and Orange appear in a particular column, specify the target column and run the tool to get the totals:

Example of Word frequency
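
The same idea can be sketched outside the tool in a few lines of Python. This is only an approximation of the behavior described above: the sample rows are made up, `word_frequency` is a hypothetical helper, and simple substring counting is assumed as the default match type.

```python
def word_frequency(rows, targets, case_insensitive=False):
    """Total how often each target word appears across all rows."""
    counts = {target: 0 for target in targets}
    for text in rows:
        haystack = text.lower() if case_insensitive else text
        for target in targets:
            needle = target.lower() if case_insensitive else target
            # Assumption: substring counting stands in for the default match type.
            counts[target] += haystack.count(needle)
    return counts

# Made-up sample: the column to extract from, and the target words.
rows = ["Apple, Peach", "Orange and Apple", "peach pie", "Banana"]
targets = ["Apple", "Peach", "Orange"]

print(word_frequency(rows, targets))
# {'Apple': 2, 'Peach': 1, 'Orange': 1}
print(word_frequency(rows, targets, case_insensitive=True))
# {'Apple': 2, 'Peach': 2, 'Orange': 1}
```

The second call shows why the case-insensitive option matters: peach in lowercase is only counted once case is ignored.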